Introduction to Data Integration With Ballerina

Learn about data integration and the benefits of the Ballerina integration language, with a powerful type system and support for various data sources.

What Is Ballerina?

Ballerina is an open-source, concurrent programming language with both textual and graphical representations. It is designed mainly for seamless integration of networked applications. Ballerina is strongly and statically typed, with a powerful type system.

What Is Data Integration?

The importance of data integration is apparent to anyone who has spent time fetching information from multiple systems for a basic report. Data integration involves combining data from several disparate sources, stored using various technologies, and providing a unified view of that data. Today, data integration is an increasingly important and integral part of any business or process integration scenario. Data integration is a term covering several distinct sub-areas, such as:

  • Data warehousing - aggregating structured data from one or more sources so that it can be compared and analyzed for greater business intelligence.

  • Data migration/transformation (ETL) - transferring data from one system to another while changing the format, storage, database, or application.

  • Enterprise application/information integration - establishing consistency among systems and providing a unified view of data from different sources.

  • Master data management - consistently managing non-transactional data.

Why Use Ballerina for Data Integration?

Ballerina Type System

In most traditional languages, SQL result sets, JSON data, and XML data are not treated as first-class types, so using or manipulating them requires external libraries or add-ons. Ballerina, by contrast, was designed with a sophisticated type system that offers first-class support for these data types and formats. Users can generate, manipulate, and convert from one type to another easily and with fewer lines of code. The following are the basic data types Ballerina can handle:

  • Value types - int, float, string, boolean, blob

  • datatable - represents tabular data in Ballerina (e.g., the data in a result set returned from a SQL query)

  • JSON - built-in type to represent JSON data

  • XML - built-in type to represent XML data

  • struct - allows defining user-defined types

  • array - an array of data

  • map - key-value pairs

The datatable, JSON, XML, and struct types are highly useful in data integration scenarios. In Ballerina, a datatable can be converted directly into XML or JSON, and it can be mapped to struct types, with each row of tabular data mapped to one struct. In addition, JSON, map, and struct are interoperable types, and casting/conversion makes it easy to transform between them.
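To make the idea of converting tabular data concrete, here is a minimal, language-neutral sketch written in Python rather than Ballerina. It does not show Ballerina syntax or APIs. The table, the sample rows, and every name in it (Customer, load_rows, rows_to_json, rows_to_xml, rows_to_records) are assumptions made up for illustration. It shows the three conversions described above: result set to JSON, result set to XML, and one record per row.

```python
import json
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical record type standing in for a user-defined struct.
    id: int
    name: str


def load_rows():
    # In-memory table with made-up sample data, standing in for a SQL result set.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ann"), (2, "Raj")])
    cursor = conn.execute("SELECT id, name FROM customer ORDER BY id")
    columns = [col[0] for col in cursor.description]
    rows = [dict(zip(columns, row)) for row in cursor]
    conn.close()
    return rows


def rows_to_json(rows):
    # Tabular data -> JSON array of objects.
    return json.dumps(rows)


def rows_to_xml(rows):
    # Tabular data -> XML, one <result> element per row.
    root = ET.Element("results")
    for row in rows:
        item = ET.SubElement(root, "result")
        for key, value in row.items():
            ET.SubElement(item, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def rows_to_records(rows):
    # Tabular data -> one typed record per row.
    return [Customer(**row) for row in rows]
```

In this sketch each conversion needs its own helper function, which is the kind of glue code that first-class data types are meant to reduce.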

Connector Support for Various Data Sources

Ballerina connectors are used to connect with external entities or APIs. For data integration purposes, Ballerina provides several SQL and NoSQL connectors to interact with tabular and NoSQL data sources. Ballerina also provides extension mechanisms for writing custom native or Ballerina connectors, which can connect to custom data sources if required. As of version 0.95.2, Ballerina is equipped with the following data connectors:

  • SQLConnector - built-in connector that connects to SQL-based tabular data sources.

  • MongoDB Connector - connects to MongoDB and supports find operations as well as manipulation operations such as update and delete.

  • Cassandra Connector - connects Ballerina to Cassandra data sources to update and select data.

Built-in Transaction Support

A Ballerina transaction is a series of data manipulation statements that must either fully complete or fully fail, leaving the system in a consistent state. The Ballerina language supports both local and distributed transactions for data and JMS connector actions. It provides syntax support for defining transaction boundaries and for handling transaction failures and retries.
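The all-or-nothing behavior described above can be sketched in a language-neutral way. The following Python example uses the standard sqlite3 module, not Ballerina's transaction syntax. The account table, the sample balances, and the function names (open_bank, balances, transfer) are hypothetical. It marks a transaction boundary, rolls back on failure, and retries when the database reports a transient error.

```python
import sqlite3


def open_bank():
    # In-memory database with made-up sample accounts.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def balances(conn):
    # Return {account id: balance} for all accounts.
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


def transfer(conn, from_id, to_id, amount, retries=2):
    """Move `amount` between two accounts as a single transaction."""
    for _ in range(retries + 1):
        try:
            with conn:  # transaction boundary: commit on success, roll back on error
                conn.execute(
                    "UPDATE account SET balance = balance - ? WHERE id = ?",
                    (amount, from_id),
                )
                conn.execute(
                    "UPDATE account SET balance = balance + ? WHERE id = ?",
                    (amount, to_id),
                )
                (remaining,) = conn.execute(
                    "SELECT balance FROM account WHERE id = ?", (from_id,)
                ).fetchone()
                if remaining < 0:
                    raise ValueError("insufficient funds")
            return True
        except sqlite3.OperationalError:
            continue  # transient failure (e.g. a locked database): retry
        except ValueError:
            return False  # business rule violated: both updates were rolled back
    return False
```

With the sample data, transfer(conn, 1, 2, 30) succeeds and both updates are committed together, while transfer(conn, 1, 2, 500) returns False and leaves both balances untouched.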

Data Transformation Capabilities

The transformer syntax in Ballerina is useful for defining custom transformations between different types, such as structs and JSON. Together with the data casting/conversion functionality, it is a key part of data integration scenarios.

Data Streaming Support

In Ballerina, datatable-to-JSON and datatable-to-XML conversions result in streamed data. With data streaming, when a service client makes a request, the result is streamed to the client rather than being built in full on the server and then returned. This allows virtually unlimited payload sizes in the result, and the client starts receiving the response almost immediately. The result set for a query is converted to XML/JSON row by row, and each row is written to the wire as soon as it has been converted.
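The row-by-row conversion described above can be illustrated with a generator. This is a Python sketch of the general technique, not Ballerina's implementation. The function names (sample_cursor, stream_json) and the sample table are assumptions. Each chunk could be written to the response as soon as it is produced, so the full payload is never held in memory at once.

```python
import json
import sqlite3


def sample_cursor():
    # Made-up sample data; returns an open cursor over the query result.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ann"), (2, "Raj")])
    return conn.execute("SELECT id, name FROM customer ORDER BY id")


def stream_json(cursor):
    """Yield a JSON array piece by piece, converting one row at a time."""
    columns = [col[0] for col in cursor.description]
    yield "["
    first = True
    for row in cursor:  # rows are pulled from the cursor lazily
        if not first:
            yield ","
        first = False
        yield json.dumps(dict(zip(columns, row)))
    yield "]"
```

A caller would loop over stream_json(cursor) and write each chunk to the network connection as it arrives. Joining all chunks yields one valid JSON array.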

Graphical Data Modeling With Composer

Composer is a tool for editing Ballerina programs both graphically and textually. The visual representation of Ballerina is based on a sequence diagram model, which helps the developer keep a clear view of the entire data integration flow.

Ability to Expose Data as Services via HTTP Service

The success of a business lies in its ability to integrate data from across the organization and analyze it to make more informed decisions, so convenient access to data is a key requirement in any data integration scenario. APIs make this data exposure possible, and REST is one of the most popular API styles for communicating with web, mobile, and cloud apps. The rich, fast, and easy HTTP REST service development support in Ballerina allows data to be exposed rapidly as services via REST APIs.
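As a rough, language-neutral illustration of exposing data through a REST endpoint, the following Python sketch serves a query result as JSON using only the standard library. It is not Ballerina service syntax. The /customers path, the sample table, and the names fetch_customers, CustomerHandler, and make_server are hypothetical choices made for this example.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer


def fetch_customers():
    # Made-up sample data standing in for a real data source.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ann"), (2, "Raj")])
    cursor = conn.execute("SELECT id, name FROM customer ORDER BY id")
    columns = [col[0] for col in cursor.description]
    rows = [dict(zip(columns, row)) for row in cursor]
    conn.close()
    return rows


class CustomerHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only one resource is exposed in this sketch.
        if self.path != "/customers":
            self.send_error(404)
            return
        body = json.dumps(fetch_customers()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet; a real service would log requests


def make_server(port=0):
    # Port 0 asks the operating system for any free port.
    return HTTPServer(("127.0.0.1", port), CustomerHandler)
```

Calling make_server(8080).serve_forever() would serve the customer list at http://127.0.0.1:8080/customers until the process is stopped.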
