{{announcement.body}}
{{announcement.title}}

Serverless Data Processing Using Azure Tools

DZone 's Guide to

Serverless Data Processing Using Azure Tools

In this blog, we will see it in action using an example. See how to combine real-time data ingestion component with a Serverless processing layer.

· Cloud Zone ·
Free Resource

One of the previous blogs covered some of the concepts behind how Azure Event Hubs supports multiple protocols for data exchange. In this blog, we will see it in action using an example. With the help of a sample app, you will see how to combine real-time data ingestion component with a Serverless processing layer.

The sample application has the following components:

To follow along and deploy this solution to Azure, you are going to need a Microsoft Azure account. You can grab one for free if you don't have it already!

Application Components

Let's go through the individual components of the applications

As always, the code is available on GitHub.

Producer Component

This is pretty straightforward - it is a Go app which uses the Sarama Kafka client to send (simulated) "orders" to Azure Event Hubs (Kafka topic). It is available in the form of a Docker image for ease of use (details in next section)

Here is the relevant code snippet:

JSON
 




x


 
1
order := Order{OrderID: "order-1234", CustomerID: "customer-1234", Product: "product-1234"}
2
 
          
3
b, err := json.Marshal(order)
4
 
          
5
msg := &sarama.ProducerMessage{Topic: eventHubsTopic, Key: sarama.StringEncoder(oid), Value: sarama.ByteEncoder(b)}
6
producer.SendMessage(msg)



A lot of the details have been omitted (from the above snippet) - you can grok through the full code here. To summarize, an Order is created, converted (marshaled) into JSON (bytes) and sent to Event Hubs Kafka endpoint.

Serverless Component

The Serverless part is a Java Azure Function. It leverages the following capabilities:

The Trigger allows the Azure Functions logic to get invoked whenever an order event is sent to Azure Event Hubs. The Output Binding takes care of all the heavy lifting such as establishing database connection, scaling, concurrency, etc. and all that's left for us to build is the business logic, which in this case has been kept pretty simple - on receiving the order data from Azure Event Hubs, the function enriches it with additional info (customer and product name in this case), and persists it in an Azure Cosmos DB container.

You can check the OrderProcessor code on Github, but here is the gist:

Java
 




xxxxxxxxxx
1
20


 
1
@FunctionName("storeOrders")
2
public void storeOrders(
3
 
          
4
  @EventHubTrigger(name = "orders", eventHubName = "", connection = 
5
  "EventHubConnectionString", cardinality = Cardinality.ONE) 
6
  OrderEvent orderEvent,
7
 
          
8
  @CosmosDBOutput(name = "databaseOutput", databaseName = "AppStore", 
9
  collectionName = "orders", connectionStringSetting = 
10
  "CosmosDBConnectionString") 
11
  OutputBinding<Order> output,
12
 
          
13
  final ExecutionContext context) {
14
....
15
 
          
16
Order order = new Order(orderEvent.getOrderId(),Data.CUSTOMER_DATA.get(orderEvent.getCustomerId()), orderEvent.getCustomerId(),Data.PRODUCT_DATA.get(orderEvent.getProduct());
17
output.setValue(order);
18
 
          
19
....
20
}



The storeOrders method is annotated with @FunctionName and it receives data from Event Hubs in the form of an OrderEvent object. Thanks to the @EventHubTrigger annotation, the platform that takes care of converting the Event Hub payload to a Java POJO (of the type OrderEvent) and routing it correctly. The connection = "EventHubConnectionString" part specifies that the Event Hubs connection string is available in the function configuration/settings named EventHubConnectionString

The @CosmosDBOutput annotation is used to persist data in Azure Cosmos DB. It contains the Cosmos DB database and container name, along with the connection string which will be picked up from the CosmosDBConnectionString configuration parameter in the function. The POJO (Order in this case) is persisted to Cosmos DB with a single setValue method call on the OutputBinding object - the platform makes it really easy, but there is a lot going on behind the scenes!

Let's switch gears and learn how to deploy the solution to Azure

Pre-Requisites

Notes:

Deploy the Order Processor Function

This example makes use of the Azure Functions Maven plugin for deployment. First, update the pom.xml to add the required configuration.

Replace <appSettings> section and replace values for AzureWebJobsStorage, EventHubConnectionString and CosmosDBConnectionString parameters

Use the Azure CLI to easily fetch the required details

For the configuration section, update the following:

  • resourceGroup: the resource group to which you want to deploy the function to
  • region: Azure region to which you want to deploy the function to (get the list of locations)

To deploy, you need two commands:

  • mvn clean package - prepare the deployment artifact
  • mvn azure-functions:deploy - deploy to Azure

You can confirm using Azure CLI az functionapp list --query "[?name=='orders-processor']" or the portal

Run Event Hubs Producer

Set environment variables:

Java
 




xxxxxxxxxx
1


 
1
export EVENTHUBS_BROKER=<namespace>.servicebus.windows.net:9093
2
export EVENTHUBS_TOPIC=<event-hub-name>
3
export EVENTHUBS_CONNECTION_STRING="Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=<primary_key>"


Run the Docker image

docker run -e EVENTHUBS_BROKER=$EVENTHUBS_BROKER -e EVENTHUBS_TOPIC=

$EVENTHUBS_TOPIC -e EVENTHUBS_CONNECTION_STRING=$EVENTHUBS_CONNECTION_STRING

abhirockzz/eventhubs-kafka-producer

Press ctrl+c to stop producing events.

Confirm the Results in Azure Cosmos DB

You can use the Azure Cosmos DB data explorer (web interface) to check the items in the container. You should see results similar to this:

orderID

Clean Up

Assuming you placed all the services in the same resource group, you can delete them using a single command:

export RESOURCE_GROUP_NAME=<enter the name>
az group delete --name $RESOURCE_GROUP_NAME --no-wait

Thanks for reading! 

Happy to get your feedback via Twitter or just drop a comment. Stay tuned for more!

Topics:
azure, azure cosmos db, azure functions, cloud, databases, messaging, serverless, tutorial

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}