How to Connect Azure IoT Hub and CrateDB Cloud to Ingest IoT Sensor Data
Get Azure IoT Hub and CrateDB Cloud working on IoT sensor data in these 6 steps.
This article will describe how to launch a CrateDB cluster on Azure, connect it to Azure IoT Hub, and test it by ingesting simulated sensor data using an Azure IoT Solution Accelerator.
Step #1: Simulating Sensor Data
It’s necessary to first understand the type of data your IoT application will produce in order to accurately mimic this information for testing purposes. Smart factories, for example, will have myriad sensors that collect data in a variety of structures. CrateDB makes it possible to model different data structures in a single table through the use of dynamic objects, which can be queried to an arbitrary depth (this is not a recommended practice for production, but is helpful within the simple confines of this demonstration).
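To make the "arbitrary depth" point concrete, here is a minimal Python sketch that builds a CrateDB subscript expression for a key nested inside a dynamic object column. The column name payload and the table doc.raw match the table created later in this article; the nested key names and the helper functions are hypothetical and only serve the illustration.

```python
def object_path(column, keys):
    """Build a CrateDB subscript expression such as payload['a']['b']."""
    parts = [column]
    for key in keys:
        # Double any single quote so the key stays a valid SQL string literal.
        parts.append("['" + key.replace("'", "''") + "']")
    return "".join(parts)


def select_nested(table, column, keys):
    """Return a SELECT statement that reads one nested value from every row."""
    return "SELECT {} FROM {}".format(object_path(column, keys), table)


if __name__ == "__main__":
    # 'readings' and 'temperature' are made-up keys for illustration.
    print(select_nested("doc.raw", "payload", ["readings", "temperature"]))
```

Each additional key adds one more subscript, so the same helper works whether a reading sits one level or several levels deep in the object.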
Azure offers several IoT Solution Accelerators – preconfigured templates designed to support and expedite IoT projects – including a Device Simulation accelerator capable of simulating a range of IoT devices. Data produced using this method can be pushed to Azure IoT Hub, and then ingested into CrateDB.
To create a new device simulation, go to the New Device Simulation page within the Microsoft Azure IoT solution accelerators site and enter the required information (including a deployment name, Azure subscription, deployment options, and Azure location):
For this example, the device simulation is named CrateDBIngest and uses the Azure subscription crate-development, the deployment option Provision new IoT Hub, and the Azure location West Europe. When you’re ready, clicking the Create button will start the 15-20 minute process of deploying the new device simulation. You’ll receive a notification email when the simulation is available.
You can now view your solution accelerator within the interface:
Next, click the My solutions tab in the upper-right of the screen, and choose the appropriate Launch button.
You’ll now see the solution accelerator interface. Choose New Simulation, and fill out the necessary information:
Our example simulation is named SensorDevice and is set to stop after 10 minutes. It uses the Chiller device model, which provides telemetry data from simulated sensors measuring temperature, pressure, and humidity. We set the number of simulated devices to 10, and the total number of messages they send each second to one. We also selected use pre-provisioned IoT Hub, and S2 Standard for the throttling limit.
With the correct settings entered, click the Start simulation button. Azure will begin the simulation and display a screen similar to this:
In our example, the simulation produced 520 total messages over its 10-minute duration. These messages are also known as events, which we’ll push to CrateDB once our setup is complete.
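If you want a feel for what such events might look like before wiring anything up, the following Python sketch generates comparable messages locally. It is not the Device Simulation accelerator's code: the field names, the value ranges, and the device IDs are assumptions chosen only to mirror the three measurements mentioned above.

```python
import json
import random


def chiller_message(device_id, rng):
    """Return one simulated telemetry event as a JSON string."""
    event = {
        "device_id": device_id,
        # Value ranges below are arbitrary placeholders, not real chiller specs.
        "temperature": round(rng.uniform(60.0, 80.0), 2),
        "pressure": round(rng.uniform(100.0, 200.0), 2),
        "humidity": round(rng.uniform(40.0, 60.0), 2),
    }
    return json.dumps(event)


def simulate(device_count, rounds, seed=0):
    """Yield one message per device per round."""
    rng = random.Random(seed)  # seeded so repeated runs produce the same data
    for _ in range(rounds):
        for n in range(device_count):
            yield chiller_message("chiller-{:02d}".format(n), rng)


if __name__ == "__main__":
    for message in simulate(device_count=10, rounds=1):
        print(message)
```

Because each event is plain JSON, it can later be stored as-is in a dynamic object column without declaring the individual fields up front.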
Step #2: Configuring the Azure Portal
Next, visit the Azure Portal to begin configuration. On the Azure Portal screen, use the search bar at the top to search for “Resource Groups”. There should be a new resource group available with the same name as your solution accelerator deployment (if you deployed the solution accelerator to an existing resource group instead, a new group will not have been created). In our example, the resource group CrateDBIngest is available with the same name as the device simulation:
The resource group includes 13 listed resources. For our purposes we need to adjust the configurations of the IoT Hub and Storage account resource types, as displayed in the TYPE column.
Changing the IoT Hub Configuration
The above screenshot shows that the IoT Hub resource in our example has the name iothub-oo6xk. Select your IoT Hub resource, and then choose the Built-in endpoints option under Settings. Under the Consumer Groups section, create a new consumer group: our example uses the name cratedbingest.
Be sure to remember your chosen name and to copy the EventHub-compatible endpoint URL (both outlined in red in the screenshot below), because you will need these in a later step.
Now, choose Message routing from the menu to add a new route. Give the route a name (for our example we’ll again use cratedbingest). Next, choose the endpoint events, and set the data source as Device Telemetry Message. Save your changes to complete your IoT Hub configuration.
Changing the Storage Account Configuration
Return to the resource group, and create a new storage account, using the settings in the screenshot below:
It’s also an option to use the storage account created by the solution accelerator. However, if you were ever to delete the solution accelerator you would need to then create a new storage account for other use cases. For that reason, it may make sense to create a separate storage account for data in the first place.
We’ll use blob storage in order to “checkpoint” data from the events queue. With your storage account in place, create a blob storage. Next, under Settings, select Access keys and copy both the storage account name and the blob storage connection string (outlined in red in the screenshot below).
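As a general illustration of what checkpointing means here, the sketch below records the last processed offset so that a restarted consumer skips events it has already handled. This is not how the CrateDB event consumer is implemented; the in-memory store stands in for blob storage, and all class and function names are hypothetical.

```python
class InMemoryCheckpointStore:
    """Stand-in for blob storage: remembers the last processed offset per partition."""

    def __init__(self):
        self._offsets = {}

    def load(self, partition):
        # -1 means nothing has been processed yet for this partition.
        return self._offsets.get(partition, -1)

    def save(self, partition, offset):
        self._offsets[partition] = offset


def consume(events, partition, store, handle):
    """Process events newer than the stored checkpoint, saving progress after each one."""
    last = store.load(partition)
    for offset, body in events:
        if offset <= last:
            continue  # already handled before a restart
        handle(body)
        store.save(partition, offset)


if __name__ == "__main__":
    store = InMemoryCheckpointStore()
    queue = [(0, "first"), (1, "second")]
    consume(queue, "0", store, print)
    # Simulate a restart: the queue is replayed with one new event appended.
    queue.append((2, "third"))
    consume(queue, "0", store, print)
```

On the second call only the new event is handled, which is the behavior a durable checkpoint is meant to provide across consumer restarts.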
Step #3: Setting up CrateDB Cloud
Before setting up CrateDB Cloud, you should have your device simulator running on Azure and these key pieces of information at hand:
- Your Azure consumer group name
- The consumer group EventHub-compatible endpoint URL
- Your storage account name (the one you’re using for Azure blob storage)
- The blob storage connection string
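Before passing these values to any command, it can help to sanity-check what you copied. The sketch below assumes that both the EventHub-compatible endpoint and the storage connection string use the common Azure format of semicolon-separated Key=Value pairs; the field names and sample values in the example are placeholders, so compare them with what the Azure Portal actually shows you.

```python
def parse_connection_string(value):
    """Split an Azure-style 'Key=Value;Key=Value' string into a dict."""
    fields = {}
    for part in value.strip().split(";"):
        if not part:
            continue  # tolerate a trailing semicolon
        key, sep, val = part.partition("=")
        if not sep:
            raise ValueError("segment without '=': {!r}".format(part))
        # Split only on the first '=': base64 keys often end with '=' padding.
        fields[key] = val
    return fields


def missing_fields(fields, required):
    """Return the required field names that are absent or empty."""
    return [name for name in required if not fields.get(name)]


if __name__ == "__main__":
    # Placeholder value: replace it with the string copied from the portal.
    sample = (
        "Endpoint=sb://example.servicebus.windows.net/;"
        "SharedAccessKeyName=example-policy;"
        "SharedAccessKey=c2VjcmV0;"
        "EntityPath=example-hub"
    )
    parsed = parse_connection_string(sample)
    print(missing_fields(parsed, ["Endpoint", "SharedAccessKey", "EntityPath"]))
```

An empty list from missing_fields means every expected field was found; any names it returns point to a value that was cut off while copying.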
To begin interacting with CrateDB Cloud, you can use the Croud command-line interface tool. Install Croud by using the following command:
$ pip install croud
And then to log in:
$ croud login
A browser window will open, which you can use to log into your CrateDB account (you can also create a new account if necessary using the Sign up link):
CrateDB Cloud accounts are organized as follows:
Structurally, all CrateDB Cloud accounts belong to an organization. Each organization can have multiple projects. Each project can have multiple CrateDB clusters, and each of those clusters can have multiple event consumers.
Step #4: Deploying a CrateDB Cluster
First, create a new organization with this command:
You can now create your first project. In the next command, be sure to replace ORG_ID with your own organization ID (provided in the output table of the command above).
The output table for this command now returns the project ID (in our example it’s d24b6665-9719-42e8-9876-9b7f300dd159).
Next you can deploy a CrateDB cluster. Be sure to replace PROJECT_ID with your own project ID, and USERNAME and PASSWORD with your CrateDB admin UI username and password:
Going into detail on this last command: our example uses crated.az-gp1 as the product, at the “extra small” xs tier (other options are available). Cluster sizes are measured in units, and one unit comprises three nodes. This command deploys CrateDB version 3.3.3 in one unit (a three-node cluster) with the name cratedbingest. The output table for a successful cluster deployment displays the cluster’s ID, name, fully qualified domain name (FQDN), and URL.
While the command returns the output table right away, CrateDB Cloud takes a few minutes to finish deploying the cluster. To verify whether the cluster is available, visit the cluster URL in a browser. Once the cluster is running, this URL gives you access to its admin UI. You can log into the admin UI with the username and password you chose when creating the cluster. When you log in you’ll see an interface similar to this screenshot:
Step #5: Creating a Sensor Events Table
In order for CrateDB to consume the simulated events generated earlier, you need to create a table ready to hold that sensor data.
Using the CrateDB UI, select Console from the menu. Enter this into the console:
This statement models sensor data as a dynamic object named payload, which can hold arbitrarily structured sensor readings.
The statement also generates an event timestamp, and a second timestamp truncated to the beginning of the current week, which makes it possible to partition the table by week. Partitioning speeds up date-ranged queries because only the partitions that overlap the requested range need to be read, which reduces the total number of records to process.
To create this table, select EXECUTE QUERY.
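To see why weekly partitioning helps date-ranged queries, the following Python sketch derives the week value for a timestamp and lists the weekly partitions a given date range would touch. It assumes weeks start on Monday at 00:00 UTC, as in ISO 8601; the function names are hypothetical, and the sketch only mimics the idea rather than CrateDB's internals.

```python
from datetime import datetime, timedelta, timezone


def week_start(ts):
    """Truncate a timezone-aware datetime to Monday 00:00 UTC of its week."""
    ts = ts.astimezone(timezone.utc)
    monday = ts - timedelta(days=ts.weekday())  # weekday() is 0 on Monday
    return monday.replace(hour=0, minute=0, second=0, microsecond=0)


def weeks_touched(start, end):
    """List the weekly partitions a query over [start, end] would need to read."""
    weeks = []
    current = week_start(start)
    end_utc = end.astimezone(timezone.utc)
    while current <= end_utc:
        weeks.append(current)
        current += timedelta(days=7)
    return weeks


if __name__ == "__main__":
    start = datetime(2019, 7, 18, 15, 30, tzinfo=timezone.utc)
    end = datetime(2019, 7, 24, 9, 0, tzinfo=timezone.utc)
    for week in weeks_touched(start, end):
        print(week.isoformat())
```

A query spanning a few days touches at most two weekly partitions, no matter how many weeks of sensor data the table has accumulated.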
Step #6: Deploying a CrateDB Event Consumer
With a table in place to record sensor events, you’re ready to deploy a CrateDB event consumer that subscribes to your Device Simulation IoT Hub and receives its events.
Deploy an event consumer using the following command. Be sure to:
- Replace PROJECT_ID and CLUSTER_ID with the correct project ID and cluster ID.
- Replace EVENTHUB_DSN with the EventHub-compatible endpoint you copied earlier, and CONSUMER_GROUP with the name of the consumer group you created when configuring the Azure IoT Hub endpoint.
- Replace STORAGE_DSN with the blob storage connection string you copied earlier.
- Replace STORAGE_CONTAINER with your Azure storage account name.
This command uses the eventhub-consumer product at the xs (extra small) tier, and deploys a consumer with the name cratedbingest, to write to the raw table, in the doc schema.
A successful command will begin to send data from the Azure device simulator into CrateDB. To verify this flow of data, use the menu to navigate to the Tables screen within the CrateDB admin UI (as seen in the following screenshot):
In our example, this page verifies that the raw table contains 560 records.
Next, select QUERY TABLE to view the simulated data itself:
You can now view the data on humidity, temperature, and pressure produced by the simulated IoT sensors.
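The same check can be scripted instead of clicked through. The sketch below prepares a request for CrateDB's HTTP endpoint at /_sql using only the Python standard library. The cluster URL and credentials are placeholders, the helper names are hypothetical, and it assumes your cluster accepts HTTP Basic authentication with the admin username and password.

```python
import base64
import json
import urllib.request


def build_sql_request(cluster_url, stmt, username, password):
    """Prepare a POST to the cluster's /_sql endpoint without sending it."""
    token = base64.b64encode("{}:{}".format(username, password).encode("utf-8"))
    return urllib.request.Request(
        cluster_url.rstrip("/") + "/_sql",
        data=json.dumps({"stmt": stmt}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token.decode("ascii"),
        },
        method="POST",
    )


def run_sql(request):
    """Send the prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder URL and credentials: substitute your own cluster's values.
    req = build_sql_request(
        "https://example.invalid:4200",
        "SELECT count(*) FROM doc.raw",
        "USERNAME",
        "PASSWORD",
    )
    print(req.get_method(), req.full_url)
```

Once the placeholders are replaced, passing the request to run_sql should return a JSON document whose rows field holds the record count for the raw table.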
Using these techniques, you can mimic realistic IoT sensor data and produce events with Microsoft Azure, deploy a CrateDB cluster, and use the Azure IoT Hub to subscribe your cluster to the event stream.