Ingesting IoT Sensor Data Into S3 With an RPI3

StreamSets Data Collector Edge is a lightweight agent used to create end-to-end data flow pipelines. We'll use it to help stream data collected from a sensor.

By Rathnadevi Manivannan · Dec. 30, 2017 · Tutorial

As the volume of data produced by external source systems grows, enterprises struggle to read, collect, and ingest that data into a central database system. An edge pipeline runs on an edge device with limited resources; it receives data from another pipeline or reads data directly from the device, and can control the device based on that data.

StreamSets Data Collector (SDC) Edge, an ultra-lightweight agent, is used to create end-to-end data flow pipelines in StreamSets Data Collector and to run those pipelines to read and export data in and out of systems. In this blog, StreamSets Data Collector Edge reads data from an air pressure sensor (BMP180) attached to an IoT device (Raspberry Pi3), while StreamSets Data Collector loads the data into Amazon Simple Storage Service (S3) via MQTT.

Prerequisites

  • Install StreamSets
  • Raspberry Pi3
  • BMP180 Sensor
  • Amazon S3 Storage

Use Case

  • Read air pressure sensor data with an IoT device (Raspberry Pi3) and send it via MQTT
  • Use SDC to load the data into Amazon S3 via MQTT

Synopsis

  • Connect the BMP180 temperature/pressure sensor with your Raspberry Pi3
  • Create an edge sending pipeline
  • Create a data collector receiving pipeline

Flow diagram:

Connecting BMP180 Temperature/Pressure Sensor With a Raspberry Pi3

The I2C bus, a two-wire communication protocol, is used by the Raspberry Pi3 to communicate with embedded IoT devices such as temperature sensors, displays, and accelerometers. Its two wires are called SCL and SDA: SCL is a clock line that synchronizes all data transfers over the bus, and SDA is the data line. Devices attach to the I2C bus via the SCL and SDA lines.
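To make the bus interaction concrete, here is a small Python sketch (not part of the original tutorial) of register-level BMP180 access. It assumes the `smbus2` library; per the BMP180 datasheet, the chip-id register at 0xD0 returns 0x55, and multi-byte values arrive as MSB/LSB pairs over SDA:

```python
BMP180_ADDR = 0x77   # default I2C address of the BMP180
CHIP_ID_REG = 0xD0   # chip-id register; the BMP180 reports 0x55 here

def to_word(msb: int, lsb: int) -> int:
    """Combine an MSB/LSB byte pair read over SDA into one 16-bit word."""
    return (msb << 8) | lsb

def read_chip_id(bus) -> int:
    """Read the chip-id byte; `bus` is an smbus2.SMBus handle."""
    return bus.read_byte_data(BMP180_ADDR, CHIP_ID_REG)

# On the Pi itself (requires smbus2 and the wired sensor):
#   from smbus2 import SMBus
#   with SMBus(1) as bus:            # bus 1 matches `i2cdetect -y 1`
#       assert read_chip_id(bus) == 0x55
```

Reading the chip id is a cheap sanity check that the wiring and drivers are correct before building any pipeline.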

To enable I2C drivers on the Raspberry Pi3, perform the following:

  • Run sudo raspi-config.
  • Choose Interfacing Options from the menu as shown in the following image:

  • Choose I2C as shown in the image below:


Note: If I2C is not available in the Interfacing Options, check Advanced Options for I2C availability.

  • Click Yes to enable the I2C driver.
  • Click Yes again to load the driver by default.
  • Add i2c-dev to /etc/modules using the following commands:
pi@raspberrypi:~$ sudo nano /etc/modules
i2c-bcm2708
i2c-dev
  • Install i2c-tools using the following command:
pi@raspberrypi:~$ sudo apt-get install python-smbus i2c-tools
  • Reboot the Raspberry Pi3 by using the following command:
sudo reboot
  • Ensure that the I2C modules are loaded and made active using the following command:
pi@raspberrypi:~$ lsmod | grep i2c
  • Connect the Raspberry Pi3 with the BMP180 temperature/pressure sensor as shown in the diagram below:


  • Ensure that the hardware and software are working fine with i2cdetect using the following command:
pi@raspberrypi:~$ sudo i2cdetect -y 1

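If you prefer to verify the driver setup from a script rather than by eye, the checks above can be sketched in Python using only the standard library. This is a hypothetical helper, not part of the tutorial; it parses `lsmod` output and shells out to `i2cdetect` the same way the manual steps do:

```python
import subprocess

def i2c_modules_loaded(lsmod_output: str) -> bool:
    """Return True if the i2c kernel module appears in `lsmod` output.
    Note: /etc/modules lists i2c-dev, but lsmod shows it as i2c_dev."""
    modules = {line.split()[0] for line in lsmod_output.splitlines() if line.split()}
    return "i2c_dev" in modules

def bmp180_visible(bus: int = 1, addr_hex: str = "77") -> bool:
    """Shell out to i2cdetect and look for the BMP180's address (0x77)."""
    out = subprocess.run(["sudo", "i2cdetect", "-y", str(bus)],
                         capture_output=True, text=True).stdout
    return addr_hex in out

# On the Pi:
#   lsmod = subprocess.run(["lsmod"], capture_output=True, text=True).stdout
#   print("modules loaded:", i2c_modules_loaded(lsmod))
#   print("BMP180 visible:", bmp180_visible())
```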

Building Edge Sending Pipeline

To build an edge sending pipeline to read the sensor data, perform the following:

  • Create an SDC Edge Sending pipeline on StreamSets Data Collector.
  • Read the data directly from the device (using the I2C address) with the "Sensor Reader" component.
  • Set the I2C address to "0x77".
  • Use an Expression Evaluator to convert the temperature from Celsius to Fahrenheit.
  • Publish the data to the MQTT topic "bmp_sensor/data".
  • Download the SDC Edge pipeline's executable (Linux) and move it to the edge device (Raspberry Pi3), where the pipeline will run.
  • Start SDC Edge from the SDC Edge home directory on the edge device using the following command:
bin/edge --start=<pipeline_id>


For example:

bin/edge --start=sendingpipeline137e204d-1970-48a3-b449-d28e68e5220e

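The sending pipeline's data flow can be approximated outside StreamSets for testing the broker side. The sketch below is a hedged stand-in, not the actual SDC Edge pipeline: the field names in the JSON payload are assumptions, the broker host/port are placeholders, and only the Celsius-to-Fahrenheit conversion (what the Expression Evaluator stage performs) is taken from the tutorial:

```python
import json

def c_to_f(celsius: float) -> float:
    """Celsius to Fahrenheit, as done by the Expression Evaluator stage."""
    return celsius * 9.0 / 5.0 + 32.0

def make_payload(temp_c: float, pressure_pa: float) -> str:
    """JSON record mirroring the sensor reading (field names are assumptions)."""
    return json.dumps({"temperature_F": c_to_f(temp_c),
                       "pressure_Pa": pressure_pa})

# Publishing against a broker (requires paho-mqtt; host/port are assumptions):
#   import paho.mqtt.client as mqtt
#   client = mqtt.Client()
#   client.connect("localhost", 1883)
#   client.publish("bmp_sensor/data", make_payload(25.0, 101325.0))
#   client.disconnect()
```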

Building Data Collector Receiving Pipeline

To build a data collector receiving pipeline for storing the received data in Amazon S3, perform the following:

  • Create a receiving pipeline on the StreamSets Data Collector.
  • Use the MQTT subscriber component to consume data from the MQTT topic (bmp_sensor/data).
  • Use the Amazon S3 destination component to load the data into Amazon S3.
  • Run the receiving pipeline in the StreamSets Data Collector.
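For readers without a Data Collector instance handy, the receiving side can be sketched in Python as well. This is a hypothetical approximation, not the StreamSets pipeline: it batches records from the MQTT topic and writes them to S3 with `boto3`; the object-key naming, batch size, and bucket name are all placeholders:

```python
def batch_key(prefix: str, batch_no: int) -> str:
    """Deterministic S3 object key for one batch of records (naming is assumed)."""
    return f"{prefix}/batch-{batch_no:06d}.json"

def flush(records, s3, bucket, key):
    """Write one record per line, similar to the Amazon S3 destination's output."""
    s3.put_object(Bucket=bucket, Key=key, Body="\n".join(records).encode("utf-8"))

# Wiring it to the broker (requires paho-mqtt, boto3, and AWS credentials;
# bucket name, host, and batch size of 100 are placeholders):
#   import boto3
#   import paho.mqtt.client as mqtt
#   s3 = boto3.client("s3")
#   buffer, batch_no = [], 0
#   def on_message(client, userdata, msg):
#       buffer.append(msg.payload.decode("utf-8"))
#       if len(buffer) >= 100:
#           flush(buffer, s3, "my-sensor-bucket", batch_key("bmp_sensor", 1))
#           buffer.clear()
#   client = mqtt.Client()
#   client.on_message = on_message
#   client.connect("localhost", 1883)
#   client.subscribe("bmp_sensor/data")
#   client.loop_forever()
```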

The real-time air pressure data is collected and stored in Amazon S3.


Published at DZone with permission of Rathnadevi Manivannan. See the original article here.

Opinions expressed by DZone contributors are their own.
