IoT

IoT, or the Internet of Things, is a technological field that makes it possible for users to connect devices and systems and exchange data over the internet. Through DZone's IoT resources, you'll learn about smart devices, sensors, networks, edge computing, and many other technologies — including those that are now part of the average person's daily life.

Latest Refcards and Trend Reports

Trend Report: Edge Computing and IoT
Refcard #214: MQTT Essentials
Refcard #263: Messaging and Data Infrastructure for IoT

DZone's Featured IoT Resources

Continuous Integration and Continuous Deployment (CI/CD) for AI-Enabled IoT Systems

By Deep Manishkumar Dave
In today's fast-evolving technology landscape, the integration of Artificial Intelligence (AI) into Internet of Things (IoT) systems has become increasingly prevalent. AI-enhanced IoT systems have the potential to revolutionize industries such as healthcare, manufacturing, and smart cities. However, deploying and maintaining these systems can be challenging due to the complexity of the AI models and the need for seamless updates and deployments. This article is tailored for software engineers and explores best practices for implementing Continuous Integration and Continuous Deployment (CI/CD) pipelines for AI-enabled IoT systems, ensuring smooth and efficient operations.

Introduction To CI/CD in IoT Systems

CI/CD is a software development practice that emphasizes the automated building, testing, and deployment of code changes. While CI/CD has traditionally been associated with web and mobile applications, its principles can be effectively applied to AI-enabled IoT systems. These systems often consist of multiple components, including edge devices, cloud services, and AI models, making CI/CD essential for maintaining reliability and agility.

Challenges in AI-Enabled IoT Deployments

AI-enabled IoT systems face several unique challenges:

Resource Constraints: IoT edge devices often have limited computational resources, making it challenging to deploy resource-intensive AI models.
Data Management: IoT systems generate massive amounts of data, and managing this data efficiently is crucial for AI model training and deployment.
Model Updates: AI models require periodic updates to improve accuracy or adapt to changing conditions. Deploying these updates seamlessly to edge devices is challenging.
Latency Requirements: Some IoT applications demand low-latency processing, necessitating efficient model inference at the edge.

Best Practices for CI/CD in AI-Enabled IoT Systems

Version Control: Implement version control for all components of your IoT system, including AI models, firmware, and cloud services. Use tools like Git to track changes and collaborate effectively. Create separate repositories for each component, allowing for independent development and testing.
Automated Testing: Implement a comprehensive automated testing strategy that covers all aspects of your IoT system. This includes unit tests for firmware, integration tests for AI models, and end-to-end tests for the entire system. Automation ensures that regressions are caught early in the development process.
Containerization: Use containerization technologies like Docker to package AI models and application code. Containers provide a consistent environment for deployment across various edge devices and cloud services, simplifying the deployment process.
Orchestration: Leverage container orchestration tools like Kubernetes to manage the deployment and scaling of containers across edge devices and cloud infrastructure. Kubernetes ensures high availability and efficient resource utilization.
Continuous Integration for AI Models: Set up CI pipelines specifically for AI models. Automate model training, evaluation, and validation. This ensures that updated models are thoroughly tested before deployment, reducing the risk of model-related issues (a minimal evaluation-gate sketch appears after this article).
Edge Device Simulation: Simulate edge devices in your CI/CD environment to validate deployments at scale. This allows you to identify potential issues related to device heterogeneity and resource constraints early in the development cycle.
Edge Device Management: Implement device management solutions that facilitate over-the-air (OTA) updates. These solutions should enable remote deployment of firmware updates and AI model updates to edge devices securely and efficiently.
Monitoring and Telemetry: Incorporate comprehensive monitoring and telemetry into your IoT system. Use tools like Prometheus and Grafana to collect and visualize performance metrics from edge devices, AI models, and cloud services. This helps detect issues and optimize system performance.
Rollback Strategies: Prepare rollback strategies in case a deployment introduces critical issues. Automate the rollback process to quickly revert to a stable version in case of failures, minimizing downtime.
Security: Security is paramount in IoT systems. Implement security best practices, including encryption, authentication, and access control, at both the device and cloud levels. Regularly update and patch security vulnerabilities.

CI/CD Workflow for AI-Enabled IoT Systems

Let's illustrate a CI/CD workflow for AI-enabled IoT systems:

Version Control: Developers commit changes to their respective repositories for firmware, AI models, and cloud services.
Automated Testing: Automated tests are triggered upon code commits. Unit tests, integration tests, and end-to-end tests are executed to ensure code quality.
Containerization: AI models and firmware are containerized using Docker, ensuring consistency across edge devices.
Continuous Integration for AI Models: AI models undergo automated training and evaluation. Models that pass predefined criteria are considered for deployment.
Device Simulation: Simulated edge devices are used to validate the deployment of containerized applications and AI models.
Orchestration: Kubernetes orchestrates the deployment of containers to edge devices and cloud infrastructure based on predefined scaling rules.
Monitoring and Telemetry: Performance metrics, logs, and telemetry data are continuously collected and analyzed to identify issues and optimize system performance.
Rollback: In case of deployment failures or issues, an automated rollback process is triggered to revert to the previous stable version.
Security: Security measures, such as encryption, authentication, and access control, are enforced throughout the system.

Case Study: Smart Surveillance System

Consider a smart surveillance system that uses AI-enabled cameras for real-time object detection in a smart city. Here's how CI/CD principles can be applied:

Version Control: Separate repositories for camera firmware, AI models, and cloud services enable independent development and versioning.
Automated Testing: Automated tests ensure that camera firmware, AI models, and cloud services are thoroughly tested before deployment.
Containerization: Docker containers package the camera firmware and AI models, allowing for consistent deployment across various camera models.
Continuous Integration for AI Models: CI pipelines automate AI model training and evaluation. Models meeting accuracy thresholds are considered for deployment.
Device Simulation: Simulated camera devices validate the deployment of containers and models at scale.
Orchestration: Kubernetes manages container deployment on cameras and cloud servers, ensuring high availability and efficient resource utilization.
Monitoring and Telemetry: Metrics on camera performance, model accuracy, and system health are continuously collected and analyzed.
Rollback: Automated rollback mechanisms quickly revert to the previous firmware and model versions in case of deployment issues.
Security: Strong encryption and authentication mechanisms protect camera data and communication with the cloud.

Conclusion

Implementing CI/CD pipelines for AI-enabled IoT systems is essential for ensuring the reliability, scalability, and agility of these complex systems. Software engineers must embrace version control, automated testing, containerization, and orchestration to streamline development and deployment processes. Continuous monitoring, rollback strategies, and robust security measures are critical for maintaining the integrity and security of AI-enabled IoT systems. By adopting these best practices, software engineers can confidently deliver AI-powered IoT solutions that drive innovation across various industries.
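
To make the "Continuous Integration for AI Models" practice above more concrete, here is a minimal, hypothetical evaluation-gate script that a CI stage could run after training. It assumes a Keras model file and a held-out test set saved as NumPy arrays; the file names, the 0.90 accuracy threshold, and the exit-code convention are illustrative choices, not part of the original article.

Python

# evaluation_gate.py -- hypothetical CI step: fail the build if the candidate model underperforms.
import sys
import numpy as np
from tensorflow import keras

ACCURACY_THRESHOLD = 0.90             # illustrative quality bar for deployment
MODEL_PATH = "model_candidate.keras"  # produced by the training stage (assumed)
TEST_DATA = "test_set.npz"            # held-out data with arrays "x" and "y" (assumed)

def main() -> int:
    # Assumes the model was compiled with an accuracy metric during training.
    model = keras.models.load_model(MODEL_PATH)
    data = np.load(TEST_DATA)
    x_test, y_test = data["x"], data["y"]

    # Evaluate on data the model never saw during training.
    _, accuracy = model.evaluate(x_test, y_test, verbose=0)
    print(f"candidate accuracy: {accuracy:.3f} (threshold {ACCURACY_THRESHOLD})")

    # A non-zero exit code makes the CI job fail, blocking deployment of a weak model.
    return 0 if accuracy >= ACCURACY_THRESHOLD else 1

if __name__ == "__main__":
    sys.exit(main())

A pipeline stage would simply run python evaluation_gate.py after training; only models that pass the gate move on to containerization and deployment.
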
A Complete Guide To IoT Development Boards

By Carsten Rhod Gregersen
IoT developers know how important it is to find not just what works but what works best for a particular IoT use case. After all, a smart proximity sensor in an IoT security system is very different from a smartwatch you might use to monitor your health, take calls, send messages, and more. The same goes for development boards, which are a central part of any IoT device. Each one comes with its own set of specifications and pros and cons. Learn how to find the best boards for your specific project and what the differences are between them in this guide.

What Are IoT Development Boards?

Any IoT product will likely have gone through various iterations before it finally hits the shelves. Developers will make changes like adding or removing features, altering the user interface based on consumer feedback, etc. Unlike a market-ready IoT device, in which the circuit board and electronics inside are hidden by the outer plastic, a development board has everything you need to configure an IoT device exposed for easy access so developers can test out different settings and make changes on the fly. Development boards are just printed circuit boards with either a microcontroller or a microprocessor mounted on them and with various metal pins that an experienced developer can use to develop, prototype, and iterate different versions of various smart products. They’re also used by hobbyists to help them become more familiar with development concepts and hardware.

IoT developers can also use the boards to develop a Minimum Viable Product (MVP). The purpose of an MVP is to attract early adopters, show the potential of a final product design, and gain both funding and interest. The MVP stage involves making changes based on initial user feedback, so it’s especially important to have full access to the hardware. Finally, a development board provides the perfect staging ground for trying out a new software product, app, or IoT platform prior to rolling it out on an entire line of smart products.

What Are the Different IoT Development Board Categories?

The major categories within IoT development boards are Field Programmable Gate Arrays (FPGA), Application-Specific Integrated Circuits (ASIC), and Single Board Computers (SBC). Let’s look at each in turn.

You can think of an FPGA as a book that hasn’t been written yet. It doesn’t have a specific purpose, and it’s up to you as the writer to decide what its purpose will be. FPGAs leave all of the configuration up to you. The “field-programmable” part also means that you can “re-write” the purpose of the device at any time. That makes it best suited for the prototyping stages in which you might need to make changes on the fly or develop new functionalities for the device. FPGAs are best used by experienced developers since they don’t come with the hardware pre-configured. If you want to use an FPGA development board properly, you’ll have to learn a Hardware Description Language (HDL). Learning an HDL has been compared to learning complex digital programming languages like C. HDLs help the developer know how the device will respond to various configurations.

An ASIC development board, unlike an FPGA, comes pre-programmed with a specific purpose or purposes in mind. It leaves a lot less work to the developer. However, an ASIC board isn’t always best for prototyping because you can’t make major changes to the configuration. Finally, an SBC development board is a full-fledged computer that comes on a single board.
The Raspberry Pi might be the best-known example. An SBC has a lot of processing power. While it’s nowhere near as customizable as an FPGA, it will generally have enough different input and output (I/O) options to apply to many different IoT use cases. However, SBCs also require a lot more energy than other development board options. And although their physical size is relatively small, they’re still far too large for, say, a simple smart temperature sensor. Important Features of Development Boards FPGA, ASIC, and SBC development boards have everything you need for an IoT device system in the form of a processor, an operating system, input and output options, and more. But they will differ in what type of processor they use. An FPGA might use either a microprocessor (MPU) or a microcontroller (MCU), whereas an ASIC will generally use one or more MCUs. An SBC, on the other hand, generally relies on an MPU. The reason for these differences is that a microprocessor is higher-powered and suitable for performing multiple tasks rather than just one. It has a full operating system, generally Linux-based, and tends to be higher-cost than an MCU as a result. By contrast, a microcontroller is a low-powered device that’s better suited for a single task that an ASIC might perform. MCUs rely on something called task switching to provide a real-time response to input. Task switching means the MCU will decide which task to perform first based on priority rather than trying to perform multiple tasks at the same time. Since the MCU can focus all of its processing power on one task, that task gets finished very quickly. So, in manufacturing, where a machine might need to turn off within a millisecond of reaching a dangerous temperature in order to avoid a fire, an MCU often provides the best performance. For these reasons, you always need to know whether the development board you plan to buy has an MCU or an MPU. Here are some other features to look out for as well: Open-Source Hardware (OSHW) OSHW is just what it sounds like. Just as there are open-source software designs that allow you to see and freely change the code at will, there are open-source hardware development boards that let you freely use and change the configuration as needed. Arduino development boards are one example. Available Ports When you start to develop a new IoT device, determine what kinds of input and output ports you will need for your specific application. For example, do you want to allow USB connectivity? Do you need an HDMI port for video streaming? Operating System An SBC will generally use Linux, but what about an ASIC development board? ASICs will often rely on a real-time operating system (RTOS). An RTOS is a simplified operating system that performs a single task at a time and switches between tasks. There are also ASICs that have a tick scheduler rather than an OS. A tick scheduler just repeats a single task at certain preset intervals. Connectivity Options These days, one of the most popular features of any IoT device is mobile connectivity. In other words, most consumers want the ability to view the status of their IoT devices remotely and control those devices right from a smartphone. For those purposes, you should examine whether the development board has Bluetooth, wireless internet, 4G, or 5G connectivity capabilities. Random Access Memory If you work with computers much, you’re already familiar with the concept of RAM. 
While storage memory refers to how much data you can keep on your device, RAM refers to temporary memory that’s used to perform tasks at any given time. So, more memory means the device can perform more tasks or just one task with more processing power. Your cost will go up as RAM increases, so for simple tasks, less RAM is often actually better. Processor Choosing the best processor is far too complicated to go into detail here, but it’s enough to know that three of the top processor options are ARM, Intel x86, and AMD. Intel x86 processors are used in most personal computers for their speed and processing power, but they use more energy as a result. AMDs have even more processing power on average, whereas ARM processors have simpler instruction sets and require less power. ARM processors also tend to be physically smaller, meaning they fit better in size-constrained IoT devices. So, in low-power environments, ARM is often the best option. Examples of IoT Development Boards Some of the top IoT development boards are from the Espressif ESP32 family and the Renesas RA MCU series. Other top providers include NXP, Texas Instruments, STMicro, and Microchip, among others. ESP32 development boards can come in various sizes and levels of processing power. In general, they are low-cost, MCU-based, and low-power, which is ideal for energy-constrained applications. They support Wi-Fi and Bluetooth and have various I/Os to allow you to configure or customize the development boards as needed. Many (but not all) boards in the ESP32 family also have USB connectivity. Renesas is another top provider of IoT development boards, especially of MCU boards. The RA MCU series are general-purpose, so they can be used for many different applications and configured as necessary. RA MCUs tend to be easy to use and debug and have quick start options to let developers get started immediately. Different options within the RA series will include different input and connectivity features, so you’ll have to look at each individually to determine the best board for your use case. There’s also the STM32 family from STMicroelectronics. These are also MCU boards and come with a huge range of features and options, from high-performance and higher-energy versions to ultra-low power options and long-range wireless MCUs. They can come in both single and dual-core versions and are commonly used in consumer IoT devices. STM32 Nucleo boards are commonly used for prototyping and are very developer-friendly and compatible with a variety of development environments and platforms. Arduino is also a popular option for boards with the lowest power consumption and simplest designs. For example, the Arduino Uno is ideal for less experienced developers, and the Arduino Cloud platform makes it easy to create a connected device in minutes. There’s also the Arduino Nano for the smallest IoT devices, which is a cheap, low-power option that offers Bluetooth connectivity. Additionally, Arduino offers its own development environment. Lastly, there are the C2000 and MSP430 MCU boards from Texas Instruments. Texas Instruments provides cost-efficient options for development boards with a huge range of memory and peripheral choices. C2000 boards are specifically designed for real-time capabilities, so they are highly suitable for industrial IoT applications and are made for performance and speed. There are over 2000 different MSP430 devices, and these are designed for quick time-to-market, easy development, flexibility, and cost efficiency. 
Texas Instruments also offers starter kits, comprehensive documentation, and software resources.

Wrapping Up

A development board is normally the starting ground for any new IoT project. It allows you to make changes, adjust to feedback, and produce the best possible device for your particular use case. Now you can go in fully informed and ready to find and test out the ideal development board for your project.
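
As a small illustration of the connectivity options discussed above, the sketch below shows roughly how an ESP32-class board running MicroPython joins a Wi-Fi network. It assumes MicroPython firmware is already installed on the board; the SSID and password are placeholders, and production code would add timeouts and error handling.

Python

# wifi_connect.py -- minimal MicroPython sketch (assumption: ESP32 board with MicroPython firmware)
import network
import time

SSID = "your-network"        # placeholder
PASSWORD = "your-password"   # placeholder

wlan = network.WLAN(network.STA_IF)  # station mode: the board joins an existing network
wlan.active(True)

if not wlan.isconnected():
    wlan.connect(SSID, PASSWORD)
    # Poll until the access point accepts us (a real device would time out and retry).
    while not wlan.isconnected():
        time.sleep(0.5)

print("network config:", wlan.ifconfig())  # IP address, netmask, gateway, DNS
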
Modernizing SCADA Systems and OT/IT Integration With Apache Kafka
By Kai Wähner
Cloud Computing and Wearable Devices: A Powerful Combination
By Ayodele Johnson
Real-Time Remediation Solutions in Device Management Using Azure IoT Hub
By Nagendran Sathananda Manidas
Smart IoT Integrations With Akenza and Python

The Akenza IoT platform, on its own, excels in collecting and managing data from a myriad of IoT devices. However, it is integrations with other systems, such as enterprise resource planning (ERP), customer relationship management (CRM) platforms, workflow management, or environmental monitoring tools, that enable a complete view of the entire organizational landscape. Complementing Akenza's capabilities, and enabling smooth integrations, is the versatility of Python programming. Given how flexible Python is, the language is a natural choice when looking for a bridge between Akenza and the unique requirements of an organization looking to connect its intelligent infrastructure.

This article is about combining the two, Akenza and Python. At the end of it, you will have:

A bi-directional connection to Akenza using Python and WebSockets.
A Python service subscribed to and receiving events from IoT devices through Akenza.
A Python service that will be sending data to IoT devices through Akenza.

Since WebSocket connections are persistent, their usage enhances the responsiveness of IoT applications, which in turn helps data exchange occur in real time, fostering a dynamic and agile integrated ecosystem.

Python and Akenza WebSocket Connections

First, let's have a look at the full Python code, which we will discuss in detail below.

Python

# -*- coding: utf-8 -*-

# Zato
from zato.server.service import WSXAdapter

# ###############################################################################################
# ###############################################################################################

if 0:
    from zato.server.generic.api.outconn.wsx.common import OnClosed, \
         OnConnected, OnMessageReceived

# ###############################################################################################
# ###############################################################################################

class DemoAkenza(WSXAdapter):

    # Our name
    name = 'demo.akenza'

    def on_connected(self, ctx:'OnConnected') -> 'None':
        self.logger.info('Akenza OnConnected -> %s', ctx)

# ###############################################################################################

    def on_message_received(self, ctx:'OnMessageReceived') -> 'None':

        # Confirm what we received
        self.logger.info('Akenza OnMessageReceived -> %s', ctx.data)

        # This is an indication that we are connected ..
        if ctx.data['type'] == 'connected':

            # .. for testing purposes, use a fixed asset ID ..
            asset_id:'str' = 'abc123'

            # .. build our subscription message ..
            data = {'type': 'subscribe', 'subscriptions': [{'assetId': asset_id, 'topic': '*'}]}
            ctx.conn.send(data)

        else:
            # .. if we are here, it means that we received a message other than type "connected".
            self.logger.info('Akenza message (other than "connected") -> %s', ctx.data)

# ###############################################################################################

    def on_closed(self, ctx:'OnClosed') -> 'None':
        self.logger.info('Akenza OnClosed -> %s', ctx)

# ###############################################################################################
# ###############################################################################################

Now, deploy the code to Zato and create a new outgoing WebSocket connection. Replace the API key with your own and make sure to set the data format to JSON.

Receiving Messages From WebSockets

The WebSocket Python services that you author have three methods of interest, each reacting to specific events:

on_connected: Invoked as soon as a WebSocket connection has been opened. Note that this is a low-level event and, in the case of Akenza, it does not yet mean that you are able to send or receive messages from it.
on_message_received: The main method you will work with most of the time. Invoked each time a remote WebSocket sends or pushes an event to your service. With Akenza, this method will be invoked each time Akenza has something to inform you about, e.g., that you have subscribed to messages.
on_closed: Invoked when a WebSocket has been closed. It is no longer possible to use a WebSocket once it has been closed.

Let's focus on on_message_received, which is where the majority of the action takes place. It receives a single parameter of type OnMessageReceived, which describes the context of the received message. That is, it is in the "ctx" that you will find both the current request as well as a handle to the WebSocket connection through which you can reply to the message.

The two important attributes of the context object are:

ctx.data: A dictionary of data that Akenza sent to you.
ctx.conn: The underlying WebSocket connection through which the data was sent and through which you can send a response.

Now, the logic in on_message_received is clear:

First, we check if Akenza confirmed that we are connected (type=='connected'). You need to check the type of a message each time Akenza sends something to you and react to it accordingly.
Next, because we know that we are already connected (e.g., our API key was valid), we can subscribe to events from a given IoT asset. For testing purposes, the asset ID is given directly in the source code, but, in practice, this information would be read from a configuration file or database.
Finally, for messages of any other type, we simply log their details. Naturally, a full integration would handle them per what is required in given circumstances, e.g., by transforming and pushing them to other applications or management systems.

A sample message from Akenza will look like this:

INFO - WebSocketClient - Akenza message (other than "connected") -> {'type': 'subscribed', 'replyTo': None, 'timeStamp': '2023-11-20T13:32:50.028Z', 'subscriptions': [{'assetId': 'abc123', 'topic': '*', 'tagId': None, 'valid': True}], 'message': None}

How To Send Messages to WebSockets

An aspect not to be overlooked is communication in the other direction, that is, sending messages to WebSockets. For instance, you may have services invoked through REST APIs or perhaps from a scheduler, and their job will be to transform such calls into configuration commands for IoT devices. Here is the core part of such a service, reusing the same Akenza WebSocket connection:

Python

# -*- coding: utf-8 -*-

# Zato
from zato.server.service import Service

# ###############################################################################################
# ###############################################################################################

class DemoAkenzaSend(Service):

    # Our name
    name = 'demo.akenza.send'

    def handle(self) -> 'None':

        # The connection to use
        conn_name = 'Akenza'

        # Get a connection ..
        with self.out.wsx[conn_name].conn.client() as client:

            # .. and send data through it.
            client.send('Hello')

# ###############################################################################################
# ###############################################################################################

Note that responses to the messages sent to Akenza will be received using your first service's on_message_received method. WebSockets-based messaging is inherently asynchronous, and the channels are independent.

Now we have a complete picture of real-time IoT connectivity with Akenza and WebSockets. We are able to establish persistent, responsive connections to assets, and we can subscribe to and send messages to devices, which lets us build intelligent automation and integration architectures that make use of powerful, emerging technologies.
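
The services above run inside Zato, but the same subscription message can be sent from any WebSocket client. Below is a minimal, hypothetical sketch using the third-party websockets library; the endpoint URL, the API-key placement, and the asset ID are placeholders — consult Akenza's documentation for the actual connection and authentication details.

Python

# akenza_ws_client.py -- hypothetical standalone subscriber (not the Zato service from the article)
# Requires: pip install websockets
import asyncio
import json
import websockets

# Placeholders -- the real endpoint and authentication scheme come from Akenza's documentation.
AKENZA_WS_URL = "wss://example.akenza.io/ws?apiKey=YOUR_API_KEY"
ASSET_ID = "abc123"  # same test asset ID as used in the article

async def main() -> None:
    async with websockets.connect(AKENZA_WS_URL) as ws:
        while True:
            message = json.loads(await ws.recv())
            if message.get("type") == "connected":
                # Same subscription payload as in the Zato service above.
                subscribe = {
                    "type": "subscribe",
                    "subscriptions": [{"assetId": ASSET_ID, "topic": "*"}],
                }
                await ws.send(json.dumps(subscribe))
            else:
                print("Akenza event:", message)

if __name__ == "__main__":
    asyncio.run(main())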

By Dariusz Suchojad
What Role Will Open-Source Hardware Play in Future Designs?

Open-source software has changed the face of the software industry for good. Now, open-source hardware promises to do the same for the electronics sector. These platforms have been popular among hobbyists for years. As the IoT grows and production costs rise, the trend is creeping into commercial integrated circuit design, too. Where it proceeds from here will reshape electronics design. Here’s what that could look like. Faster IoT Growth As more manufacturers capitalize on open-source hardware, the IoT will skyrocket. There could be more than 29 billion active IoT connections by 2027, but that growth isn’t possible if all other conditions remain unchanged from today. Materials are too expensive and production is too centralized to foster enough competition. The open-source movement changes things. Open-source designs dramatically lower the barrier to entry for electronics design, of which the IoT is undoubtedly the most promising category. Businesses can customize and produce devices without lengthy, expensive R&D. In some cases, they won’t even have to manufacture critical components themselves. This democratization of device development will foster significant growth. Smaller companies will be able to compete in the crowded but ever-growing IoT market, leading to an influx of new gadgets. Rapid Innovation Relatedly, the open-source movement will pave the way for more innovative devices. Access to ready-made, proven designs shortens production timelines enough to allow smaller, more risk-taking players to enter the field. If these end devices remain open-source, they’ll spur a wave of collaboration to drive device optimization further. Open-source designs let anyone with access to electronic circuit design software see and refine others’ work. Thousands — even millions — of businesses and hobbyists can collaborate to make high-functioning devices, just as open-source software has enabled innovation. One designer may have a good foundational idea but fail to identify all EMI issues. Another user could see their design and then develop and test an alteration that’s more resistant to interference. By speeding up the development process, a business is able to complete comprehensive EMC testing sooner to improve performance and ensure compliance. As a result, new, feature-rich devices can emerge that never would’ve made it to market otherwise. Consistent Security Standards This collaboration and innovation will prove particularly useful in cybersecurity. Like in open-source software, open-source hardware opens designs to input from multiple parties to find and patch vulnerabilities. As that happens, the IoT can overcome its prominent security shortcomings. Collaborative integrated circuit design will also make it easier to adhere to regulatory guidelines. Programs like the U.S. Cyber Trust Mark incentivize higher security standards, but achieving this recognition can be difficult for smaller, less experienced developers. Improving each other’s open-source designs makes it easier for more devices to meet these requirements. Businesses could distribute Cyber Trust Mark-compliant designs for others to build on, leading to a proliferation of high-security IoT endpoints. IoT security would improve across the board without jeopardizing interoperability. A Rise in Custom Products Because open-source designs streamline development, they leave more room for further tweaks. That could lead to custom electronics becoming more common across both consumer and commercial segments. 
Open-source components tend to be versatile by design. Devs can easily make small adjustments in electronic circuit design software to create one-off, purpose-built devices based on these modular platforms. Singular, one-size-fits-all electronics could give way to highly personalized options in some markets. Personalization is important, with 62% of consumers today saying a brand would lose their loyalty if it didn’t offer personalized experiences. Typically, this tailoring applies to marketing recommendations or UI preferences, but hardware can cash in on it through open-source designs. New Revenue Streams Similarly, open-source hardware paves the way for new monetization strategies. Democratized design and production make the electronics market fairer but could also make it crowded. However, businesses can counteract that competition by profiting from open-source designs and devices. Electronics companies could build and sell development kits containing basic microcontrollers and tools to program them. Instead of relying on proprietary devices outperforming others, they’d profit from making it easier to get into device design. Current open-source giants have found considerable success through this model. Alternatively, some enterprises could offer to manufacture smaller business’s open-source designs. Providing low-cost, small-scale manufacturing for devices using the same templates would be relatively straightforward but increasingly profitable as more parties try to capitalize on the open-source movement. Obstacles in the Road Ahead The extent of open-source hardware’s impact on electronics design is still uncertain. While it could likely lead to all these benefits, it also faces several challenges to mainstream adoption. The most significant of these is the volatility and high costs of the necessary raw materials. Roughly 70% of all silicon materials come from China. This centralization makes prices prone to fluctuations from local disruptions in China or throughout the supply chain. Similarly, long shipping distances raise related prices for U.S. developers. Even if integrated circuit design becomes more accessible, these costs keep production inaccessible, slowing open-source devices’ growth. Similarly, industry giants may be unwilling to accept the open-source movement. While open-source designs open new revenue streams, these market leaders profit greatly from their proprietary resources. The semiconductor fabs supporting these large companies are even more centralized. It may be difficult for open-source hardware to compete if these organizations don’t embrace the movement. These obstacles don’t mean open-source alternatives can’t be successful, but they cast a shadow over their large-scale industry impact. This movement has experienced significant growth lately, so it will likely change the electronics market in some capacity. How far that impact goes depends on how the industry responds to these challenges. Open-Source Hardware Will Shape the Future Open-source hardware may seem like a distant dream, but consider its software counterpart. It’s changed the industry despite the popularity of proprietary alternatives and development complications. When dev tools and methods become more accessible, the same thing could happen in hardware. As open-source integrated circuit design grows, it could make electronics a more diverse, resilient, and democratized industry. Achieving that will be difficult, but the results will benefit virtually all parties involved.

By Emily Newton
The Challenge of Building a Low-Power Network Reachable IoT Device

Building a battery-powered IoT device is a very interesting challenge. Advancements in technology have enabled IoT module chips to perform more functions than ever before. There are chipsets available today that are smaller than a penny in size, yet they are equipped with GPS, Wi-Fi, cellular connectivity, and application processing capabilities. With these developments, it is an opportune time for the creation of small form factor IoT devices that can connect to the cloud and solve interesting problems in all domains.

There are plenty of applications where the IoT device needs to be accessed externally by an application to request a specific task to be executed. Think of a smart home lock device that needs to open and close the door lock from a mobile application. Or an asset tracking device that must start logging its location upon request. The basic idea is to be able to issue a command to these devices and make them perform an action.

Problem Statement

At first glance, it may seem pretty straightforward to achieve this. Connect the IoT device to the network, pair it with a server, and have it wait for commands. What is the big deal, one may ask? There are a few hurdles to making it work within a low-power, battery-driven setup. Cellular radio is expensive on battery. Keeping the device constantly connected to the network will drain the battery really fast. Maintaining a constant network connection can consume a significant amount of power and reduce the battery life span to an unacceptable level for most battery-powered applications, where the expected battery life span is in the months-to-years range. Such a device must be designed to conserve power as much as possible to extend its operational life.

Applying first-principles thinking to the problem of high power consumption caused by cellular connections, we can ask a critical question: Is it necessary for the device to remain connected to the network at all times? In most cases, the device may not need to transfer or receive any data, and it may not make sense to keep it connected to the network. What if we could make it stay in 'sleep' mode and periodically wake it up to check for any incoming commands? This would mean the device only connects to the network when necessary, hence saving power. This is the fundamental idea behind the LTE-M power-saving modes. There are two modes that LTE-M offers: PSM and eDRX. Let's look at them in detail.

PSM Mode

When in PSM mode, an IoT device can request to enter a dormant state (RRC Idle) for a duration of up to 413 days. This mode is particularly useful for devices that need to periodically transmit data over the network and then go back to sleep. It is important to note that the device is entirely inaccessible while it is in the dormant state. This is the power profile of a device in PSM mode.

[Fig 1: PSM Power profile]

The device remains in a dormant state for an extended period until it wakes up to transfer data. Additionally, there is a brief paging window during which the device is reachable, after which it returns to sleep mode. Requesting longer sleep times can result in greater power savings, but it may also require sacrificing data transfer frequency. Ultimately, it is up to the developer to determine the ideal tradeoff between sleep time and required data resolution, depending on the application requirements. A smart energy meter might be perfectly fine sending updates once per day, but a GPS tracker that needs to send more frequent updates would need shorter sleep times.
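
To see why this sleep/wake tradeoff matters, here is a rough, back-of-envelope battery model in Python. All the current and timing figures are illustrative assumptions, not measurements from the article; the point is only that average current — and therefore battery life — is dominated by how often the device wakes the cellular radio.

Python

# psm_duty_cycle.py -- illustrative arithmetic only; all numbers below are assumptions.

BATTERY_MAH = 2400.0        # e.g., a pair of AA-class cells
SLEEP_CURRENT_MA = 0.005    # ~5 microamps in the PSM dormant state (assumed)
ACTIVE_CURRENT_MA = 120.0   # radio transmitting/receiving (assumed)
ACTIVE_SECONDS = 10.0       # time awake per reporting cycle (assumed)

def battery_life_days(wakeups_per_day: float) -> float:
    """Estimate battery life for a device that wakes N times per day."""
    active_s = wakeups_per_day * ACTIVE_SECONDS
    sleep_s = 24 * 3600 - active_s
    # Charge drawn per day, in mAh: current (mA) * time (h).
    mah_per_day = (ACTIVE_CURRENT_MA * active_s + SLEEP_CURRENT_MA * sleep_s) / 3600.0
    return BATTERY_MAH / mah_per_day

for wakeups in (1, 24, 24 * 60):  # once a day, hourly, every minute
    print(f"{wakeups:5d} wake-ups/day -> ~{battery_life_days(wakeups):8.0f} days")

Even with these generous assumptions, moving from one wake-up per minute to one per hour changes the estimate from days to the better part of a year, which is exactly the tradeoff PSM exposes to the developer.
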
eDRX Mode

The term eDRX stands for "extended Discontinuous Reception." The concept is similar to PSM mode, where the device goes into RRC Idle, but unlike PSM mode, in eDRX the device wakes up periodically to "check in" with the network every eDRX cycle period. The maximum sleep time for devices using LTE-M eDRX is up to 43 minutes.

[Fig 2: eDRX power profile]

Compared to eDRX, a PSM device requires significantly more time to wake up from sleep mode and remain active, as it must establish a connection with the network before it can receive application data. When using eDRX, the device only needs to wake up and listen for 1 ms, whereas with PSM, the device must wake up, receive, and transmit control messages for approximately 100-200 ms before it can receive a message from the cloud application, resulting in a 100-fold difference. [1]

While these power modes seem promising for a low-power network IoT device, building a reachable device, i.e., a device to which a message packet can be sent externally over the network, is still challenging. We will go into the details of the challenges that are encountered with TCP, SMS, and UDP-based communication protocols.

Communication Protocol Considerations

There are a variety of IoT communication protocols available for different kinds of IoT applications, but for the purpose of this article, we will categorize them into two kinds:

Connection-based: TCP-based MQTT, WebSockets, HTTP
Connection-less: UDP-based MQTT-SN, CoAP, LWM2M, and SMS

Connection-Based Protocols

These communication protocols require an active connection over the network for the data to transfer. The connection is established over a three-way handshake, and it looks something like this:

[Fig 3: Three-way handshake illustration]

Notice that before any actual payload data can be transferred (shown in green), there are three network payloads that need to be sent back and forth. This overhead may seem little for a system where data consumption or battery life isn’t an issue, but in our case, this would cost a lot of precious battery resources, sometimes consuming more data than the actual payload just to initiate a connection. Secondly, notice in the figure that the client is the one initiating the connection; this is intentional and a big limitation of connection-based (TCP, MQTT, etc.) protocols in the application of a network-reachable device, which requires that the device responds to the server’s request. One could have the client initiate a first connection and keep the connection open, but to keep the connection alive, keep-alive packets need to be sent every pre-defined interval, which again costs a tremendous amount of battery, making these kinds of protocols unusable and impractical for a low-power network-reachable device.

[Fig 4: TCP-based communication protocol power profile]

Connection-Less Protocols

Connection-less protocols require no active connection with the server for the data transfer to occur. These offer a much more promising outlook for the network-reachable IoT device application. Some examples of such protocols are MQTT-SN, CoAP, and LWM2M, which are implemented on top of UDP. These protocols are lightweight and optimized for power and resource consumption, working in favor of the resource-restricted platform they’re intended to be used on. There is, however, a logistical challenge with these networks with regard to the device being network reachable.

Despite the protocols themselves being connection-less, the underlying network still requires occasional data transfer to maintain active NAT translations. NAT (Network Address Translation) devices, commonly used in home and enterprise networks, have timeouts for translation entries in their tables. If a device using connection-less protocols remains dormant for an extended period, exceeding the NAT timeout, the corresponding translation entry is purged, making the original client unreachable by the network. To address this, periodic keep-alive messages or other mechanisms have to be implemented to maintain NAT translations and ensure continuous network reachability for IoT devices using connection-less protocols, but that is very expensive on the battery and not ideal for an IoT device where battery is a scarce resource.

Conclusion

As we have explored in this article, building a network-reachable device is a challenge on all layers, from device constraints to infrastructure limitations. UDP-based communication protocols offer a promising future for low-power network-reachable devices because, fundamentally, the logistical challenge of the NAT timeouts mentioned above can be eliminated by using a private network and bypassing the NAT altogether. As enough interest grows in the need for network-reachable devices, we should see specific commercial network infrastructures that offer out-of-the-box solutions for these challenges.
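
For a concrete feel of the connection-less approach and the NAT keep-alive cost discussed above, here is a minimal, hypothetical UDP reporter in Python. The server address, intervals, and payload format are assumptions; a real deployment would use a protocol such as CoAP or MQTT-SN on top of UDP, plus authentication and encryption.

Python

# udp_keepalive_demo.py -- hypothetical sketch; addresses, intervals, and payloads are assumptions.
import json
import socket
import time

SERVER = ("198.51.100.10", 5683)   # placeholder server address/port
KEEPALIVE_INTERVAL_S = 25          # must stay below the NAT's UDP timeout (often 30-60 s)
REPORT_INTERVAL_S = 300            # real data is sent far less often than keep-alives

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.settimeout(2.0)

last_report = 0.0
while True:
    now = time.time()
    if now - last_report >= REPORT_INTERVAL_S:
        # The actual payload: a tiny datagram, no handshake required.
        sock.sendto(json.dumps({"device": "lock-01", "state": "closed"}).encode(), SERVER)
        last_report = now
    else:
        # Empty datagram just to keep the NAT translation alive so the server can
        # still reach us; this is the battery cost the article is pointing at.
        sock.sendto(b"", SERVER)

    # Check whether the server pushed a command while the NAT mapping is still valid.
    try:
        data, _ = sock.recvfrom(1024)
        print("command from server:", data)
    except socket.timeout:
        pass

    time.sleep(KEEPALIVE_INTERVAL_S)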

By Sahas Chitlange
Enhancing IoT Security: The Role of Security Information and Event Management (SIEM) Systems

The rapid growth of the Internet of Things (IoT) has revolutionized the way we connect and interact with devices and systems. However, this surge in connectivity has also introduced new security challenges and vulnerabilities. IoT environments are increasingly becoming targets for cyber threats, making robust security measures essential. Security Information and Event Management (SIEM) systems, such as Splunk and IBM QRadar, have emerged as critical tools in bolstering IoT security. In this article, we delve into the pivotal role that SIEM systems play in monitoring and analyzing security events in IoT ecosystems, ultimately enhancing threat detection and response. The IoT Security Landscape Challenges and Complexities IoT environments are diverse, encompassing a wide array of devices, sensors, and platforms, each with its own set of vulnerabilities. The challenges of securing IoT include: Device diversity: IoT ecosystems comprise devices with varying capabilities and communication protocols, making them difficult to monitor comprehensively. Data volume: The sheer volume of data generated by IoT devices can overwhelm traditional security measures, leading to delays in threat detection. Real-time threats: Many IoT applications require real-time responses to security incidents. Delayed detection can result in significant consequences. Heterogeneous networks: IoT devices often connect to a variety of networks, including local, cloud, and edge networks, increasing the attack surface. The Role of SIEM Systems in IoT Security SIEM systems are designed to aggregate, analyze, and correlate security-related data from various sources across an organization's IT infrastructure. When applied to IoT environments, SIEM systems offer several key benefits: Real-time monitoring: SIEM systems provide continuous monitoring of IoT networks, enabling organizations to detect security incidents as they happen. This real-time visibility is crucial for rapid response. Threat detection: By analyzing security events and logs, SIEM systems can identify suspicious activities and potential threats in IoT ecosystems. This proactive approach helps organizations stay ahead of cyber adversaries. Incident response: SIEM systems facilitate swift incident response by alerting security teams to anomalies and security breaches. They provide valuable context to aid in mitigation efforts. Log management: SIEM systems collect and store logs from IoT devices, allowing organizations to maintain a comprehensive record of security events for auditing and compliance purposes. Splunk: A Leading SIEM Solution for IoT Security Splunk is a renowned SIEM solution known for its powerful capabilities in monitoring and analyzing security events, making it well-suited for IoT security. Key features of Splunk include: Data aggregation: Splunk can collect and aggregate data from various IoT devices and systems, offering a centralized view of the IoT security landscape. Advanced analytics: Splunk's machine-learning capabilities enable it to detect abnormal patterns and potential threats within IoT data streams. Real-time alerts: Splunk can issue real-time alerts when security events or anomalies are detected, allowing for immediate action. Custom dashboards: Splunk allows organizations to create custom dashboards to visualize IoT security data, making it easier for security teams to interpret and respond to events. IBM QRadar: Strengthening IoT Security Posture IBM QRadar is another SIEM solution recognized for its effectiveness in IoT security. 
It provides the following advantages: Threat intelligence: QRadar integrates threat intelligence feeds to enhance its ability to detect and respond to IoT-specific threats. Behavioral analytics: QRadar leverages behavioral analytics to identify abnormal activities and potential security risks within IoT networks. Incident forensics: QRadar's incident forensics capabilities assist organizations in investigating security incidents within their IoT environments. Compliance management: QRadar offers features for monitoring and ensuring compliance with industry regulations and standards, a critical aspect of IoT security. In the realm of IoT security, SIEM systems such as Splunk and IBM QRadar serve as indispensable guardians. Offering real-time monitoring, advanced threat detection, and swift incident response, they empower organizations to fortify their IoT ecosystems. As the IoT landscape evolves, the integration of these SIEM solutions becomes paramount in ensuring the integrity and security of connected devices and systems. Embracing these tools is a proactive stride toward a safer and more resilient IoT future.
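
As an illustration of how IoT security events can reach a SIEM in practice, the sketch below posts a JSON event to Splunk's HTTP Event Collector (HEC). It is a minimal example assuming HEC is enabled and a valid token is available; the host name, token, and event fields are placeholders, and batching and error handling are omitted.

Python

# send_to_splunk_hec.py -- hypothetical forwarder; host, token, and fields are placeholders.
# Requires: pip install requests
import time
import requests

HEC_URL = "https://splunk.example.com:8088/services/collector/event"  # placeholder host
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"                    # placeholder token

def send_security_event(device_id: str, message: str, severity: str) -> None:
    """Forward a single IoT security event to the SIEM via Splunk HEC."""
    payload = {
        "time": time.time(),
        "sourcetype": "_json",
        "source": "iot-gateway",
        "event": {
            "device_id": device_id,
            "severity": severity,
            "message": message,
        },
    }
    response = requests.post(
        HEC_URL,
        json=payload,
        headers={"Authorization": f"Splunk {HEC_TOKEN}"},
        timeout=5,
    )
    response.raise_for_status()

if __name__ == "__main__":
    send_security_event("camera-042", "Repeated failed SSH logins", "high")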

By Deep Manishkumar Dave
Exploring IoT Integration With Wearable Devices From a Software Development Angle

In the vanguard of the ever-evolving digital world, the rapid increase of wearable contrivances and the expanding realm of the Internet of Things (IoT) have firmly gripped the world's technological spotlight. The amalgamation of IoT with wearable devices has arisen as a pivotal focus within the tech sphere, bestowing upon us an abundance of potential that ensures shifting progress across sectors as diverse as healthcare, fitness, transportation, and beyond. In the subsequent discourse, we shall embark on an in-depth exploration of the intricate domain that is the integration of IoT and wearable technology. Our journey will encompass an exploration of how significant it has become, the formidable challenges that lie in its path, and the driving force behind its software development. Surveying the Landscape of Wearable Devices From its humble beginnings as a simple wristwatch, wearable technology has evolved significantly. Nowadays, it encompasses a diverse range of devices, including fitness trackers, smart glasses, smart clothing, and even implantable medical tools. These cutting-edge technologies diligently gather data from both the human body and the surrounding environment. The information collected is then transmitted to other devices, networks, or cloud platforms for thorough examination and analysis. Some main distinguishing factors that make wearable devices unique include; Data Sensing: Wearable devices gather a wide range of information, encompassing measurements like heart rate, step count, body temperature, and various other metrics. Data Processing: Some wearables intricately handle data directly within them, while others send it to smartphones or the cloud for thorough analysis. Data Transmission: The seamless transmission of data often ensues through an array of communication protocols, including Bluetooth, Wi-Fi, or cellular networks. User Interface: Wearable devices should offer a multifaceted user experience filled with displays, auditory interfaces, and haptic feedback mechanisms. The Synergy of IoT Integration IoT represents a vast interconnected realm where devices harmoniously converse with one another and interface seamlessly with external systems. This interconnectedness contributes to the creation of a more astute and automated environment. When wearable devices seamlessly integrate into this IoT ecosystem, they not only coexist but also share and receive data from other connected devices, thus amplifying their functionality. This union ushers forth a multitude of advantages, including: Seamless Data Exchange: Wearable devices have increased value when they can readily exchange data with an array of IoT pairs such as smartphones, laptops, or home automation systems. Enhanced Monitoring: Practitioners in the healthcare industry can exercise real-time remote patient monitoring, simplifying the provision of timely and responsive care. Improved User Experience: By connecting wearables with smart home systems, users can create personalized experiences, such as setting the thermostat to their preferred temperature when they arrive home. Data Analytics: The amalgamated data streams from wearable devices and diverse IoT sensors transcend the mere collection of information; they metamorphose into valuable insights, offering glimpses into user behavior, health patterns, and beyond. 
Efficient Resource Management: Within sectors like manufacturing and logistics, IoT integration serves as the linchpin, granting businesses the power to optimize resource allocation, thus curbing operational expenses. Software Development for Wearable IoT Software development plays a pivotal role in ensuring the success of the rise of wearable technology integration. Developers need to consider various aspects when creating software for this complex environment: Device Compatibility Ensuring compatibility of software is the focal point when it comes to a multitude of wearable devices, each featuring its own distinct operating system, sensors, and capabilities. Developers are confronted with the formidable challenge of creating applications that go beyond device limitations, either by adopting designs that are compatible with other devices or by customizing software complexities for various platforms. Data Security Given the sensitive nature of the data collected by wearables, the utmost importance lies in data security. Encryption during transmission and robust security measures to thwart unauthorized access must be implemented. Data protection and secure authentication mechanisms must be prioritized by software developers. Low-Power Consumption Wearable devices often run on limited battery power. Therefore, the software must be optimized for energy efficiency. Developers need to write code that minimizes resource consumption and includes features like background processing and power-saving modes. Real-Time Data Processing The heartbeat of numerous wearable IoT applications echoes with the rhythm of real-time data processing. Here, software must perform a ballet of swift data analysis, offering timely insights or triggering actions in the blink of an eye. Pioneering the pursuit of minimal latency, developers sculpt codes that have a pattern of rapid data comprehension, thereby bestowing users with instantaneous responsiveness. Cloud Integration For many wearable devices, cloud integration is essential. Cloud platforms enable data storage, analysis, and accessibility from multiple devices. Software developers must create applications that can interact with cloud services with ease. Challenges and Considerations While IoT-wearable integration presents significant opportunities, it also comes with its share of challenges and considerations: Privacy Concerns Privacy concerns are aroused by the convergence of wearable technology and the gathering of personal data. Users may find the collection of sensitive health and location information unsettling. As a result, meticulous management by developers and organizations becomes essential, thus necessitating complete transparency and reliable opt-in/opt-out mechanisms. Regulatory Compliance Especially in the health-centric sphere, adherence to stringent regulations like the US's HIPAA is obligatory. Software developers bear the weighty responsibility of ensuring their applications align with these regulations, steering clear of legal entanglements. Data Quality and Accuracy In order for wearable gadgets to provide significant insights, it is imperative for the data accuracy to be non-negotiable. The responsibility of executing meticulous data validation and error-checking procedures falls on the shoulders of software developers. Accuracy holds utmost importance, as flawed data has the potential to compromise the reliability of conclusions and recommendations derived from analyses. 
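
To ground the data-quality point above, here is a small, hypothetical validation step that a wearable companion app or gateway might run before forwarding readings to the cloud. The plausible heart-rate range and the handling of out-of-range samples are illustrative choices, not requirements from the article.

Python

# validate_readings.py -- illustrative data-quality check for wearable sensor samples.
from dataclasses import dataclass
from typing import Iterable, List

# Physiologically plausible heart-rate bounds (illustrative assumption).
HR_MIN_BPM = 30
HR_MAX_BPM = 220

@dataclass
class Sample:
    timestamp: float   # Unix seconds
    heart_rate: float  # beats per minute as reported by the sensor

def clean_heart_rate(samples: Iterable[Sample]) -> List[Sample]:
    """Drop samples that are out of range or out of chronological order."""
    cleaned: List[Sample] = []
    last_ts = float("-inf")
    for s in samples:
        if not (HR_MIN_BPM <= s.heart_rate <= HR_MAX_BPM):
            continue                 # sensor glitch or loose strap: discard
        if s.timestamp <= last_ts:
            continue                 # duplicate or reordered packet: discard
        cleaned.append(s)
        last_ts = s.timestamp
    return cleaned

if __name__ == "__main__":
    raw = [Sample(1.0, 72), Sample(2.0, 0), Sample(1.5, 75), Sample(3.0, 74)]
    print([s.heart_rate for s in clean_heart_rate(raw)])  # -> [72, 74]
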
Interoperability Ensuring that wearable devices can communicate with other IoT devices can be a complex task. Developers need to consider interoperability standards like Bluetooth, Zigbee, or Thread to achieve seamless integration. Conclusion The fusion of IoT with wearable technology is reshaping numerous sectors. From healthcare and fitness to agriculture and retail, this convergence pushes industries into new territory. Navigating this landscape demands capable software development and brings challenges that span device compatibility, data security, frugal power consumption, and real-time data processing. Software developers are the ones who turn these foundational elements into working products, moving us toward a future where technology augments more and more of everyday life.

By Emmanuel Ohaba
Streaming ESP32-CAM Images to Multiple Browsers via MQTT

In this tutorial, you'll learn how to publish images from an ESP32-CAM board to multiple browser clients using MQTT (Message Queuing Telemetry Transport). This setup will enable you to create a platform that functions similarly to a live video stream, viewable by an unlimited number of users. Prerequisites Before diving in, make sure you have completed the following prerequisite tutorials: Your First Xedge32 Project: This tutorial covers essential setup and configuration instructions for Xedge32 running on an ESP32. Your First MQTT Lua Program: This tutorial introduces the basics of MQTT and how to write a simple Lua program to interact with MQTT. By building on the knowledge gained from these foundational tutorials, you'll be better equipped to follow along with this tutorial. Publishing ESP32-CAM Images via MQTT In the MQTT CAM code, our primary focus is publishing images without subscribing to other events. This publishing operation is managed by a timer event, which publishes images based on the intervals specified. Setting Up the Timer First, let's create a timer object. This timer will trigger the publishImage function at specific intervals. timer = ba.timer(publishImage) To interact with the ESP32 camera, initialize a camera object like so: cam = esp32.cam(cfg) The cfg parameter represents a configuration table. Important: make sure it matches the settings for your particular ESP32-CAM module. See the Lua CAM API for details. Handling MQTT Connection Status For monitoring MQTT connections, use the following callback function: Lua local function onstatus(type, code, status) if "mqtt" == type and "connect" == code and 0 == status.reasoncode then timer:set(300, false, true) -- Activate timer every 300 milliseconds return true -- Accept connection end timer:cancel() return true -- Keep trying end The above function starts the timer when a successful MQTT connection is made. If the connection drops, it cancels the timer but will keep attempting to reconnect. Image Publishing via Timer Callback The core of the image publishing mechanism is the timer callback function, publishImage. This function captures an image using the camera object and publishes it via MQTT. The timer logic supports various timer types. Notably, this version operates as a Lua coroutine (akin to a thread). Within this coroutine, it continually loops and hibernates for the duration defined by the timer through coroutine.yield(true). Lua function publishImage() local busy = false while true do if mqtt:status() < 2 and not busy then busy = true -- thread busy ba.thread.run(function() local image = cam:read() mqtt:publish(topic, image) busy = false -- no longer running end) end coroutine.yield(true) -- sleep end end The above function maintains flow control by not publishing an image if two images already populate the MQTT client's send queue. The cam:read function can be time-consuming -- not in human time, but in terms of microcontroller operations. As such, we offload the task of reading from the CAM object onto a separate thread. While this step isn't strictly necessary, it enhances the performance of applications juggling multiple operations alongside reading from the CAM. For a deeper dive into the threading intricacies, you are encouraged to refer to the Barracuda App Server’s documentation on threading. 
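Before looking at the complete program, it can help to have a quick way to confirm that frames are actually being published. The following is a minimal sketch of a command-line subscriber using the paho-mqtt Python package (an assumption; any MQTT client will do). The topic and broker must match the values used in the Lua code.
Python
# Minimal sketch (paho-mqtt 1.x callback style): subscribe to the camera topic
# and write each received JPEG frame to disk.
import paho.mqtt.client as mqtt

TOPIC = "/xedge32/espcam/USA/92629"
BROKER = "broker.hivemq.com"

def on_connect(client, userdata, flags, rc):
    print("connected, rc =", rc)
    client.subscribe(TOPIC)

def on_message(client, userdata, msg):
    # Each MQTT payload is one JPEG image published by the ESP32-CAM.
    with open("last_frame.jpg", "wb") as f:
        f.write(msg.payload)
    print("received frame:", len(msg.payload), "bytes")

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect(BROKER, 1883)  # plain TCP; the browser client below uses secure websockets
client.loop_forever()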
The following shows the complete MQTT CAM code: Lua local topic = "/xedge32/espcam/USA/92629" local broker = "broker.hivemq.com" -- Settings for 'FREENOVE ESP32-S3 WROOM' CAM board local cfg={ d0=11, d1=9, d2=8, d3=10, d4=12, d5=18, d6=17, d7=16, xclk=15, pclk=13, vsync=6, href=7, sda=4, scl=5, pwdn=-1, reset=-1, freq="20000000", frame="HD" } -- Open the cam local cam,err=esp32.cam(cfg) assert(cam, err) -- Throws error if 'cfg' incorrect local timer -- Timer object; set below. -- MQTT connect/disconnect callback local function onstatus(type,code,status) -- If connecting to broker succeeded if "mqtt" == type and "connect" == code and 0 == status.reasoncode then timer:set(300,false,true) -- Activate timer every 300 milliseconds trace"Connected" return true -- Accept connection end timer:cancel() trace("Disconnect or connect failed",type,code) return true -- Keep trying end -- Create MQTT client local mqtt=require("mqttc").create(broker,onstatus) -- Timer coroutine function activated every 300 millisecond function publishImage() local busy=false while true do --trace(mqtt:status(), busy) -- Flow control: If less than 2 queued MQTT messages if mqtt:status() < 2 and not busy then busy=true ba.thread.run(function() local image,err=cam:read() if image then mqtt:publish(topic,image) else trace("cam:read()",err) end busy=false end) end coroutine.yield(true) -- sleep end end timer = ba.timer(publishImage) While we have already covered the majority of the program's functionality, there are a few aspects we haven't touched upon yet: Topic and Broker Configuration: local topic = "/xedge32/espcam/USA/92629": Sets the MQTT topic where the images will be published. Change this topic to your address. local broker = "broker.hivemq.com": Specifies the MQTT broker's address. The public HiveMQ broker is used in this example. ESP32 Camera Configuration (cfg): This block sets up the specific pin configurations and settings for your ESP32 CAM board. Replace these settings with those appropriate for your hardware. Creating the MQTT Client: The MQTT client is created with the require("mqttc").create(broker, onstatus) function, passing in the broker address and the onstatus callback. Creating the Timer Object for publishImage: The timer is created by calling ba.timer and passing in the publishImage callback, which will be activated at regular intervals. This is the mechanism that continually captures and publishes images. Subscribing to CAM Images With a JavaScript-Powered HTML Client To visualize the images published by the ESP32 camera, you can use an HTML client. The following client will subscribe to the same MQTT topic to which the camera is publishing images. The client runs purely in your web browser and does not require any server setup. 
The entire code for the HTML client is shown below: HTML <!DOCTYPE html> <html lang="en"> <head> <title>Cam Images Over MQTT</title> <script data-fr-src="https://cdnjs.cloudflare.com/ajax/libs/mqtt/5.0.0-beta.3/mqtt.min.js"></script> <script> const topic="/xedge32/espcam/USA/92629"; const broker="broker.hivemq.com"; window.addEventListener("load", (event) => { let img = document.getElementById("image"); let msg = document.getElementById("msg"); let frameCounter=0; const options = { clean: true, connectTimeout: 4000, port: 8884 // Secure websocket port }; const client = mqtt.connect("mqtts://"+broker+"/mqtt",options); client.on('connect', function () { msg.textContent="Connected; Waiting for images..."; client.subscribe(topic); }); client.on("message", (topic, message) => { const blob = new Blob([message], { type: 'image/jpeg' }); img.src = URL.createObjectURL(blob); frameCounter++; msg.textContent = `Frames: ${frameCounter}`; }); }); </script> </head> <body> <h2>Cam Images Over MQTT</h2> <div id="image-container"> <img id="image"/> </div> <p id="msg">Connecting...</p> </body> </html> MQTT JavaScript Client At the top of the HTML file, the MQTT JavaScript library is imported to enable MQTT functionalities. This is found within the <script data-fr-src=".......mqtt.min.js"></script> line. Body Layout The HTML body contains a <div> element with an id of "image-container" that will house the incoming images, and a <p> element with an id of "msg" that serves as a placeholder for status messages. MQTT Configuration In the JavaScript section, two constants topic and broker are defined. These must correspond to the topic and broker configurations in your mqttcam.xlua file. Connecting to MQTT Broker The client initiates an MQTT connection to the specified broker using the mqtt.connect() method. It uses a secure websocket port 8884 for this connection. Handling Incoming Messages Upon a successful connection, the client subscribes to the topic. Any incoming message on this topic is expected to be a binary JPEG image. The message is converted into a Blob and displayed as the source for the image element. Frame Counter A frameCounter variable keeps count of the incoming frames (or images) and displays this count as a text message below the displayed image. By having this HTML file open in a web browser, you'll be able to visualize in real-time the images that are being published to the specified MQTT topic. Preparing the Code Step 1: Prepare the Lua Script as Follows As explained in the tutorial Your First Xedge32 Project, when the Xedge32 powered ESP32 is running, use a browser and navigate to the Xedge IDE. Create a new Xedge app called "cam" and LSP enable the app. Expand the cam app now visible in the left pane tree view. Right-click the cam app and click New File in the context menu. Type camtest.lsp and click Enter. Open the camtest.lsp file at GitHub and click the copy raw file button. Go to the Xedge IDE browser window and paste the content into the camtest.lsp file. Important: Adjust the cfg settings in camtest.lsp to match your specific ESP32 CAM board. See the Lua CAM API for details. Click Save and then Click Open to test your cam settings. Make sure you see the image generated by the LSP script before proceeding. Right-click the cam app and click New File in the context menu. Type mqttcam.xlua and click Enter. Open the mqttcam.xlua file at GitHub and click the copy raw file button. Go to the Xedge IDE browser window and paste the content into the mqttcam.xlua file. 
Using the Xedge editor, update the topic variable /xedge32/espcam/USA/92629 in the Lua script to your desired MQTT topic. Important: Copy the cfg settings from camtest.lsp and replace the cfg settings in mqttcam.xlua with the settings you tested in step 9. Click the Save & Run button to save and start the example. Step 2: Prepare the HTML/JS File as follows Download mqttcam.html, open the file in any editor, and ensure the topic in the HTML file matches the topic you set in the Lua script. Save the mqttcam.html file. Open mqttcam.html: Double-click the mqttcam.html file or drag and drop it into your browser. Note: this file is designed to be opened directly from the file system. You do not need a web server to host this file. Observe the Output: The webpage will display the images being published by the ESP32 CAM. The number of frames received will be displayed below the image. Potential Issues with ESP32 CAM Boards and Solutions ESP32 CAM boards are widely recognized for their versatility and affordability. However, they're not without their challenges. One of the significant issues users might face with the ESP32 CAM boards is interference between the camera read operation and the built-in WiFi module. Let's delve into the specifics: Problem: Interference and WiFi Degradation When the ESP32 CAM board is in operation, especially during the camera's read operation, it can generate noise. This noise interferes with the built-in WiFi, which results in: Reduced Range: The distance over which the WiFi can effectively transmit and receive data can be notably decreased. Decreased Throughput: The speed and efficiency at which data is transmitted over the WiFi network can be considerably hampered. Solutions To combat these issues, consider the following solutions: Use a CAM Board with an External Antenna: Several ESP32 CAM boards come equipped with or support the use of an external antenna. By using such a board and connecting an external antenna, you can boost the WiFi signal strength and range, mitigating some of the interference caused by the camera operations. Integrate the W5500 Ethernet Chip: If your application demands consistent and robust data transmission, consider incorporating the W5500 Ethernet chip. By utilizing Ethernet over WiFi, you are effectively sidestepping the interference issues associated with WiFi on the ESP32 CAM board. Xedge32 is equipped with integrated Ethernet drivers. When paired with hardware that supports it, like the W5500 chip, it can facilitate smooth and interference-free data transfer, ensuring that your application remains stable and efficient. In conclusion, while the ESP32 CAM board is an excellent tool for a myriad of applications, it's crucial to be aware of its limitations and know how to circumvent them to ensure optimal performance. References Lua MQTT API Lua timer API Lua CAM API

By Wilfred Nilsen
The State of Data Streaming for Energy and Utilities in 2023

This blog post explores the state of data streaming for the energy and utilities industry in 2023. The evolution of utility infrastructure, energy distribution, customer services, and new business models requires real-time end-to-end visibility, reliable and intuitive B2B and B2C communication, and integration with pioneering technologies like 5G for low latency or augmented reality for innovation. Data streaming allows integrating and correlating data in real time at any scale to improve most workloads in the energy sector. I look at trends in the utilities sector to explore how data streaming helps as a business enabler, including customer stories from SunPower, 50hertz, Powerledger, and more. A complete slide deck and on-demand video recording are included. General Trends in the Energy and Utilities Industry The energy and utilities industry is fundamental for a sustainable future. Gartner explores the Top 10 Trends Shaping the Utility Sector in 2023: "In 2023, power and water utilities will continue to face a variety of forces that will challenge their business and operating models and shape their technology investments. Utility technology leaders must confidently compose the future for their organizations in the midst of uncertainty during this volatile energy transition period — the future that requires your organizations to be both agile and resilient." From System-Centric and Large to Smaller-Scale and Distributed The increased use of digital tools makes the expected structural changes in the energy system possible: Energy AI Use Cases Artificial Intelligence (AI) with technologies like Machine Learning (ML) and Generative AI (GenAI) is a hot topic across all industries. Innovation around AI disrupts many business models, tasks, business processes, and labor. NVIDIA created an excellent diagram showing the various opportunities for AI in the energy and utilities sector. It separates the scenarios by segment: upstream, midstream, downstream, power generation, and power distribution. Cybersecurity: The Threat Is Real! McKinsey & Company explains that "the cyber threats facing electric-power and gas companies include the typical threats that plague other industries: data theft, billing fraud, and ransomware. However, several characteristics of the energy sector heighten the risk and impact of cyber threats against utilities:" Data Streaming in the Energy and Utilities Industry Adopting trends like predictive maintenance, track and trace, proactive sales and marketing, or threat intelligence is only possible if enterprises in the energy sector can provide and correlate information at the right time in the proper context. Real-time, which means using the information in milliseconds, seconds, or minutes, is almost always better than processing data later (whatever later means): Data streaming combines the power of real-time messaging at any scale with storage for true decoupling, data integration, and data correlation capabilities. 5 Ways Utilities Accomplish More With Real-Time Data "After creating a collaborative team that merged customer experience and digital capabilities, one North American utility went after a 30 percent reduction in its cost-to-serve customers in some of its core journeys." As the Utilities Analytics Institute explains: "Utilities need to ensure that the data they are collecting is high quality, specific to their needs, preemptive in nature, and, most importantly, real-time."
The following five characteristics are crucial to add value to real-time data: High-Quality Data Data Specific to Your Needs Make Your Data Proactive Data Redundancy Data is Constantly Changing Real-Time Data for Smart Meters and Common Praxis Smart meters are a perfect example of increasing business value with real-time data streaming. As Clou Global confirms: "The use of real-time data in smart grids and smart meters is a key enabler of the smart grid." Possible use cases include: Load Forecasting Fault Detection Demand Response Distribution Automation Smart Pricing Processing and correlating events from smart meters with stream processing is just one IoT use case. Cloud Adoption in Utilities and Energy Sector Accenture points out that 84% use Cloud SaaS solutions and 79% use Cloud PaaS Solutions in the energy and utilities market for various reasons: New approach to IT Incremental adoption Improved scalability, efficiency, agility, and security Unlock most business value. This is a general statistic, but this applies to all components in the data-driven enterprise, including data streaming. A company does not just move a specific application to the cloud; this would be counter-intuitive from a cost and security perspective. Hence, most companies start with a hybrid architecture and bring more and more workloads to the public cloud. Architecture Trends for Data Streaming The energy and utilities industry applies various trends for enterprise architectures for cost, flexibility, security, and latency reasons. The three major topics I see these days at customers are: Global data streaming Edge computing and hybrid cloud integration OT/IT modernization Let's look deeper into some enterprise architectures that leverage data streaming for energy and utilities use cases. Global Data Streaming Across Data Centers, Clouds, and the Edge Energy and utilities require data infrastructure everywhere. While most organizations have a cloud-first strategy, there is no way around running some workloads at the edge outside a data center for cost, security, or latency reasons. Data streaming is available everywhere: Data synchronization across environments, regions, and clouds is possible with open-source Kafka tools like MirrorMaker. However, this requires additional infrastructure and development/operations efforts. Innovative solutions like Confluent's Cluster Linking leverage the Kafka protocol for real-time replication. This enables much easier deployments and significantly reduced network traffic. Edge Computing and Hybrid Cloud Integration Kafka deployments look different depending on where it needs to be deployed. Fully managed serverless offerings like Confluent Cloud are highly recommended in the public cloud to focus on business logic with reduced time-to-market and TCO. In a private cloud, data center, or edge environment, most companies deploy on Kubernetes today to provide a similar cloud-native experience. Kafka can also be deployed on industrial PCs (IPC) and other industrial hardware. Many use cases exist for data streaming at the edge. Sometimes, a single broker (without high availability) is good enough. No matter how you deploy data streaming workloads, a key value is the unidirectional or bidirectional synchronization between clusters. Often, only curated and relevant data is sent to the cloud for cost reasons. Also, command and control patterns can start a business process in the cloud and send events to the edge. 
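As a concrete illustration of the smart-meter scenario above, the sketch below consumes meter readings from a Kafka topic and flags unusually high loads. The broker address, topic name, payload fields, and threshold are all illustrative assumptions; in production this kind of logic would more likely run in a stream processor such as Kafka Streams or Flink.
Python
# Minimal sketch: consume smart-meter readings and flag unusual loads.
# Topic name, payload layout, and threshold are assumptions for illustration.
import json
from confluent_kafka import Consumer

conf = {
    "bootstrap.servers": "localhost:9092",  # assumed local broker
    "group.id": "meter-demo",
    "auto.offset.reset": "earliest",
}
consumer = Consumer(conf)
consumer.subscribe(["smart-meter-readings"])  # hypothetical topic

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        reading = json.loads(msg.value())
        # Assumed payload: {"meter_id": "m-42", "kw": 3.2, "ts": "2023-11-01T10:00:00Z"}
        if reading["kw"] > 10.0:  # illustrative demand threshold
            print("unusual load detected:", reading)
finally:
    consumer.close()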
OT/IT Modernization With Data Streaming The energy sector operates many monolithic, inflexible, and closed software and hardware products. This is changing in this decade. OT/IT modernization and digital transformation require open APIs, flexible scale, and decoupled applications (from different vendors). Many companies leverage Apache Kafka to build a postmodern data historian to complement or replace existing expensive OT middleware: Just to be clear: Kafka and any other IT software like Spark, Flink, Amazon Kinesis, and so on are NOT hard real-time. They cannot be used for safety-critical use cases with deterministic systems like autonomous driving or robotics; those require C, Rust, or other embedded software. However, data streaming connects the OT and IT worlds. As part of that, connectivity with robotic systems, intelligent vehicles, and other IoT devices is the norm for improving logistics, integration with ERP and MES, aftersales, etc. New Customer Stories for Data Streaming in the Energy and Utilities Sector So much innovation is happening in the energy and utilities sector. Automation and digitalization change how utilities monitor infrastructure, build customer relationships, and create completely new business models. Most energy service providers use a cloud-first approach to improve time-to-market, increase flexibility, and focus on business logic instead of operating IT infrastructure. Elastic scalability gets even more critical with all the growing networks, 5G workloads, autonomous vehicles, drones, and other innovations. Here are a few customer stories from worldwide energy and utilities organizations: 50hertz: A grid operator's modernization of its legacy, monolithic, and proprietary SCADA infrastructure to cloud-native microservices and a real-time data fabric powered by data streaming. SunPower: Solar solutions across the globe where 6+ million devices in the field send data to the streaming platform. However, sensor data alone is not valuable! Fundamentals for delivering customer value include measurement ingestion, metadata association, storage, and analytics. aedifion: Efficient management of real estate to operate buildings better and meet environmental, social, and corporate governance (ESG) goals. Secure connectivity and reliable data collection are implemented with Confluent Cloud, which replaced the existing MQTT-based pipeline. Ampeers Energy: Decarbonization for real estate. The service provides district management with IoT-based forecasts and optimization and local energy usage accounting. The real-time analytics of time-series data is implemented with OPC-UA, Confluent Cloud, and TimescaleDB. Powerledger: Green energy trading with blockchain-based tracking, tracing, and trading of renewable energy from rooftop solar installations and virtual power plants. Non-fungible tokens (NFTs) represent renewable energy certificates (RECs) in a decentralized market rather than the conventional unidirectional one. Confluent Cloud ingests data from smart electricity meters. Resources To Learn More This blog post is just the starting point. Learn more about data streaming in the energy and utilities industry in the following on-demand webinar recording, the related slide deck, and further resources, including pretty cool lightboard videos about use cases. On-Demand Video Recording The video recording explores the energy and utilities industry's trends and architectures for data streaming. The primary focus is the data streaming case studies.
Slides If you prefer learning from slides, check out the deck used for the above recording: Slide Deck: The State of Data Streaming for Energy & Utilities in 2023 Case Studies and Lightboard Videos for Data Streaming in the Energy and Utilities Industry The state of data streaming for energy and utilities in 2023 is fascinating. New use cases and case studies come up every month. This includes better data governance across the entire organization, real-time data collection and processing data across hybrid edge and cloud infrastructures, data sharing and B2B partnerships for new business models, and many more scenarios. We recorded lightboard videos showing the value of data streaming simply and effectively. These five-minute videos explore the business value of data streaming, related architectures, and customer stories. Stay tuned; I will update the links in the next few weeks and publish a separate blog post for each story and lightboard video. And this is just the beginning. Every month, we will talk about the status of data streaming in a different industry. Manufacturing was the first. Financial services second, then retail, telcos, gaming, and so on... Check out my other blog posts. Let’s connect on LinkedIn and discuss it! Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter.

By Kai Wähner DZone Core CORE
A Guide to Data-Driven Design and Architecture

This is an article from DZone's 2023 Data Pipelines Trend Report.For more: Read the Report Data-driven design is a game changer. It uses real data to shape designs, ensuring products match user needs and deliver user-friendly experiences. This approach fosters constant improvement through data feedback and informed decision-making for better results. In this article, we will explore the importance of data-driven design patterns and principles, and we will look at an example of how the data-driven approach works with artificial intelligence (AI) and machine learning (ML) model development. Importance of the Data-Driven Design Data-driven design is crucial as it uses real data to inform design decisions. This approach ensures that designs are tailored to user needs, resulting in more effective and user-friendly products. It also enables continuous improvement through data feedback and supports informed decision-making for better outcomes. Data-driven design includes the following: Data visualization – Aids designers in comprehending trends, patterns, and issues, thus leading to effective design solutions. User-centricity – Data-driven design begins with understanding users deeply. Gathering data about user behavior, preferences, and challenges enables designers to create solutions that precisely meet user needs. Iterative process – Design choices are continuously improved through data feedback. This iterative method ensures designs adapt and align with user expectations as time goes on. Measurable outcomes – Data-driven design targets measurable achievements, like enhanced user engagement, conversion rates, and satisfaction. This is a theory, but let's reinforce it with good examples of products based on data-driven design: Netflix uses data-driven design to predict what content their customers will enjoy. They analyze daily plays, subscriber ratings, and searches, ensuring their offerings match user preferences and trends. Uber uses data-driven design by collecting and analyzing vast amounts of data from rides, locations, and user behavior. This helps them optimize routes, estimate fares, and enhance user experiences. Uber continually improves its services by leveraging data insights based on real-world usage patterns. Waze uses data-driven design by analyzing real-time GPS data from drivers to provide accurate traffic updates and optimal route recommendations. This data-driven approach ensures users have the most up-to-date and efficient navigation experience based on the current road conditions and user behavior. Common Data-Driven Architectural Principles and Patterns Before we jump into data-driven architectural patterns, let's reveal what data-driven architecture and its fundamental principles are. Data-Driven Architectural Principles Data-driven architecture involves designing and organizing systems, applications, and infrastructure with a central focus on data as a core element. Within this architectural framework, decisions concerning system design, scalability, processes, and interactions are guided by insights and requirements derived from data. Fundamental principles of data-driven architecture include: Data-centric design – Data is at the core of design decisions, influencing how components interact, how data is processed, and how insights are extracted. Real-time processing – Data-driven architectures often involve real-time or near real-time data processing to enable quick insights and actions. 
Integration of AI and ML – The architecture may incorporate AI and ML components to extract deeper insights from data. Event-driven approach – Event-driven architecture, where components communicate through events, is often used to manage data flows and interactions. Data-Driven Architectural Patterns Now that we know the key principles, let's look into data-driven architecture patterns. Distributed data architecture patterns include the data lakehouse, data mesh, data fabric, and data cloud. Data Lakehouse Data lakehouse allows organizations to store, manage, and analyze large volumes of structured and unstructured data in one unified platform. Data lakehouse architecture provides the scalability and flexibility of data lakes, the data processing capabilities, and the query performance of data warehouses. This concept is perfectly implemented in Delta Lake. Delta Lake is an extension of Apache Spark that adds reliability and performance optimizations to data lakes. Data Mesh The data mesh pattern treats data like a product and sets up a system where different teams can easily manage their data areas. The data mesh concept is similar to how microservices work in development. Each part operates on its own, but they all collaborate to make the whole product or service of the organization. Companies usually use conceptual data modeling to define their domains while working toward this goal. Data Fabric Data fabric is an approach that creates a unified, interconnected system for managing and sharing data across an organization. It integrates data from various sources, making it easily accessible and usable while ensuring consistency and security. A good example of a solution that implements data fabric is Apache NiFi. It is an easy-to-use data integration and data flow tool that enables the automation of data movement between different systems. Data Cloud Data cloud provides a single and adaptable way to access and use data from different sources, boosting teamwork and informed choices. These solutions offer tools for combining, processing, and analyzing data, empowering businesses to leverage their data's potential, no matter where it's stored. Presto exemplifies an open-source solution for building a data cloud ecosystem. Serving as a distributed SQL query engine, it empowers users to retrieve information from diverse data sources such as cloud storage systems, relational databases, and beyond. Now we know what data-driven design is, including its concepts and patterns. Let's have a look at the pros and cons of this approach. Pros and Cons of Data-Driven Design It's important to know the strong and weak areas of the particular approach, as it allows us to choose the most appropriate approach for our architecture and product. Here, I gathered some pros and cons of data-driven architecture: PROS AND CONS OF DATA-DRIVEN DESIGN Pros Cons Personalized experiences: Data-driven architecture supports personalized user experiences by tailoring services and content based on individual preferences. Privacy concerns: Handling large amounts of data raises privacy and security concerns, requiring robust measures to protect sensitive information. Better customer understanding: Data-driven architecture provides deeper insights into customer needs and behaviors, allowing businesses to enhance customer engagement. Complex implementation: Implementing data-driven architecture can be complex and resource-intensive, demanding specialized skills and technologies. 
Informed decision-making: Data-driven architecture enables informed and data-backed decision-making, leading to more accurate and effective choices. Dependency on data availability: The effectiveness of data-driven decisions relies on the availability and accuracy of data, leading to potential challenges during data downtimes. Table 1 Data-Driven Approach in ML Model Development and AI A data-driven approach in ML model development involves placing a strong emphasis on the quality, quantity, and diversity of the data used to train, validate, and fine-tune ML models. A data-driven approach involves understanding the problem domain, identifying potential data sources, and gathering sufficient data to cover different scenarios. Data-driven decisions help determine the optimal hyperparameters for a model, leading to improved performance and generalization. Let's look at the example of the data-driven architecture based on AI/ML model development. The architecture represents the factory alerting system. The factory has cameras that shoot short video clips and photos and send them for analysis to our system. Our system has to react quickly if there is an incident. Below, we share an example of data-driven architecture using Azure Machine Learning, Data Lake, and Data Factory. This is only an example, and there are a multitude of tools out there that can leverage data-driven design patterns. The IoT Edge custom module captures real-time video streams, divides them into frames, and forwards results and metadata to Azure IoT Hub. The Azure Logic App watches IoT Hub for incident messages, sending SMS and email alerts, relaying video fragments, and inferencing results to Azure Data Factory. It orchestrates the process by fetching raw video files from Azure Logic App, splitting them into frames, converting inferencing results to labels, and uploading data to Azure Blob Storage (the ML data repository). Azure Machine Learning begins model training, validating data from the ML data store, and copying required datasets to premium blob storage. Using the dataset cached in premium storage, Azure Machine Learning trains, validates model performance, scores against the new model, and registers it in the Azure Machine Learning registry. Once the new ML inferencing module is ready, Azure Pipelines deploys the module container from Container Registry to the IoT Edge module within IoT Hub, updating the IoT Edge device with the updated ML inferencing module. Figure 1: Smart alerting system with data-driven architecture Conclusion In this article, we dove into data-driven design concepts and explored how they merge with AI and ML model development. Data-driven design uses insights to shape designs for better user experiences, employing iterative processes, data visualization, and measurable outcomes. We've seen real-world examples like Netflix using data to predict content preferences and Uber optimizing routes via user data. Data-driven architecture, encompassing patterns like data lakehouse and data mesh, orchestrates data-driven solutions. Lastly, our factory alerting system example showcases how AI, ML, and data orchestrate an efficient incident response. A data-driven approach empowers innovation, intelligent decisions, and seamless user experiences in the tech landscape. This is an article from DZone's 2023 Data Pipelines Trend Report.For more: Read the Report

By Boris Zaikin DZone Core CORE
The Evolution of Data Pipelines

This is an article from DZone's 2023 Data Pipelines Trend Report.For more: Read the Report Originally, the term "data pipeline" was focused primarily on the movement of data from one point to another, like a technical mechanism to ensure data flows from transactional databases to destinations such as data warehouses or to aggregate this data for analysis. Fast forward to the present day, data pipelines are no longer seen as IT operations but as a core component of a business's transformation model. New cloud-based data orchestrators are an example of evolution that allows integrating data pipelines seamlessly with business processes, and they have made it easier for businesses to set up, monitor, and scale their data operations. At the same time, data repositories are evolving to support both operational and analytical workloads on the same engine. Figure 1: Retail business replenishment process Consider a replenishment process for a retail company, a mission-critical business process, in Figure 1. The figure is a clear example of where the new data pipeline approach is having a transformational impact on the business. Companies are evolving from corporate management applications to new data pipelines that include artificial intelligence (AI) capabilities to create greater business impact. Figure 2 demonstrates an example of this. Figure 2: Retail replenishment data pipeline We no longer see a data process based on data movement, but rather we see a business process that includes machine learning (ML) models or integration with distribution systems. It will be exciting to see how data pipelines will evolve with the emergence of the new generative AI. Data Pipeline Patterns All data pipeline patterns are composed of the following stages, although each one of them has a workflow and use cases that make them different: Extract – To retrieve data from the source system without modifying it. Data can be extracted from several sources such as databases, files, APIs, streams, or more. Transform – To convert the extracted data into final structures that are designed for analysis or reporting. The data transformed is stored in an intermediate staging area. Load – To load the transformed data into the final target database. Currently and after the evolution of data pipelines, these activities are known as data ingestion and determine their pattern as we will see below. Here are some additional activities and components that are now part and parcel of modern data pipelines: Data cleaning is a crucial step in the data pipeline process that involves identifying and correcting inconsistencies and inaccuracies in datasets, such as removing duplicate records or handling missing values. Data validation ensures the data being collected or processed is accurate, reliable, and meets the specified criteria or business rules. This includes whether the data is of the correct type, falls within a specified range, or that all required data is present and not missing Data enrichment improves the quality, depth, and value of the dataset by adding relevant supplementary information that was not originally present with additional information from external sources. Machine learning can help enhance various stages of the pipeline from data collection to data cleaning, transformation, and analysis, thus making it more efficient and effective. 
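As a minimal illustration of the cleaning, validation, and enrichment activities listed above, the following sketch uses pandas; the column names, rules, and lookup table are assumptions made for the example.
Python
# Sketch of the cleaning, validation, and enrichment activities described above.
# Column names, rules, and the lookup table are illustrative assumptions.
import pandas as pd

raw = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "amount":   [10.0, 10.0, None, -5.0],
    "country":  ["US", "US", "DE", "FR"],
})

# Data cleaning: remove duplicate records and handle missing values.
clean = raw.drop_duplicates().copy()
clean["amount"] = clean["amount"].fillna(0.0)

# Data validation: enforce a simple business rule (amounts must be non-negative).
invalid = clean[clean["amount"] < 0]
clean = clean[clean["amount"] >= 0]

# Data enrichment: add supplementary information from an external lookup.
region_lookup = {"US": "AMER", "DE": "EMEA", "FR": "EMEA"}
clean["region"] = clean["country"].map(region_lookup)

print(clean)
print("rejected rows:")
print(invalid)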
Extract, Transform, Load Extract, transform, load (ETL) is a fundamental process pattern in data warehousing that involves moving data from the source systems to a centralized repository, usually a data warehouse. In ETL, all the load related to the transformation and storage of the raw data is executed in a layer previous to the target system. Figure 3: ETL data pipeline The workflow is as follows: Data is extracted from the source system. Data is transformed into the desired format in an intermediate staging area. Transformed data is loaded into the data warehouse. When to use this pattern: Target system performance – If the target database, usually a data warehouse, has limited resources and poor scalability, we want to minimize the impact on performance. Target system capacity – When the target systems have limited storage capacity or the GB price is very high, we are interested in transforming and storing the raw data in a cheaper layer. Pre-defined structure – When the structure of the target system is already defined. ETL Use Case This example is the classic ETL for an on-premise system where, because of the data warehouse computational and storage capacity, we are neither interested in storing the raw data nor in executing the transformation in the data warehouse itself. It is more economical and efficient when it involves an on-premises solution that is not highly scalable or at a very high cost. Figure 4: ETL sales insights data pipeline Extract, Load, Transform Modern cloud-based data warehouses and data lakes, such as Snowflake or BigQuery, are highly scalable and are optimized for in-house processing that allows for handling large-scale transformations more efficiently in terms of performance and cheaper in terms of cost. In extract, load, transform (ELT), the data retrieved in the source systems is loaded directly without transformation into a raw layer of the target system. Finally, the following transformations are performed. This pattern is probably the most widely used in modern data stack architectures. Figure 5: ELT data pipeline The workflow is as follows: Data is extracted from the source system. Data is loaded directly into the data warehouse. Transformation occurs within the data warehouse itself. When to use this pattern: Cloud-based modern warehouse – Modern data warehouses are optimized for in-house processing and can handle large-scale transformations efficiently. Data volume and velocity – When handling large amounts of data or near real-time processing. ELT Use Case New cloud-based data warehouse and data lake solutions are high-performant and highly scalable. In these cases, data repositories are better suited for work than external processes. The transformation process can take advantage of new features and run data transformation queries inside the data warehouse faster and at a lower cost. Figure 6: ELT sales insights data pipeline Reverse ETL Reverse ETL is a new data pattern that has grown significantly in recent years and has become fundamental for businesses. It is composed of the same stages as a traditional ETL, but functionally, it does just the opposite. It takes data from the data warehouse or data lake and loads it into the operational system. Nowadays, analytical solutions are generating information with a differential value for businesses and also in a very agile manner. Bringing this information back into operational systems allows it to be actionable across other parts of the business in a more efficient and probably higher impact way. 
Figure 7: Reverse ETL data pipeline The workflow is as follows: Data is extracted from the data warehouse. Data is transformed into the desired format in an intermediate staging area. Data is loaded directly into the operational system. When to use this pattern: Operational use of analytical data – to send back insights to operational systems to drive business processes Near real-time business decisions – to send back insights to systems that can trigger near real-time decisions Reverse ETL Use Case One of the most important things in e-commerce is to be able to predict what items your customers are interested in; this type of analysis requires different sources of information, both historical and real-time. The data warehouse contains historical and real-time data on customer behavior, transactions, website interactions, marketing campaigns, and customer support interactions. The reverse ETL process enables e-commerce to operationalize the insights gained from its data analysis and take targeted actions to enhance the shopping experience and increase sales. Figure 8: Reverse ETL customer insights data pipeline The Rise of Real-Time Data Processing As businesses become more data-driven, there's an increasing need to have actionable information as quickly as possible. This evolution has driven the transition from batch processing to real-time processing that allows processing the data immediately as it arrives. The advent of new technological tools and platforms that are capable of handling real-time data processing such as Apache Kafka, Apache Pulsar, or Apache Flink have made it possible to build real-time data pipelines. Real-time analytics became crucial for scenarios like fraud detection, IoT, edge computing, recommendation engines, and monitoring systems. Combined with AI, it allows businesses to make automatic, on-the-fly decisions. The Integration of AI and Advanced Analytics Advancements in AI, particularly in areas like generative AI and large language models (LLMs), will transform data pipeline landscape capabilities such as enrichment, data quality, data cleansing, anomaly detection, and transformation automation. Data pipelines will evolve exponentially in the coming years, becoming a fundamental part of the digital transformation, and companies that know how to take advantage of these capabilities will undoubtedly be in a much better position. Some of the activities where generative AI will be fundamental and will change the value and way of working with data pipelines include: Data cleaning and transformation Anomaly detection Enhanced data privacy Real-time processing AI and Advanced Analytics Use Case E-commerce platforms receive many support requests from customers every day, including a wide variety of questions and responses written in different conversational styles. Increasing the efficiency of the chatbot is so important to improve the customer experience, and many companies decide to implement a chatbot using a GPT model. In this case, we need to provide all information from questions, answers, and technical product documentation that is available in different formats. The new generative AI and LLM models not only allow us to provide innovative solutions for interacting with humans, but also to increase the capabilities of our data pipelines, such as data cleaning or transcriptions. Training the GPT model requires clean and preprocessed text data. 
This involves removing any personally identifiable information, correcting spelling, correcting grammar mistakes, or removing any irrelevant information. A GPT model can be trained to perform these tasks automatically. Figure 9: ETL data pipeline with AI for chatbot content ingestion Conclusion Data pipelines have evolved a lot in the last few years, initially with the advent of streaming platforms and later with the explosion of the cloud and new data solutions. This evolution means that every day they have a greater impact on business value, moving from a data movement solution to a key element in the business transformation. The explosive growth of generative AI solutions in the last year has opened up an exciting path, as they have a significant impact on all stages of the data pipelines; therefore, the near future is undoubtedly linked to AI. Such a disruptive evolution requires the adaptation of organizations, teams, and engineers to enable them to use the full potential of technology. The data engineer role must evolve to acquire more business and machine learning skills. This is a new digital transformation, and perhaps the most exciting and complex movement in recent decades. This is an article from DZone's 2023 Data Pipelines Trend Report.For more: Read the Report

By Miguel Garcia DZone Core CORE
Kafka Event Streaming AI and Automation

Apache Kafka has emerged as a clear leader in corporate architecture for moving from data at rest (DB transactions) to event streaming. There are many presentations that explain how Kafka works and how to scale this technology stack (either on-premise or cloud). Building a microservice using ChatGPT to consume messages and enrich, transform, and persist is the next phase of this project. In this example, we will be consuming input from an IoT device (RaspberryPi) which sends a JSON temperature reading every few seconds. Consume a Message As each Kafka event message is produced (and logged), a Kafka microservice consumer is ready to handle each message. I asked ChatGPT to generate some Python code, and it gave me the basics to poll and read from the named "topic." What I got was a pretty good start to consume a topic, key, and JSON payload. ChatGPT created code to persist this to a database using SQLAlchemy. I then wanted to transform the JSON payload and use API Logic Server (ALS - an open source project on GitHub) rules to unwrap the JSON, validate, calculate, and produce a new set of message payloads whenever the source temperature falls outside a given range. Shell
ChatGPT: "design a Python Event Streaming Kafka Consumer interface"
Note: ChatGPT selected Confluent Kafka libraries (and their Docker Kafka container); you can modify your code to use other Python Kafka libraries. SQLAlchemy Model Using API Logic Server (ALS: a Python open-source platform), we connect to a MySQL database. ALS will read the tables and create an SQLAlchemy ORM model, a react-admin user interface, a safrs-JSON Open API (Swagger), and a running REST web service for each ORM endpoint. The new Temperature table will hold the timestamp, the IoT device ID, and the temperature reading. Here we use the ALS command line utility to create the ORM model: Shell
ApiLogicServer create --project_name=iot --db_url=mysql+pymysql://root:password@127.0.0.1:3308/iot
The API Logic Server generated class used to hold our Temperature values: Python
class Temperature(SAFRSBase, Base):
    __tablename__ = 'Temperature'
    _s_collection_name = 'Temperature'  # type: ignore
    __bind_key__ = 'None'

    Id = Column(Integer, primary_key=True)
    DeviceId = Column(Integer, nullable=False)
    TempReading = Column(Integer, nullable=False)
    CreateDT = Column(TIMESTAMP, server_default=text("CURRENT_TIMESTAMP"), nullable=False)
    KafkaMessageSent = Column(Boolean, default=text("False"))
Changes So instead of saving the Kafka JSON consumer message again in a SQL database (and firing rules to do the work), we unwrap the JSON payload (util.row_to_entity) and insert it into the Temperature table instead of saving the JSON payload. We let the declarative rules handle each temperature reading. Python
entity = models.Temperature()
util.row_to_entity(message_data, entity)
session.add(entity)
When the consumer receives the message, it will add it to the session, which will trigger the commit_event rule (below). Declarative Logic: Produce a Message Using API Logic Server (an automation framework built using SQLAlchemy, Flask, and the LogicBank spreadsheet-like rules engine: formula, sum, count, copy, constraint, event, etc.), we add a declarative commit_event rule on the ORM entity Temperature. As each message is persisted to the Temperature table, the commit_event rule is called. If the temperature reading exceeds MAX_TEMP or falls below MIN_TEMP, we will send a Kafka message on the topic "TempRangeAlert".
We also add a constraint to make sure we receive data within a normal range (32-132). We will let another event consumer handle the alert message. Python
from confluent_kafka import Producer

conf = {'bootstrap.servers': 'localhost:9092'}
producer = Producer(conf)

MAX_TEMP = arg.MAX_TEMP or 102
MIN_TEMP = arg.MIN_TEMP or 78

def produce_message(row: models.Temperature,
                    old_row: models.Temperature,
                    logic_row: LogicRow):
    if logic_row.isInserted() and row.TempReading > MAX_TEMP:
        producer.produce(topic="TempRangeAlert", key=str(row.Id),
            value=f"The temperature {row.TempReading}F exceeds {MAX_TEMP}F on Device {row.DeviceId}")
        row.KafkaMessageSent = True
    if logic_row.isInserted() and row.TempReading < MIN_TEMP:
        producer.produce(topic="TempRangeAlert", key=str(row.Id),
            value=f"The temperature {row.TempReading}F is less than {MIN_TEMP}F on Device {row.DeviceId}")
        row.KafkaMessageSent = True

Rules.constraint(models.Temperature,
    as_expression=lambda row: row.TempReading < 32 or row.TempReading > 132,
    error_message="Temperature {row.TempReading} is out of range")
Rules.commit_event(models.Temperature, calling=produce_message)
Only produce an alert message if the temperature reading is greater than MAX_TEMP or less than MIN_TEMP. The constraint will check the temperature range before calling the commit event (note that rules are always unordered and can be introduced as specifications change). TDD Behave Testing Using TDD (Test-Driven Development), we can write a Behave test to insert records directly into the Temperature table and then check the return value KafkaMessageSent. Behave begins with a Feature/Scenario (.feature file). For each scenario, we write a corresponding Python class using Behave decorators. Feature Definition Plain Text
Feature: TDD Temperature Example

  Scenario: Temperature Processing
    Given A Kafka Message Normal (Temperature)
    When Transactions normal temperature is submitted
    Then Check KafkaMessageSent Flag is False

  Scenario: Temperature Processing
    Given A Kafka Message Abnormal (Temperature)
    When Transactions abnormal temperature is submitted
    Then Check KafkaMessageSent Flag is True
TDD Python Class Python
from behave import *
import safrs

db = safrs.DB
session = db.session

def insertTemperature(temp: int) -> bool:
    entity = models.Temperature()
    entity.TempReading = temp
    entity.DeviceId = 'local_behave_test'
    session.add(entity)
    return entity.KafkaMessageSent

@given('A Kafka Message Normal (Temperature)')
def step_impl(context):
    context.temp = 76
    assert True

@when('Transactions normal temperature is submitted')
def step_impl(context):
    context.response_text = insertTemperature(context.temp)

@then('Check KafkaMessageSent Flag is False')
def step_impl(context):
    assert context.response_text == False
Summary Using ChatGPT to generate the Kafka message code for both the consumer and producer seems like a good starting point; install Confluent's Kafka Docker image to run it. Combining that with API Logic Server's declarative logic rules lets us add formulas, constraints, and events to the normal flow of transactions into our SQL database and produce (and transform) new Kafka messages. ChatGPT and declarative logic are the next level of "paired programming."
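For reference, a consumer along the lines described in the "Consume a Message" section might look like the following sketch. It assumes the confluent-kafka package, a topic named "temperature", and a JSON payload such as {"DeviceId": 1, "TempReading": 75}; the handle function is a stand-in for the API Logic Server insert shown earlier.
Python
# Sketch of a Kafka consumer for the temperature topic (names are assumptions).
import json
from confluent_kafka import Consumer

conf = {
    "bootstrap.servers": "localhost:9092",
    "group.id": "temperature-reader",
    "auto.offset.reset": "earliest",
}
consumer = Consumer(conf)
consumer.subscribe(["temperature"])  # hypothetical topic name

def handle(payload: dict) -> None:
    # Stand-in for unwrapping the JSON into the Temperature entity and
    # letting the declarative rules fire on session.add(entity).
    print("device", payload["DeviceId"], "reads", payload["TempReading"], "F")

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        handle(json.loads(msg.value()))
finally:
    consumer.close()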

By Tyler Band

Top IoT Experts


Frank Delporte

Java Developer - Technical Writer,
CodeWriter.be

Frank Delporte is a technical writer at Azul, blogger on webtechie.be and foojay.io, author of "Getting started with Java on Raspberry Pi" (https://webtechie.be/books/), and contributor to Pi4J. Frank blogs about his experiments with Java, sometimes combined with electronic components, on the Raspberry Pi.

Tim Spann

Principal Developer Advocate,
Cloudera

https://github.com/tspannhw/SpeakerProfile/blob/main/README.md Tim Spann is a Principal Developer Advocate in Data In Motion for Cloudera. He works with Apache NiFi, Apache Pulsar, Apache Kafka, Apache Flink, Flink SQL, Apache Pinot, Trino, Apache Iceberg, DeltaLake, Apache Spark, Big Data, IoT, Cloud, AI/DL, machine learning, and deep learning. Tim has over ten years of experience with IoT, big data, distributed computing, messaging, streaming technologies, and Java programming. Previously, he was a Developer Advocate at StreamNative, Principal DataFlow Field Engineer at Cloudera, a Senior Solutions Engineer at Hortonworks, a Senior Solutions Architect at AirisData, a Senior Field Engineer at Pivotal, and a Team Leader at HPE. He blogs for DZone, where he is the Big Data Zone leader, and runs a popular meetup in Princeton & NYC on Big Data, Cloud, IoT, deep learning, streaming, NiFi, the blockchain, and Spark. Tim is a frequent speaker at conferences such as ApacheCon, DeveloperWeek, Pulsar Summit, and many more. He holds a BS and MS in computer science.

Carsten Rhod Gregersen

Founder, CEO,
Nabto

Carsten Rhod Gregersen is the CEO and Founder of Nabto, a P2P IoT connectivity provider that enables remote control of devices with secure end-to-end encryption.

Emily Newton

Editor-in-Chief,
Revolutionized

Emily Newton is a journalist who regularly covers stories for the tech and industrial sectors. She loves seeing the impact technology can have on every industry.

The Latest IoT Topics

Ultimate Guide to Smart Agriculture Systems Using IoT
Discover how the Internet of Things can revolutionize the agricultural industry.
November 28, 2023
by Katherine Smith
· 575 Views · 1 Like
Building A Simple AI Application in 2023 for Fun and Profit
Implementing your own AI-powered app project is appealing, given the amount of interest this segment of the software market has generated recently.
November 28, 2023
by Stylianos Kampakis
· 645 Views · 1 Like
Navigating the API Seas: A Product Manager's Guide to Authentication
Learn about API keys for single-entity access, OAuth for third-party integration, and JWT for stateless authentication.
November 28, 2023
by Lakshmi Sushma Daggubati
· 716 Views · 1 Like
IIoT and AI: The Synergistic Symphony Transforming Industrial Landscapes
IIoT and AI merge in a transformative synergy, optimizing industries through real-time data, predictive capabilities, and unparalleled efficiency.
November 28, 2023
by Animesh Patel
· 718 Views · 1 Like
Are You Facing an Error When You Run the StartDagServerMaintenance.ps1 Script?
Learn how to address errors with the StartDagServerMaintenance.ps1 script in Exchange Server, and discover reliable recovery solutions for database issues.
November 28, 2023
by Shelly Bhardwaj
· 667 Views · 1 Like
Real-Time Anomaly Detection
Emphasizing the importance of designing telemetry analytics solutions with the right tools for device management.
November 28, 2023
by Nagendran Sathananda Manidas
· 813 Views · 2 Likes
Mastering Cloud Migration: Best Practices to Make it a Success
No two cloud migration processes are identical as each system has unique requirements. To get started, check out this article for tried and tested practices.
November 27, 2023
by Alexander Shumski
· 1,591 Views · 1 Like
Securing the Cloud: Navigating the Frontier of Cloud Security
This article explores cloud security, including key considerations, best practices, and the evolving landscape of safeguarding data in the cloud.
November 27, 2023
by Animesh Patel
· 1,477 Views · 1 Like
Effective Testing Strategies for Salesforce Custom Applications
Improve your Salesforce custom application testing with these effective strategies. Ensure your app is performing at its best.
November 27, 2023
by Nishant Sharma
· 1,258 Views · 1 Like
ChatGPT Applications: Unleashing the Potential Across Industries
Applications of ChatGPT are changing all areas of our lives, at work and at home. But how can a business use it for its development?
November 27, 2023
by Kateryna Stetsiuk
· 1,424 Views · 1 Like
Program To Force Copy SAP Bex Queries
This article demonstrates how to force-copy BEx Query Designer queries from one InfoProvider to another with a different structure.
November 27, 2023
by Maheshsingh Mony
· 1,109 Views · 1 Like
Measuring Customer Support Sentiment Analysis With GPT, Airbyte, and MindsDB
Set up sentiment analysis of Intercom chats, extract and analyze the data with GPT models, and visualize the results using Metabase.
November 27, 2023
by John Lafleur
· 1,253 Views · 1 Like
Speed Trino Queries With These Performance-Tuning Tips
In this article, we discuss how data engineers and data infrastructure engineers can make Trino, a widely used query engine, faster and more efficient.
November 27, 2023
by Hope Wang
· 1,206 Views · 2 Likes
Accelerate Innovation by Shifting Left FinOps, Part 2
FinOps is an evolving practice to deliver maximum value from investments in the cloud. In part 2 of this series, learn how to create and refine a cost model.
November 24, 2023
by Sreekanth Iyer
· 2,595 Views · 1 Like
The Advantage of Using Cache to Decouple the Frontend Code
Decoupling frontend code while keeping performance in mind may not seem simple, but using a cache you can achieve it.
November 24, 2023
by Sergio Carracedo
· 2,730 Views · 3 Likes
Effective Methods of Tackling Modern Cybersecurity Threats
Cybersecurity threats are increasing with technological advancements. This article covers how to handle common threats.
November 24, 2023
by Mouli Srini
· 2,475 Views · 1 Like
Smart IoT Integrations With Akenza and Python
In this blog article, I explore the combination of Akenza and Python to open new avenues for intelligent, real-time IoT integrations and monitoring.
November 23, 2023
by Dariusz Suchojad
· 2,914 Views · 1 Like
Empowering Connectivity: The Renaissance of Edge Computing in IoT
Edge Computing and IoT unite for real-time efficiency, bandwidth optimization, and innovation. Challenges persist, but AI and ML integration beckon.
November 23, 2023
by Animesh Patel
· 2,017 Views · 1 Like
The Challenge of Building a Low-Power Network Reachable IoT Device
Balancing device constraints and infrastructure limitations poses challenges, but there's hope and opportunity for innovative solutions.
November 23, 2023
by Sahas Chitlange
· 1,770 Views · 1 Like
Unlocking the Power of Streaming: Effortlessly Upload Gigabytes to AWS S3 With Node.js
This article guides you through building a Node.js application for efficient data uploads to Amazon S3, including setup, integration, and database storage.
November 23, 2023
by Vishal Diyora
· 2,498 Views · 3 Likes
