IoT, or the Internet of Things, is a technological field that makes it possible for users to connect devices and systems and exchange data over the internet. Through DZone's IoT resources, you'll learn about smart devices, sensors, networks, edge computing, and many other technologies — including those that are now part of the average person's daily life.
Crafting a reliable IIoT design is all about building on foundational architecture layers and customizing them for specific network needs. How an IIoT network is organized influences performance, optimization, and security. You can get the most out of your IIoT devices by integrating them into a well-designed network architecture.

The Foundations of IIoT Design

IIoT technology has advanced significantly in recent years, gaining more capabilities and applications. Adopting it on a large scale requires you to organize your network strategically. Before crafting your IIoT design, you must know precisely what you want to use the IIoT for, as that purpose will shape your organization's ideal network architecture. Different use cases call for different architecture layers.

Layers of IIoT Architecture

The three most basic layers are device, network, and application. Your unique design can certainly have more layers, but starting with these three is essential.

The device layer includes the IIoT devices themselves, such as sensors, smart cameras, or wearables. Any physically connected smart device is part of this layer.

The network layer serves as a bridge between the device layer and the rest of the IIoT design. It includes the devices and protocols for network communication, such as Wi-Fi, 5G, Bluetooth, or Ethernet, and it is the part of your architecture that allows the device layer to communicate with the rest of the network.

The application layer is the final destination for data collected and transmitted through the other layers of the IIoT design. Infrastructure like physical data servers or cloud storage is part of this layer. This is where data is ultimately stored for use in other applications.

Breaking Down Architecture Layers

The three-layer IIoT architecture is mainly a starting point for today's more complex structures. Refining your IIoT design is a key part of building resilience, and you can define additional layers to make your architecture more organized and further clarify how everything connects.

The network layer can be broken down into transport and processing layers. In this configuration, data transport technology like LTE or 5G gets its own category, while the processing side of the larger network layer acts as a bridge to the application layer. The processing layer also includes technologies like cloud storage and AI, which might otherwise be part of the application layer. If you have a lot of networking technology in your IIoT design or want to use advanced processing tools like AI, breaking up the layer may be wise. However, it's also important to thoroughly test and evaluate how different technologies work together. This lets you define interactions between your data transport and data processing technologies in greater detail and ensures there aren't any bugs or errors when connecting new technologies.

Similarly, you can further refine your processing organization by adding an analytics or visualization layer between the network and application layers. For example, IIoT devices collect data in the device layer, transmit it through the network layer, and send it to the cloud for processing. This configuration is often used with edge or cloud computing for rapid data processing. However, the edge or a cloud server can also be isolated as its own individual layer, which elevates the important role those technologies might play in your IIoT design.
An analytics or visualization layer makes sense of the raw data — it's essentially a translator between the data from the device layer and the end users or applications. You may analyze the data for critical metrics or convert it into digestible graphs or charts. This layer formats data in a way a human can comprehend or an app can utilize.

The IIRA Framework

In addition to purely technical layers, you can connect your IIoT design to your organization's goals. The Industry IoT Consortium has developed a framework for defining business-oriented layers in IIoT architecture. This model uses more open "viewpoints" rather than rigid layers. The four viewpoints are implementation, functional, usage, and business.

The implementation viewpoint includes the purely technical outline of the devices in an IIoT architecture, while the functional viewpoint defines how the devices and components in the implementation viewpoint connect. You could fold a technically focused IIoT architecture design into these first two viewpoints.

The usage viewpoint acts much like the application layer in a standard three-layer architecture. It outlines how to use the interconnected technologies in the first two viewpoints for different tasks or activities. This is also where basic architecture capabilities are defined.

Finally, the business viewpoint connects the devices, connections, and capabilities of the IIoT design to the organization's business goals. It outlines exactly how the architecture contributes to stakeholders' goals and priorities or resolves their concerns. This viewpoint can be helpful if you adopt IIoT with a specific business end goal in mind. For example, your company might be using IIoT to reduce operating costs. Defining a business viewpoint in your IIoT design establishes concrete connections between how your IIoT network functions and how you will achieve those goals.

Prioritize Security in Every Layer

Experts estimate there were over 10 million IoT cyber attacks every month as of Q4 2022. While IIoT devices have many advantages for businesses, they are notorious for their poor security, so you must ensure your IIoT design includes proactive security measures. In 2020, the U.S. Congress passed the IoT Cybersecurity Improvement Act, establishing minimum IoT security standards; the regulations outlined in this act strengthen the minimum security protocols on newly manufactured IoT devices. Additionally, you can use network segmentation — splitting the network layer of your architecture into isolated silos with dedicated security protocols. Network segmentation limits a hacker's damage to the single silo they compromise rather than the whole network.

Building a Secure, Reliable IIoT Design

Crafting a reliable IIoT design is all about understanding your unique IIoT needs and applying them to your network organization. Start with the three basic layers and add more to refine the architecture for your needs. At every layer, prioritize cybersecurity measures to build network resiliency.
The Internet of Things (IoT) has been transforming the way we live and work, and it is expected to continue to do so in the years to come. The ability to connect various devices and objects to the internet has opened up new opportunities for businesses and individuals alike. One of the most important technologies powering IoT is the MQTT protocol. In this article, we will explore seven MQTT technology trends that are shaping the future of IoT.

Seven MQTT Technology Trends in 2023

1. MQTT 5.0
MQTT 5.0 is the latest version of the protocol and was released in 2018. It introduced many new features, making it more powerful and flexible than its predecessors. One of the most important additions is richer message metadata, such as user properties and content type indicators, which allows for more efficient and accurate data exchange. MQTT 5.0 also adds a request/response pattern, built on response topics and correlation data, that makes it easier for devices to receive replies to the messages they publish.

2. Edge Computing
Edge computing is becoming increasingly important in the world of IoT, and MQTT plays a crucial role in enabling it. Edge computing involves processing data locally on the devices rather than sending it back to a central server. This reduces the amount of data that needs to be transmitted, improves latency, and can reduce costs. MQTT is ideal for edge computing because it is lightweight and easy to implement on resource-constrained devices.

3. Security
Security is always a major concern in IoT, and MQTT is no exception. MQTT 5.0 adds an enhanced authentication mechanism (the AUTH packet), which supports challenge/response flows such as SCRAM, along with more detailed reason codes for rejected connections. In practice, MQTT deployments rely on TLS to encrypt traffic in transit, and payloads can additionally be encrypted end to end at the application level, providing an extra layer of protection against eavesdropping and other forms of cyberattack.

4. QoS
Quality of Service (QoS) is a critical factor in IoT, and MQTT provides several options for controlling it. MQTT allows users to specify the level of QoS they require for each message they send. There are three levels: QoS 0 (at most once), QoS 1 (at least once), and QoS 2 (exactly once). Choosing the appropriate level of QoS is important because it affects the reliability and speed of message delivery (see the short sketch after this list).

5. Interoperability
Interoperability is another important consideration in IoT, and MQTT is designed to be highly interoperable. It is a standard protocol supported by a wide range of devices and platforms, making it easier for different devices to communicate with each other. Because MQTT is payload-agnostic, any data format, including JSON, can be exchanged between devices that would otherwise use different formats.

6. Cloud Integration
Cloud integration is becoming increasingly important in IoT, and MQTT is well suited to integrating with cloud platforms. MQTT can connect devices to cloud-based brokers, making it easier to manage large numbers of devices and process large amounts of data. Additionally, many cloud platforms, such as AWS IoT, provide native support for MQTT, which makes it easy to set up and use.

7. AI and Machine Learning
Finally, MQTT works well in conjunction with AI and machine learning. MQTT can be used to send data from devices to machine learning models for analysis and processing, and to send commands back to devices based on the models' results. This makes it possible to create intelligent, self-learning IoT systems that can adapt and improve over time.
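To make trends 1 and 4 concrete, here is a minimal sketch of publishing a message with an explicit QoS level and MQTT 5.0 user properties, assuming the paho-mqtt 1.x Python client; the broker host, topic, and property values are placeholders, not anything prescribed by this article.

Python

import paho.mqtt.client as mqtt
from paho.mqtt.properties import Properties
from paho.mqtt.packettypes import PacketTypes

# Connect to a broker using the MQTT 5.0 protocol (placeholder host and port)
client = mqtt.Client(client_id="sensor-42", protocol=mqtt.MQTTv5)
client.connect("broker.example.com", 1883)
client.loop_start()

# Attach MQTT 5.0 metadata (user properties and content type) to the publication
props = Properties(PacketTypes.PUBLISH)
props.UserProperty = ("sensor-type", "temperature")
props.ContentType = "application/json"

# QoS 0: at most once; QoS 1: at least once; QoS 2: exactly once
client.publish("plant1/line3/temperature", '{"celsius": 21.7}', qos=1, properties=props)

client.loop_stop()
client.disconnect()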
Conclusion

MQTT is a powerful and versatile protocol that is helping to shape the future of IoT. Its lightweight and flexible nature makes it ideal for use in various applications, and its support for edge computing, security, QoS, interoperability, cloud integration, and AI and machine learning makes it a key technology for the future of IoT. As the number of connected devices continues to grow, MQTT will play an increasingly important role in enabling efficient and secure data exchange between them.

One of the key advantages of MQTT is its simplicity, which makes it easy to implement and use. This, combined with its interoperability and flexibility, has made it a popular choice for many IoT applications, from smart homes to industrial automation. As the demand for IoT continues to grow, we can expect to see even more innovations in MQTT and other IoT technologies. For example, we may see further security improvements, such as integrating blockchain technology, which can provide additional protection against cyberattacks. We may also see the development of new standards and protocols specifically designed for IoT, which could further improve interoperability and reliability.

Overall, MQTT is a technology that is well-positioned to shape the future of IoT. Its versatility, flexibility, and ease of use make it an ideal protocol for applications ranging from consumer electronics to industrial automation. As IoT continues to evolve, we can expect MQTT to play an increasingly important role in enabling efficient and secure data exchange between devices and in facilitating the development of intelligent, self-learning systems.
As the digital age progresses, the need for efficient and secure data governance practices becomes more crucial than ever. This article delves into the concept of User Data Governance and its implementation using serverless streaming. We will explore the benefits of using serverless streaming for processing user data and how it can lead to improved data governance and increased privacy protection. Additionally, we will provide code snippets to illustrate the practical implementation of serverless streaming for user data governance.

Introduction

User Data Governance refers to the management of user data, including its collection, storage, processing, and protection. With the ever-increasing amount of data generated daily, organizations must develop robust and efficient data governance practices to ensure data privacy, security, and compliance with relevant regulations. In recent years, serverless computing has emerged as a promising solution to the challenges of data governance. This paradigm shift allows organizations to build and run applications without managing the underlying infrastructure, enabling them to focus on their core business logic. Serverless streaming, in particular, has shown great potential in processing large volumes of user data in real time, with minimal latency and scalable performance.

Serverless Streaming for User Data Processing

Serverless streaming is a cloud-based architecture that enables real-time data processing without the need to provision or manage servers. It provides on-demand scalability and cost-effectiveness, making it an ideal choice for processing large volumes of user data. This section examines the key components of serverless streaming for user data governance.

1.1. Event Sources

An event source is any system or application that generates data in real time. These sources can include user activity logs, IoT devices, social media feeds, and more. By leveraging serverless streaming, organizations can ingest data from these diverse sources without worrying about infrastructure management. For example, consider an AWS Kinesis data stream that ingests user activity logs:

Python

import boto3

kinesis_client = boto3.client('kinesis', region_name='us-west-2')

response = kinesis_client.create_stream(
    StreamName='UserActivityStream',
    ShardCount=1
)

1.2. Stream Processing

Stream processing involves the real-time analysis of data as it is generated by event sources. Serverless platforms, such as AWS Lambda, Google Cloud Functions, and Azure Functions, enable developers to create functions that process data streams without managing the underlying infrastructure. These functions can be triggered by specific events, allowing for the real-time processing of user data. For instance, an AWS Lambda function that processes user activity logs from the Kinesis data stream:

Python

import base64
import json

def lambda_handler(event, context):
    for record in event['Records']:
        # Kinesis record data arrives base64-encoded
        payload = json.loads(base64.b64decode(record['kinesis']['data']))
        process_user_activity(payload)

def process_user_activity(activity):
    # Process user activity data here
    pass

1.3. Data Storage

The processed data must be stored securely to ensure proper data governance. Serverless storage solutions, such as Amazon S3, Google Cloud Storage, and Azure Blob Storage, offer scalable and secure storage options that automatically scale with the size of the data.
For example, storing processed user activity data in an Amazon S3 bucket:

Python

import json
import boto3

s3_client = boto3.client('s3')

def store_processed_data(data, key):
    s3_client.put_object(
        Bucket='my-processed-data-bucket',
        Key=key,
        Body=json.dumps(data)
    )

Benefits of Serverless Streaming for User Data Governance

The serverless streaming architecture offers several benefits for user data governance, including:

2.1. Scalability

One of the main advantages of serverless streaming is its ability to scale automatically based on the volume of incoming data. This ensures that organizations can handle fluctuating workloads, such as seasonal trends or unexpected surges in user activity, without the need to over-provision resources.

2.2. Cost-Effectiveness

Serverless streaming follows a pay-as-you-go pricing model, meaning organizations only pay for the resources they actually consume. This eliminates the need for upfront investments in infrastructure and reduces overall operational costs.

2.3. Flexibility

Serverless streaming allows organizations to process data from multiple event sources and quickly adapt their data processing pipelines to changing business requirements. This flexibility enables them to stay agile and responsive to evolving user data governance needs.

2.4. Security

With serverless streaming, organizations can implement various security measures, such as encryption, data masking, and access control, to protect user data at rest and in transit. Additionally, serverless platforms typically offer built-in security features, such as automatic patching and monitoring, to ensure a high level of data protection.

Compliance and Privacy in Serverless Streaming

As organizations adopt serverless streaming for user data governance, they must address several privacy and compliance concerns, including:

3.1. Data Sovereignty

Data sovereignty refers to the concept that data should be stored and processed within the borders of the country where it was generated. Serverless streaming platforms must support multi-region deployment to comply with data sovereignty requirements and ensure proper user data governance.

3.2. GDPR and Other Data Protection Regulations

Organizations must adhere to the General Data Protection Regulation (GDPR) and other data protection laws when processing user data. Serverless streaming platforms should provide features to facilitate compliance, such as data anonymization, deletion, and consent management.

3.3. Privacy by Design

Privacy by Design is a proactive approach to data privacy that embeds privacy considerations into the design and architecture of systems and processes. Serverless streaming platforms should support Privacy by Design principles, enabling organizations to implement privacy-enhancing techniques and best practices.

Best Practices for Implementing User Data Governance With Serverless Streaming

To ensure robust user data governance using serverless streaming, organizations should follow these best practices:

4.1. Assess Data Sensitivity

Before processing user data, organizations should evaluate the sensitivity of the data and apply appropriate security measures based on the data classification.

4.2. Encrypt Data at Rest and in Transit

Data should be encrypted both at rest (when stored) and in transit (during processing and transmission) to protect against unauthorized access (a short sketch follows below).

4.3. Implement Access Control

Organizations should implement strict access control policies to limit who can access and process user data.
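As a minimal sketch of practice 4.2, extending the S3 example above, the snippet below stores processed data with server-side encryption enabled; the bucket name and KMS key alias are placeholders, and boto3 itself already uses HTTPS (TLS) for the data in transit.

Python

import json
import boto3

s3_client = boto3.client('s3')  # API calls travel over HTTPS, covering encryption in transit

def store_encrypted_data(data, key):
    # Request server-side encryption with a customer-managed KMS key (placeholder alias)
    s3_client.put_object(
        Bucket='my-processed-data-bucket',
        Key=key,
        Body=json.dumps(data),
        ServerSideEncryption='aws:kms',
        SSEKMSKeyId='alias/user-data-governance'
    )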
Access control should combine role-based access control (RBAC) with the principle of least privilege (POLP).

4.4. Monitor and Audit

Continuous monitoring and auditing of serverless streaming platforms are essential to ensure data governance, detect security incidents, and maintain compliance with relevant regulations.

4.5. Leverage Data Retention Policies

Organizations should implement data retention policies to ensure that user data is stored only for the duration necessary and is deleted when no longer needed.

Conclusion

User Data Governance is an essential aspect of modern digital businesses, and serverless streaming offers a promising approach to addressing its challenges. By leveraging the scalability, cost-effectiveness, and flexibility of serverless streaming, organizations can process and manage large volumes of user data more efficiently and securely. By adhering to best practices and regulatory requirements, organizations can ensure robust user data governance and privacy protection using serverless streaming.
IoT devices have revolutionized the way businesses collect and utilize data. IoT devices generate an enormous amount of data that can provide valuable insights for informed decision-making. However, processing this data in real time can be a significant challenge, particularly when managing large data volumes from numerous sources. This is where Apache Kafka and Kafka data streams come into play. Apache Kafka is a distributed streaming platform that can handle large amounts of data in real time. It is a messaging system commonly used for sending and receiving data between systems and applications. It can also be used as a data store for real-time processing. Kafka data streams provide a powerful tool for processing and analyzing data in real time, enabling real-time analytics and decision-making. One of the most important applications of Kafka data streams is real-time monitoring. IoT devices can be used to monitor various parameters, such as temperature, humidity, and pressure. By using Kafka data streams, this data can be processed and analyzed in real time, allowing for early detection of issues and immediate response. This can be particularly beneficial in manufacturing, where IoT devices can monitor machine performance and alert maintenance personnel to potential problems. Another application of Kafka data streams is predictive maintenance. By analyzing IoT data in real-time using Kafka data streams, it is possible to predict when maintenance will be required on devices. This can help to prevent downtime and reduce maintenance costs. For instance, sensors in vehicles can monitor engine performance and alert the driver to potential problems before they cause a breakdown. Energy management is another area where IoT devices can be leveraged using Kafka data streams. IoT devices can be used to monitor energy consumption in real time. By using Kafka data streams, this data can be analyzed to identify energy-saving opportunities and optimize energy usage. For example, smart buildings can use sensors to monitor occupancy and adjust heating and cooling systems accordingly. Smart cities are another application of Kafka data streams for IoT devices. IoT devices can be used to monitor and control various aspects of city life, such as traffic flow, air quality, and waste management. By using Kafka data streams, this data can be processed and analyzed in real-time, allowing for quick response to changing conditions and improved quality of life for residents. For example, sensors in smart traffic lights can adjust the timing of the lights to reduce congestion and improve traffic flow. One of the advantages of using Kafka data streams for IoT devices is that it enables real-time analytics and decision-making. This is important because it allows businesses to respond quickly to changing conditions and make informed decisions based on current data. The real-time nature of Kafka data streams means that businesses can monitor and analyze data as it is generated rather than waiting for batch processing to occur. This enables businesses to be more agile and responsive. 
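The Apache Camel route shown next pulls device data from a REST API; as a simpler alternative sketch, a device gateway could also publish readings straight into a Kafka topic with a plain Kafka client. The example below assumes the kafka-python package, a local broker, and a hypothetical iot-data topic, none of which are prescribed by this article.

Python

import json
import time
import random
from kafka import KafkaProducer

# Serialize readings as JSON so downstream consumers can parse them easily
producer = KafkaProducer(
    bootstrap_servers='localhost:9092',
    value_serializer=lambda v: json.dumps(v).encode('utf-8')
)

while True:
    reading = {
        'device_id': 'sensor-7',
        'temperature': round(random.uniform(20.0, 90.0), 1),
        'pressure': random.randint(60, 140),
        'timestamp': int(time.time())
    }
    # Key by device ID so readings from one device stay ordered on a partition
    producer.send('iot-data', key=b'sensor-7', value=reading)
    time.sleep(1.0)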
We are using Apache Camel to consume IoT data from a REST API and write it to a Kafka topic:

Java

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.component.kafka.KafkaConstants;
import org.apache.camel.model.dataformat.JsonLibrary;

public class RestApiToKafkaRoute extends RouteBuilder {

    @Override
    public void configure() throws Exception {

        // Set up Kafka component: log every message arriving on the configured endpoint
        from("kafka:{{kafka.bootstrap.servers}}")
            .routeId("kafka")
            .to("log:received-message");

        // Set up REST API component: poll the device API on a timer
        from("timer://rest-api-timer?period={{rest.api.timer.period}}")
            .routeId("rest-api")
            .to("rest:get:{{rest.api.url}}")
            .unmarshal().json(JsonLibrary.Jackson, DeviceData.class)
            .split(body())
            .process(exchange -> {
                // Extract device ID from data and set Kafka topic header
                DeviceData deviceData = exchange.getIn().getBody(DeviceData.class);
                String deviceId = deviceData.getDeviceId();
                exchange.getMessage().setHeader(KafkaConstants.TOPIC, deviceId);
            })
            .marshal().json(JsonLibrary.Jackson)
            .to("kafka:{{kafka.topic}}");
    }
}

KSQL is a streaming SQL engine for Apache Kafka. It enables real-time data processing and analysis by providing a simple SQL-like language for working with Kafka data streams. KSQL makes it easy to create real-time dashboards and alerts that can be used for monitoring and decision-making.

Real-time dashboards are an important tool for monitoring IoT devices using Kafka data streams. Dashboards can be used to display key performance indicators (KPIs) in real time, allowing businesses to monitor the health and performance of their IoT devices. Dashboards can also be used to visualize data trends and patterns, making it easier to identify opportunities for optimization and improvement.

Alerts are another important tool for monitoring IoT devices using Kafka data streams. Alerts can be used to notify businesses when certain conditions are met, such as when a device exceeds a certain threshold or when a potential issue is detected. Alerts can be sent via email, SMS, or other means, allowing businesses to respond quickly to potential issues.

A sample KSQL query set for an IoT data alert dashboard:

SQL

CREATE STREAM pressure_alerts AS
  SELECT device_id, pressure
  FROM iot_data_stream
  WHERE pressure > 100;

CREATE STREAM pressure_alerts_stream (device_id VARCHAR, pressure INT, alert_type VARCHAR)
  WITH (kafka_topic='pressure_alerts', value_format='JSON');

CREATE TABLE pressure_alert_count AS
  SELECT alert_type, COUNT(*)
  FROM pressure_alerts_stream
  WINDOW TUMBLING (SIZE 1 MINUTE)
  GROUP BY alert_type;

SELECT * FROM pressure_alert_count;

KSQL also provides a real-time dashboard for monitoring and visualizing data in Kafka data streams. The dashboard can display real-time data streams and visualizations and can be used to track performance metrics and detect anomalies in real time. This enables users to gain real-time insights and make informed decisions based on the data.

Below is a sample program that consumes data from a Kafka topic and issues an alert based on a predetermined threshold, in this case a pressure level exceeding 100.
Java

import com.twilio.Twilio;
import com.twilio.rest.api.v2010.account.Message;
import com.twilio.type.PhoneNumber;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

import java.util.Properties;

public class AlertTrigger {

    // Set Twilio Account SID and Auth Token
    public static final String ACCOUNT_SID = "your_account_sid_here";
    public static final String AUTH_TOKEN = "your_auth_token_here";

    // Set Twilio phone number and mobile app endpoint
    public static final String TWILIO_PHONE_NUMBER = "+1234567890";
    public static final String MOBILE_APP_ENDPOINT = "https://your.mobile.app/endpoint";

    public static void main(String[] args) {
        // Set up properties for Kafka Streams
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "alert-trigger");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        // Build Kafka Streams topology
        StreamsBuilder builder = new StreamsBuilder();

        // Read data from Kafka topic
        KStream<String, String> input = builder.stream("iot-data");

        // Define KSQL query for alert trigger (illustrative only; not executed by this program)
        String ksql = "SELECT device_id, pressure FROM iot-data WHERE pressure > 100";

        // Set up Twilio client
        Twilio.init(ACCOUNT_SID, AUTH_TOKEN);

        // Process alerts
        input.filter((key, value) -> {
                // Check for the alert condition here:
                // if pressure is greater than 100, trigger an alert.
                // Replace this placeholder with your actual threshold check.
                return true;
            })
            .mapValues(value -> {
                // Create alert message
                return "Pressure has exceeded threshold value of 100!";
            })
            .peek((key, value) -> {
                // Send notification to the mobile app endpoint via Twilio
                Message message = Message.creator(
                        new PhoneNumber(MOBILE_APP_ENDPOINT),
                        new PhoneNumber(TWILIO_PHONE_NUMBER),
                        value
                ).create();
            })
            .to("alert-topic", Produced.with(Serdes.String(), Serdes.String()));

        // Create Kafka Streams application and start processing
        // (the topology is fully defined above before the streams instance is built)
        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Gracefully shut down Kafka Streams application
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}

Overall, Apache Kafka and Kafka data streams, combined with Kafka Connect and KSQL, offer a powerful toolset for processing and analyzing real-time data from IoT devices. By integrating IoT devices with Kafka data streams, organizations can gain real-time insights and improve decision-making, leading to significant improvements in efficiency, cost savings, and quality of life. The KSQL dashboard provides a powerful way to visualize and monitor the data in real time, allowing users to quickly identify trends, anomalies, and potential issues. With the continued growth of IoT devices and the increasing demand for real-time analytics, Kafka data streams and KSQL are likely to become even more important in the years to come.
Message Queuing Telemetry Transport (MQTT) is the standard messaging protocol for the Internet of Things (IoT). MQTT follows an extremely lightweight publish-subscribe messaging model, connecting IoT devices in a scalable, reliable, and efficient manner. It's been over 20 years since MQTT was invented in 1999 by IBM and 10 years since the popular open-source MQTT broker EMQX launched on GitHub in 2012. As we move into 2023 and look forward to the years ahead, we can anticipate seven developing trends in MQTT technology, as the use of MQTT in IoT is growing tremendously and diversely, driven by the progress of emerging technologies.

MQTT Over QUIC

Quick UDP Internet Connections (QUIC) is a transport protocol, originally developed by Google and now standardized by the IETF, that runs over UDP and is designed to reduce the latency of establishing new connections, increase data transfer rates, and address the limitations of TCP. HTTP/3, the latest HTTP protocol version, uses QUIC as its transport layer and, as a result, has lower latency and a better loading experience for web applications than HTTP/2.

MQTT over QUIC is the most innovative advancement in the MQTT protocol since the first release of the MQTT 5.0 specification in 2017. With multiplexing and faster connection establishment and migration, it has the potential to become the next generation of the MQTT standard. The MQTT 5.0 protocol specification defines three types of transport: TCP, TLS, and WebSocket. MQTT over TLS/SSL is widely used in production to secure communications between MQTT clients and brokers, as security is a top priority for IoT applications. However, it is slow and has high latency, requiring seven handshake steps (three for TCP and four for TLS) to establish a new MQTT connection. MQTT over QUIC, with 1-RTT connection establishment and 0-RTT reconnection, is faster and has lower latency than MQTT over TLS. The QUIC stack can also be customized for various use cases, such as keeping connections alive in poor networking conditions and scenarios that need low client-to-server latency. It will benefit connected cars with unreliable cellular networks and low-latency industrial IoT applications. The adoption of MQTT over QUIC is expected to play a vital role in the future of IoT, Industrial IoT (IIoT), and the Internet of Vehicles (IoV). EMQX has introduced MQTT over QUIC support in its latest version, 5.0. And, much as HTTP/3 did with QUIC, a future version of the MQTT protocol, MQTT 5.1 or 6.0, may adopt QUIC as its primary transport layer.

MQTT Serverless

The serverless trend in cloud computing marks a groundbreaking paradigm shift in how applications are designed, developed, deployed, and run. This paradigm enables developers to focus on their application's business logic instead of managing infrastructure, resulting in enhanced agility, scalability, and cost-effectiveness. The serverless MQTT broker emerges as a cutting-edge architectural innovation for 2023. In contrast to traditional IoT architectures, which require minutes to hours for creating MQTT-hosted services on the cloud or deploying them on-premises, serverless MQTT enables rapid deployment of MQTT services with just a few clicks. Moreover, the true value proposition of serverless MQTT lies not in its deployment speed but in its unparalleled flexibility. This flexibility manifests in two key aspects: the seamless scaling of resources in response to user demands and the pay-as-you-go pricing model that aligns with this elastic architecture.
As a result, serverless MQTT is poised to drive broader adoption of MQTT, reducing operational costs and spurring innovation and collaboration across diverse industries. We might even see a free serverless MQTT broker for every IoT and Industrial IoT developer. In March 2023, EMQX Cloud launched the world's first serverless MQTT service, offering users not only an incredibly fast deployment time of just 5 seconds but also the exceptional flexibility that truly sets serverless MQTT apart.

MQTT Multi-Tenancy

Multi-tenancy architecture is a vital aspect of a serverless MQTT broker. IoT devices from different users or tenants can connect to the same large-scale MQTT cluster while keeping their data and business logic isolated from other tenants. SaaS applications commonly use multi-tenancy architecture, where a single application serves multiple customers or tenants. There are usually two ways to implement multi-tenancy in SaaS:

Tenant Isolation: A separate application instance is provided to each tenant, running on a server or virtual machine.
Database Isolation: Multiple tenants share a single application instance, but each tenant has its own database schema to ensure data isolation.

In the multi-tenancy architecture of the MQTT broker, each device and tenant is given a separate and isolated namespace. This namespace includes a unique topic prefix and access control lists (ACLs) that define which topics each user can access, publish to, or subscribe to. An MQTT broker with multi-tenancy support reduces management overhead and allows greater flexibility for complex scenarios or large-scale IoT applications. For example, departments and applications in a large organization could use the same MQTT cluster as different tenants.

MQTT Sparkplug 3.0

MQTT Sparkplug 3.0 is the latest version of MQTT Sparkplug, the open standard specification designed by the Eclipse Foundation. It defines how to connect industrial devices, including sensors, actuators, Programmable Logic Controllers (PLCs), and gateways, using the MQTT messaging protocol. MQTT Sparkplug 3.0 was released in November 2022 with some key new features and improvements:

MQTT 5 Support: MQTT Sparkplug 3.0 supports the MQTT 5 protocol, which includes several new features such as shared subscriptions, message expiry, and flow control.
Optimized Data Transmission: MQTT Sparkplug 3.0 includes several optimizations for data transmission, including the use of more compact data encoding and compression algorithms.
Expanded Data Model: MQTT Sparkplug 3.0 introduces an expanded data model, which allows more detailed device information to be communicated, as well as additional information such as configuration data and device metadata.
Improved Security: MQTT Sparkplug 3.0 includes several improvements to security, including support for mutual TLS authentication and improved access control mechanisms.
Simplified Device Management: MQTT Sparkplug 3.0 includes several improvements to device management, including automatic device registration and discovery, simplified device configuration, and improved diagnostics.

MQTT Sparkplug aims to simplify connecting and communicating with disparate industrial devices and to achieve efficient industrial data acquisition, processing, and analysis. With the new version released, MQTT Sparkplug 3.0 has the potential to be more widely adopted in the Industrial IoT.

MQTT Unified Namespace

Unified Namespace is a solution architecture built on the MQTT broker for Industrial IoT and Industry 4.0.
It provides a unified namespace for MQTT topics and a centralized repository for messages and structured data. Unified Namespace connects industrial devices, sensors, and applications, such as SCADA, MES, and ERP, in a star topology with a central MQTT broker. Unified Namespace dramatically simplifies the development of industrial IoT applications with an event-driven architecture. In traditional IIoT systems, OT and IT systems have generally been separate and operated independently with their own data, protocols, and tools. By adopting Unified Namespace, OT and IT systems can exchange data more efficiently and finally unify OT and IT in the IoT era. In 2023, with the EMQX or NanoMQ MQTT broker combined with Neuron Gateway, the latest open-source IIoT connectivity server, building a UNS architecture on the most advanced technology from the IT world is within grasp.

MQTT Geo-Distribution

MQTT Geo-Distribution is an innovative architecture that allows MQTT brokers deployed in different regions or clouds to work together as a single cluster. Using Geo-Distribution, MQTT messages can be automatically synchronized and delivered across MQTT brokers in different regions. In 2023, we can expect two approaches to implementing MQTT Geo-Distribution:

Single Cluster, Multi-Region: A single MQTT cluster with brokers running in different regions.
Multi-Cluster, Multi-Cloud: Multiple MQTT clusters connected with Cluster Linking in different clouds.

The two approaches can be combined to create a reliable IoT messaging infrastructure across geographically distributed MQTT brokers. By adopting MQTT Geo-Distribution, organizations can build a global MQTT access network across multiple clouds, where devices and applications connected locally through the closest network endpoint can communicate with each other regardless of their physical location.

MQTT Streams

MQTT Streams is an expected extension of the MQTT protocol that enables the handling of high-volume, high-frequency data streams in real time within an MQTT broker. This feature enhances the capabilities of traditional MQTT brokers, which were originally designed for lightweight publish/subscribe messaging. With MQTT Streams, clients can produce and consume MQTT messages as streams, similar to how Apache Kafka works. This allows for historical message replay, which is essential for event-driven processing, eventual data consistency, auditing, and compliance. Stream processing is crucial for extracting real-time business value from the massive amounts of data generated by IoT device sensors. Previously, this required an outdated, complex big data stack involving the integration of an MQTT broker with Kafka, Hadoop, Flink, or Spark for IoT data stream processing. With built-in stream processing, however, MQTT Streams streamlines the IoT data processing stack, improves data processing efficiency and response time, and provides a unified messaging and streaming platform for IoT. By supporting features such as message deduplication, message replay, and message expiration, MQTT Streams enables high throughput, low latency, and fault tolerance, making it a powerful tool for handling real-time data streams in MQTT-based IoT applications.

Conclusion

Overall, these seven trends in MQTT technology reflect the progress of emerging technologies and their role in advancing the IoT. As a standard messaging protocol that has evolved over two decades, MQTT continues to grow in importance.
With the increasing use of IoT in various industries, the MQTT protocol is evolving to meet new challenges and demands, such as faster, lower-latency connections, more rapid deployment of MQTT services, greater flexibility for complex scenarios and large-scale IoT applications, and broader support for connecting various industrial devices. With these developments, MQTT will become the nervous system of IoT and an even more crucial player in IIoT and the Internet of Vehicles (IoV) in 2023 and beyond.
Any trustworthy data streaming pipeline needs to be able to identify and handle faults. This is especially true when IoT devices continuously ingest critical data/events into permanent persistence storage like an RDBMS for future analysis via a multi-node Apache Kafka cluster. There could be scenarios where IoT devices send faulty or bad events for various reasons at the source, and appropriate actions can then be taken to correct them.

The Apache Kafka architecture does not include any filtering and error handling mechanism within the broker so that maximum performance/scale can be achieved. Instead, it is included in Kafka Connect, the integration framework of Apache Kafka. As a default behavior, if a problem arises as a result of consuming an invalid message, the Kafka Connect task terminates, and the same applies to the JDBC sink connector. Kafka Connect connectors are classified into two categories: source connectors (to ingest data from various data generation sources and transport it to topics) and sink connectors (to consume data/messages from topics and send them on to various destinations).

Without implementing a strict filtering mechanism or exception handling, we can ingest and publish messages, including wrongly formatted ones, to a Kafka topic because the topic accepts all messages or records as byte arrays in key-value pairs. But by default, the Kafka Connect task stops if an error occurs while consuming an invalid message, and on top of that, the JDBC sink connector won't work if there is any ambiguity in the message schema. The biggest difficulty with the JDBC sink connector is that it requires knowledge of the schema of data that has already landed on the Kafka topic. Schema Registry must therefore be integrated as a separate component with the existing Kafka cluster to transfer the data into the RDBMS, and the producers must publish messages/data containing the schema. (Streaming data via the Kafka JDBC sink connector without leveraging Schema Registry is possible but is covered in a separate article.)

Since Apache Kafka 2.0, Kafka Connect has included error management features, such as the ability to reroute messages to a dead letter queue. In a Kafka cluster, a dead letter queue (DLQ) is a straightforward topic that serves as the destination for messages that, for some reason, were unable to reach their intended destination; for the JDBC sink connector, that destination is tables in the RDBMS.

There are two major reasons why the JDBC Kafka sink connector may stop working abruptly while consuming messages from the topic:

Ambiguity between data types and the actual payload
Junk data in the payload or a wrong schema

Configuring a DLQ for the JDBC sink connector is not complicated. The following parameters need to be added to the sink configuration file (.properties file):

errors.tolerance=all
errors.deadletterqueue.topic.name=<<Name of the DLQ topic>>
errors.deadletterqueue.topic.replication.factor=<<Number of replicas>>

Note: the number of replicas should be equal to or less than the number of Kafka brokers in the cluster. The DLQ topic is created automatically with the above-mentioned replication factor when we start the JDBC sink connector for the first time. A fuller configuration sketch is shown below.
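For illustration, a minimal JDBC sink connector .properties sketch with the DLQ settings in place might look like the following. The error-handling keys and the Confluent JDBC sink connector class are standard Kafka Connect settings, while the topic, database connection details, and DLQ topic name are placeholders to adapt to your environment.

Properties

name=iot-jdbc-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
topics=iot-data
connection.url=jdbc:postgresql://localhost:5432/iotdb
connection.user=postgres
connection.password=<<password>>
insert.mode=insert
auto.create=true

# Error handling: keep the task running and reroute bad records to a DLQ topic
errors.tolerance=all
errors.deadletterqueue.topic.name=iot-data-dlq
errors.deadletterqueue.topic.replication.factor=1
# Optionally capture the failure reason in record headers for later inspection
errors.deadletterqueue.context.headers.enable=true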
When an error occurs or bad data is encountered by the JDBC sink connector while consuming messages from the topic, those unprocessed messages/bad data are forwarded straight to the DLQ, while correct messages or data continue to be sent to the respective RDBMS tables. Whenever bad messages are encountered in between, they are likewise forwarded to the DLQ, and so on. After the bad or erroneous messages land on the DLQ, we have two options: either manually inspect each message to understand the root cause of the error, or implement a mechanism to reprocess the bad messages and eventually push them to their consumers; for the JDBC sink connector, that destination is the RDBMS tables. For this reason, dead letter queues are not enabled by default in Kafka Connect. Even though Kafka Connect supports several error-handling strategies, such as dead letter queues, silently ignoring, and failing fast, adopting a DLQ is the best approach when configuring the JDBC sink connector. Completely decoupling the handling of bad/error messages from the normal transport of messages/data from the Kafka topic boosts the overall efficiency of the entire system and allows the development team to build an independent error-handling mechanism that is easy to maintain.
Authentication is the process of identifying a user and verifying that they have access to a system or server. It is a security measure that protects the system from unauthorized access and guarantees that only valid users are using the system. Given the expansive nature of the IoT industry, it is crucial to verify the identity of those seeking access to its infrastructure. Unauthorized entry poses significant security threats and must be prevented, which is why IoT developers should possess a comprehensive understanding of the various authentication methods. Today, I'll explain how authentication works in MQTT, what security risks it solves, and introduce the first authentication method: password-based authentication.

What Is Authentication in MQTT?

Authentication in MQTT refers to the process of verifying the identity of a client or a broker before allowing them to establish a connection or interact with the MQTT network. It is only about the right to connect to the broker and is separate from authorization, which determines which topics a client is allowed to publish and subscribe to. Authorization will be discussed in a separate article in this series. The MQTT broker can authenticate clients mainly in the following ways:

Password-based authentication: The broker verifies that the client has the correct connecting credentials: username, client ID, and password. The broker can verify either the username or the client ID against the password.
Enhanced authentication (SCRAM): This authenticates clients using a back-and-forth, challenge-based mechanism known as the Salted Challenge Response Authentication Mechanism.
Other methods, including token-based authentication such as JWT, HTTP hooks, and more.

In this article, we will focus on password-based authentication.

Password-Based Authentication

Password-based authentication aims to determine whether the connecting party is legitimate by verifying that they have the correct password credentials. In MQTT, password-based authentication generally refers to using a username and password to authenticate clients, which is the recommended approach. However, in some scenarios, a client may not carry a username, so the client ID can also be used as a unique identifier to represent its identity. When an MQTT client connects to the broker, it sends its username and password in the CONNECT packet. A Wireshark capture of such a CONNECT packet would show the client ID, username, and password (for example, client1, user, and MySecretPassword) directly in the packet.

After the broker gets the username (or client ID) and password from the CONNECT packet, it looks up the previously stored credentials in the corresponding database according to the username and then compares them with the password provided by the client. If the username is not found in the database, or the password does not match the credentials in the database, the broker rejects the client's connection request. A broker can, for example, use PostgreSQL as the database against which it authenticates the client's username and password.

Password-based authentication solves one security risk: clients that do not hold the correct credentials (username and password) will not be able to connect to the broker. However, as a Wireshark capture makes clear, a hacker with access to the communication channel can easily sniff the packets and see the connection credentials, because everything is sent in plaintext. We will see in a later article in this series how to solve this problem using TLS (Transport Layer Security).
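As a minimal client-side sketch of the flow described above, the snippet below connects to a broker with a username and password using the paho-mqtt Python client; the broker address, client ID, and credentials mirror the illustrative values in this article and are placeholders rather than anything you should reuse.

Python

import paho.mqtt.client as mqtt

# Client ID, username, and password are sent to the broker in the CONNECT packet
client = mqtt.Client(client_id="client1")
client.username_pw_set("user", "MySecretPassword")

# Without TLS, these credentials cross the network in plaintext;
# a later article covers enabling TLS (e.g., client.tls_set()) to protect them.
client.connect("broker.example.com", 1883)
client.loop_start()

client.publish("devices/client1/status", "online", qos=1)

client.loop_stop()
client.disconnect()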
Secure Your Passwords With Salt and Hash

Storing passwords in plaintext is not considered secure practice because it leaves passwords vulnerable to attacks. If an attacker gains access to a password database or file, they can easily read and use the passwords to gain unauthorized access to the system. To prevent this from happening, passwords should instead be stored in a hashed and salted format.

What is a hash? A hash function takes some input data, applies a mathematical algorithm to the data, and then generates an output that looks like complete nonsense. The idea is to obfuscate the original input data, and the function should also be one-way: there is no way to calculate the input given the output. However, hashes by themselves are not secure and can be vulnerable to dictionary attacks, as the following example shows. Consider this sha256 hash:

8f0e2f76e22b43e2855189877e7dc1e1e7d98c226c95db247cd1d547928334a9

It looks secure; you cannot tell what the password is by looking at it. The problem, however, is that for a given password, the hash always produces the same result. So, it is easy to create a database of common passwords and their hash values, and a hacker could look up this hash in an online hash database and learn that the password is passw0rd.

"Salting" a password solves this problem. A salt is a random string of characters that is added to the password before hashing. This makes each password hash unique, even if the passwords themselves are the same. The salt value is stored alongside the hashed password in the database. When a user logs in, the salt is added to their password, and the resulting hash is compared to the hash stored in the database. If the hashes match, the user is granted access. Suppose that we add a random string of text to the password before we perform the hash function; the random string is called the salt value. For example, with a salt value of az34ty1, sha256(passw0rdaz34ty1) is:

6be5b74fa9a7bd0c496867919f3bb93406e21b0f7e1dab64c038769ef578419d

This is unlikely to be in a hash database, since that would require a large number of database entries just for the single plaintext value passw0rd. (A short code sketch of salting and hashing follows the best practices below.)

Best Practices for Password-Based Authentication in MQTT

Here are some key takeaways from this article, which double as best practices for password-based authentication in MQTT: One of the most important aspects of password-based authentication in MQTT is choosing strong and unique passwords. Passwords that are easily guessable or reused across multiple accounts can compromise the security of the entire MQTT network. It is also crucial to securely store and transmit passwords to prevent them from falling into the wrong hands. For instance, passwords should be hashed and salted before storage and transmitted over secure channels like TLS. In addition, it's a good practice to limit password exposure by avoiding hard-coding passwords in code or configuration files, and instead using environment variables or other secure storage mechanisms.
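To tie the salting discussion above together, here is a minimal Python sketch of storing and verifying a salted password hash. It uses SHA-256 to stay close to the article's examples, though in production a dedicated password-hashing function such as PBKDF2 or bcrypt is preferable; the function and variable names are illustrative only.

Python

import hashlib
import hmac
import os

def hash_password(password, salt=None):
    # Generate a random 16-byte salt when none is supplied
    if salt is None:
        salt = os.urandom(16)
    # Append the salt to the password, as in sha256(passw0rdaz34ty1) above
    digest = hashlib.sha256(password.encode("utf-8") + salt).hexdigest()
    return salt, digest

def verify_password(password, salt, stored_digest):
    # Recompute the hash with the stored salt and compare in constant time
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_digest)

# The broker would store (salt, digest) per user instead of the plaintext password
salt, digest = hash_password("passw0rd")
print(verify_password("passw0rd", salt, digest))    # True
print(verify_password("wrong-pass", salt, digest))  # False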
Summary

Password-based authentication plays a critical role in securing MQTT connections and protecting the integrity of IoT systems. By following best practices for password selection, storage, and transmission, and by being aware of common issues like brute-force attacks, IoT developers can help ensure the security of their MQTT networks. However, it's important to note that password-based authentication is just one of many authentication methods available in MQTT and may not always be the best fit for every use case. For instance, more advanced methods like digital certificates or OAuth 2.0 may provide stronger security in certain scenarios. Therefore, it's important for IoT developers to stay up to date with the latest authentication methods and choose the one that best meets the needs of their particular application. Next, I'll introduce another authentication method: SCRAM. Stay tuned for it!
For a long time now, we've been hearing that the Internet of Things (IoT) will transform the way we live and work by connecting everyday devices to the internet. While much of the promise of the IoT always seems to be "coming soon," the proliferation of IoT devices has already created a massive amount of data that needs to be processed, stored, and analyzed in real time. I've said for years—actually over a decade now—that if your IoT data isn't timely, accurate, and actionable, you're mostly wasting your time in collecting it. This is where the Apache Pinot® database comes in. Pinot is an open-source, distributed data store designed for real-time analytics. The high scalability, reliability, and low-latency query response times of Pinot make it a great solution for processing massive amounts of IoT data. In this post, we will explore the benefits of using Pinot in IoT applications.

IoT devices generate a massive amount of data, and traditional databases are not equipped to handle the scale and complexity. I've used a lot of solutions to collect, store, and analyze IoT data, but Pinot is specifically designed for handling high-velocity data streams in real time. With Pinot, IoT data can be ingested, processed, and analyzed in real time. In addition to real-time processing, Pinot offers scalability and reliability. As the number of IoT devices and the amount of data they generate continues to grow, it becomes critical to have a system that can scale horizontally to handle the increasing load. Pinot can scale easily by adding more nodes to the cluster, and it also provides fault tolerance, ensuring that data is not lost in the event of a node failure.

Some Background

What Is IoT?

If we're going to talk about IoT and Pinot, it's probably best to give at least a bit of context on what IoT actually is and is not. IoT, short for the Internet of Things, refers to a network of physical devices, vehicles, home appliances, and other items embedded with sensors, software, and network connectivity. These devices can communicate with each other and share data over the internet. IoT devices are diverse, ranging from smartwatches and fitness trackers to smart home devices like thermostats and security cameras to industrial machines and city infrastructure. The IoT market is expected to grow rapidly in the coming years, with estimates suggesting that there will be over 27 billion IoT devices by 2025.

The significance of IoT lies in the ability to collect and analyze data from a wide range of sources in real time. This data can be used to gain insights, optimize processes, improve decision-making, and enhance user experiences. For example, in the healthcare industry, IoT devices can monitor vital signs and other health metrics, alerting doctors or caregivers in case of abnormal readings. In the retail industry, IoT sensors can track inventory levels and customer behavior, enabling retailers to optimize store layouts and product offerings; some retail establishments are already using IoT devices to handle increases or decreases in customer traffic in stores. In the transportation industry, IoT devices can monitor vehicle performance and location, enabling fleet managers to improve efficiency and safety.
Most modern cars are already equipped with IoT devices that can monitor and report on a wide range of vehicle metrics, including fuel consumption, tire pressure, and engine performance, and almost all over-the-road trucks are already reporting vast amounts of telemetry data to their fleet managers.

What Is Apache Pinot?

Pinot is an open-source distributed data store that is purpose-built for real-time analytics. Originally developed at LinkedIn, Pinot has since become an Apache Software Foundation project and is used by a growing number of companies and organizations for a variety of use cases. Pinot is designed to handle large volumes of data in real time and provides sub-second query latencies, making it ideal for use cases that require real-time analytics, such as IoT. One of the key features of Pinot is its distributed architecture. Pinot is designed to be horizontally scalable, which means that it can handle increasing amounts of data by adding more nodes to the cluster. This distributed architecture also provides fault tolerance, which means that it can continue to function even if one or more nodes in the cluster fail. Pinot stores data in columnar format, which allows for highly efficient querying and analysis. By storing data in columns rather than rows, Pinot can quickly scan through large amounts of data and compute the aggregations or other complex calculations required for IoT data analysis. Pinot provides support for a variety of data types, including numerical, text, JSON, and geospatial data. It allows for nested queries, which can be useful for analyzing complex IoT data sets, and an emerging feature of generalized joins will make these query options even more powerful. Overall, Pinot is a powerful tool for analyzing and managing IoT data in real time.

Advantages of Using Apache Pinot With IoT

When it comes to using Pinot with IoT, there are a number of use cases and scenarios where the two technologies can be effectively combined. For example, in the industrial IoT space, Pinot can be used to analyze sensor data from manufacturing equipment to optimize performance and improve efficiency. Analyzing data from industrial equipment in real time allows for much better predictive maintenance, more efficient usage patterns, and overall better utilization of resources. If you're going to use Pinot with IoT, the first step is to identify the data sources that will be ingested into Pinot. In reality, you'll want to back up even further and analyze the types of insights and efficiencies you're looking for in your deployment. Once you've done this, you can begin to design the kind of data you'll want to collect in order to facilitate those insights. This can include data from sensors, gateways, and other IoT devices. Once the data sources have been identified, Pinot can be configured to ingest the data in real time, processing and analyzing it as it is received. Once you've begun to ingest your data into Pinot, you can query it using SQL. With your queries in place, you can start identifying patterns in sensor data that can help detect anomalies in equipment performance and track changes in environmental conditions over time.

However, using Apache Pinot with IoT naturally presents data security and privacy challenges. IoT devices are often connected to sensitive systems or contain personal data, making it important to ensure that data is properly secured and protected.
However, using Apache Pinot with IoT naturally presents data security and privacy challenges. IoT devices are often connected to sensitive systems or contain personal data, making it important to ensure that data is properly secured and protected. Organizations need to implement robust security measures to protect against unauthorized access and data breaches.

Another challenge of using Pinot with IoT is the complexity of the data sets involved. IoT data can be highly complex and heterogeneous, consisting of a variety of data types and formats. This can make it difficult to analyze and extract insights from the data. Organizations need to have a clear understanding of the data they are working with and develop effective data management and analysis strategies to overcome these challenges.

Despite these challenges, the benefits of using Pinot with IoT make it a powerful tool for organizations looking to leverage their IoT data. With its real-time analytics capabilities, distributed architecture, and support for complex queries, Pinot is well-suited for managing and analyzing the vast amounts of data generated by IoT devices. By implementing effective data management and security strategies, organizations can unlock the full potential of their IoT data and drive innovation and growth in their respective industries.

Use Cases of Apache Pinot With IoT

There are various use cases of Pinot with IoT, ranging from predictive maintenance in manufacturing to healthcare monitoring and analysis. Below are some examples of how Pinot can be used in different IoT applications:

- Predictive maintenance in manufacturing: One of the most promising applications of Pinot in IoT is predictive maintenance. By collecting and analyzing real-time data from sensors and machines, Pinot can help predict when a machine is likely to fail and schedule maintenance before a breakdown occurs. This can improve equipment uptime and reduce maintenance costs.
- Smart city monitoring and management: Smart city applications are a rapidly expanding use case for IoT. Smart city data from sensors and devices is used to manage various aspects of city infrastructure, such as traffic, parking, and waste management. Pinot can help analyze real-time data from multiple sources and provide insights that can be used to optimize city operations and improve citizen services.
- Real-time tracking and monitoring of vehicles: Another use case of Pinot in IoT is the monitoring and management of fleet vehicles. Pinot can be used to collect and analyze data from GPS trackers, vehicle sensors, and cameras to provide real-time insights into vehicle location, speed, and driving behavior (see the query sketch after this list). Combined with smart city data such as real-time traffic insights, fleet managers can optimize routes, reroute deliveries, and adjust for external factors in real time. This can help optimize fleet management and improve driver safety.
- Healthcare monitoring and analysis: In healthcare applications, data from wearables, sensors, and medical devices can be used to monitor patients and analyze health outcomes in order to improve patient care and reduce errors.
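As a companion to the fleet-tracking item above, here is a minimal sketch of a Pinot query that surfaces vehicles with repeated speeding events over the past hour. As before, the table (vehicle_telemetry) and columns (vehicleId, speedKph, eventTs) are hypothetical names used for illustration rather than part of any standard schema:

    -- Count speeding events per vehicle over the last hour
    SELECT vehicleId,
           COUNT(*) AS speeding_events,
           MAX(speedKph) AS top_speed
    FROM vehicle_telemetry
    WHERE speedKph > 100             -- placeholder speed limit
      AND eventTs > ago('PT1H')
    GROUP BY vehicleId
    ORDER BY COUNT(*) DESC
    LIMIT 10

A query like this could feed a fleet dashboard or an alerting rule; the 100 km/h threshold is just a stand-in for whatever policy the fleet actually enforces.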
Conclusion

I hope I have shown you how Pinot can provide you with a powerful toolset for managing and analyzing IoT data in real time. Its distributed architecture and fault-tolerant design make it an ideal choice for organizations looking to scale their data storage and processing capabilities as their IoT data grows. With its support for complex queries and a SQL-like query language, Pinot offers a flexible and powerful platform for analyzing complex IoT data sets. As the IoT continues to grow and evolve, Pinot is well-positioned to become an increasingly important tool for managing and analyzing IoT data in real time. By embracing this technology and developing effective strategies for managing and analyzing IoT data, organizations can stay ahead of the curve and unlock new opportunities for growth and innovation.

Try It Out Yourself

Interested in seeing whether Apache Pinot is a possible solution for you? Come join the community of users who are implementing Apache Pinot for real-time data analytics. Want to learn even more about it? Then be sure to attend the Real Time Analytics Summit in San Francisco!
With growing concern regarding data privacy and data safety today, Internet of Things (IoT) manufacturers have to up their game if they want to maintain consumer trust. This is the shared goal of the latest cybersecurity standard from the European Telecommunications Standards Institute (ETSI). Known as ETSI EN 303 645, the standard for consumer devices seeks to ensure data safety and achieve widespread manufacturer compliance. So, let’s dive deeper into this standard as more devices enter the home and workplace.

The ETSI Standard and Its Protections

It may have a long name, but it heralds an important era of device protection. ETSI EN 303 645 is a standard and method by which a certifying authority can evaluate IoT device security. Developed as an internationally applicable standard, it offers manufacturers a baseline for security rather than a comprehensive set of precise guidelines. The standard may also lay the groundwork for various future IoT cybersecurity certifications in different regions around the world.

For example, look at what’s happening in the European Union. Last September, the European Commission introduced a proposed Cyber Resilience Act, intended to protect consumers and businesses from products with inadequate security features. If passed, the legislation, a world first for connected devices, will bring mandatory cybersecurity requirements for products with digital elements throughout their whole lifecycle. The prohibition of default and weak passwords, guaranteed support of software updates, and mandatory testing for security vulnerabilities are just some of the proposals. Interestingly, these same rules are included in the ETSI standard.

IoT Needs a Cybersecurity Standard

Shockingly, a single home filled with smart devices could experience as many as 12,000 cyber attacks in a single week. While most of those cyber attacks will fail, the sheer number means some inevitably get through. The ETSI standard strives to keep those attacks out with basic security measures, many of which should already be common sense but unfortunately aren't always in place today. For example, one of the basic requirements of the ETSI standard is no universal default passwords. In other words, your fitness tracker shouldn’t have the same default password as every other fitness tracker of that brand on the market. Your smart security camera shouldn’t have a default password that anyone who owns a similar camera could exploit. It seems like that would be common sense for IoT manufacturers, but there have been plenty of breaches that occurred simply because individuals didn’t know to change the default passwords on their devices.

Another basic requirement of ETSI is allowing individuals to delete their own data. In other words, the user has control over the data a company stores about them. Again, this is pretty standard stuff in the privacy world, particularly in light of regulations like Europe’s General Data Protection Regulation (GDPR) and California’s Consumer Privacy Act (CCPA). However, this is not yet a universal requirement for IoT devices. Considering how much health- and fitness-related data many of these devices collect, consumer data privacy needs to be more of a priority. Several more rules in ETSI have to do with the software installed on such devices and how the provider manages security for that software. For example, there needs to be a system for reporting vulnerabilities, and the provider needs to keep the software up to date and ensure software integrity.
We would naturally expect these kinds of security measures for nearly any software we use, so the standard is basically just a minimum for data protection in IoT. Importantly, the ETSI standard covers pretty much everything that could be considered a smart device, including wearables, smart TVs and cameras, smart home assistants, smart appliances, and more. The standard also applies to connected gateways, hubs, and base stations. In other words, it covers the centralized access points for all of the various devices.

Why Device Creators Should Implement the Standard Today

Just how important is the security standard? Many companies are losing customers today due to a lack of consumer trust. There are so many stories of big companies like Google and Amazon failing to adequately protect user data, and IoT in particular has been in the crosshairs multiple times due to privacy concerns. An IoT manufacturer that doesn’t want to lose business, face fines and lawsuits, or damage its reputation should consider implementing the ETSI standard as a matter of course. After all, these days a given home might have as many as 16 connected devices, each an entry point into the home network. A company might have one laptop per employee but two, three, or more other smart devices per employee. And again, each smart device is a point of entry for malicious hackers. Without a comprehensive cybersecurity standard like ETSI EN 303 645, people who own unprotected IoT devices need to worry about identity theft, ransomware attacks, data loss, and much more.

How to Test and Certify Based on ETSI

Certification is fairly basic and occurs in five steps:

1. Manufacturers have to understand the 33 requirements and 35 recommendations of the ETSI standard and design devices accordingly.
2. Manufacturers also have to buy an IoT platform that has been built with the ETSI standard in mind, since the standard will fundamentally influence the way the devices are produced and how they operate within the platform.
3. Next, any IoT manufacturer trying to meet the ETSI standard has to fill out documents that provide information for device evaluation. The first document is the Implementation Conformance Statement, which shows which requirements and recommendations the IoT device does or doesn’t meet. The second is the Implementation eXtra Information for Testing, which provides design details for testing.
4. A testing provider will then evaluate and test the product based on the two documents and give a report.
5. The testing provider will provide a seal or other indication that the product is ETSI EN 303 645-compliant.

With new regulations on the horizon, device manufacturers and developers should see it as best practice to get up to speed with this standard. Better cybersecurity is important not only for consumer protection but also for brand reputation. Moreover, this standard can provide a basis for stricter device security certifications and measures in the future. Prepare today for tomorrow.
While a lot of my inspiration for blog posts comes from talking with New Relic users, it's hard to share those conversations as examples because they're so specific and often confidential. So I find myself struggling to find a generic "for instance" that's easy to understand and accessible to everyone. Which should explain why I use my home environment as the sample use case so often. Even if you don't have exactly the same gear or setup I do, it's likely you have something analogous. On top of that, if you don't have the specific element I'm discussing, many times I believe it's something you ought to consider.

That brings us to my example today: Pi-Hole. Pi-Hole acts as a first-level DNS server for your network. But what it REALLY does is make your network faster and safer by blocking requests to malicious, unsavory, or just plain obnoxious sites. If you’re using Pi-Hole, it’ll be most noticeable in the way advertisements on a webpage load.

BEFORE: pop-overs and hyperbolic ads
AFTER: no pop-overs, spam ads blocked

But under the hood, it’s even more significant.

BEFORE: 45 seconds to load
AFTER: 6 seconds to load

Load time without Pi-Hole was over 45 seconds. With it, the load time was 6 seconds. You may think there aren't many pages like this, but the truth is web pages link to these sites all the time. The statistics from my house on a typical day bear this out.

How Does the Pi-Hole API Work?

If you have Pi-Hole running, you get to the API by going to http://<your pi-hole url>/admin/api.php?summaryRaw. The result will look something like this:

{"domains_being_blocked":115897,"dns_queries_today":284514,"ads_blocked_today":17865,"ads_percentage_today":6.279129,"unique_domains":14761,"queries_forwarded":216109,"queries_cached":50540,"clients_ever_seen":38,"unique_clients":22,"dns_queries_all_types":284514,"reply_NODATA":20262,"reply_NXDOMAIN":19114,"reply_CNAME":16364,"reply_IP":87029,"privacy_level":0,"status":"enabled","gravity_last_updated":{"file_exists":true,"absolute":1567323672,"relative":{"days":"3","hours":"09","minutes":"53"}}}

Let's format the JSON data so it looks a little prettier:

{
  "domains_being_blocked": 115897,
  "dns_queries_today": 284514,
  "ads_blocked_today": 17865,
  "ads_percentage_today": 6.279129,
  "unique_domains": 14761,
  "queries_forwarded": 216109,
  "queries_cached": 50540,
  "clients_ever_seen": 38,
  "unique_clients": 22,
  "dns_queries_all_types": 284514,
  "reply_NODATA": 20262,
  "reply_NXDOMAIN": 19114,
  "reply_CNAME": 16364,
  "reply_IP": 87029,
  "privacy_level": 0,
  "status": "enabled",
  "gravity_last_updated": {
    "file_exists": true,
    "absolute": 1567323672,
    "relative": {
      "days": "3",
      "hours": "09",
      "minutes": "53"
    }
  }
}

The point is, once we have access to all that JSON-y goodness, it's almost trivial (using the Flex integration, which I discussed in this series) to collect and send into New Relic, to provide further insight into how your network is performing. At that point, you can start to include the information in graphs like the "Query Volume" and "Blocking Activity" charts described below.

Assuming you have the New Relic infrastructure agent installed on any system on the network that can access your Pi-Hole (and once again, if you need help getting that set up, check out my earlier blog post here), you have relatively few steps to get up and running. First, the YAML file would look like this (you can also find it on the New Relic Flex GitHub repo in the examples folder).
integrations:
  - name: nri-flex
    config:
      name: pihole_simple
      apis:
        - name: pihole_simple
          url: http://pi.hole/admin/api.php?summaryRaw&auth= #<your API Key Here>
          headers:
            accept: application/json
          remove_keys:
            - timestamp

Next, the NRQL you'd need to set up the two charts is as follows.

For the "Query Volume" chart:

FROM pihole_simpleSample SELECT average(dns_queries_all_replies), average(dns_queries_today), average(queries_forwarded), average(queries_cached), average(dns_queries_all_types) TIMESERIES

For the "Blocking Activity" chart:

FROM pihole_simpleSample SELECT average(ads_blocked_today), average(domains_being_blocked) TIMESERIES

This is, of course, only the start of the insights you can gain from your Pi-Hole server (and by extension, ANY device or service that has an API with endpoints that provide data); one more example query follows at the end of this post. If you find additional use cases, feel free to reach out to me in the comments below, on social media, or when you see me at a conference or meet-up.
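To give a sense of where you might go next, here is one more NRQL sketch built on fields from the API response shown earlier; it charts the percentage of queries being blocked alongside the number of active clients. The chart name is only a suggestion, and you may prefer different aggregation functions depending on how you want to slice the data.

For an "Ad-Blocking Rate" chart:

FROM pihole_simpleSample SELECT latest(ads_percentage_today) AS 'Ads blocked (%)', latest(unique_clients) AS 'Active clients' TIMESERIES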
Frank Delporte
Java Developer - Technical Writer,
CodeWriter.be
Tim Spann
Principal Developer Advocate,
Cloudera
Carsten Rhod Gregersen
Founder, CEO,
Nabto
Emily Newton
Editor-in-Chief,
Revolutionized