What the IoT Needs Is a Data Layer
IoT is largely about data, so let's take a look at some of the data-related challenges that are specific to IoT — as well as how to solve them with time series databases.
When people talk about the IoT market, it is almost always in terms of the number of devices that are, or will be, connected to the internet. 20 billion devices by 2020, <insert billions> of devices by 2025. That's a fun way to look at the growth of the IoT, and it is instructive in many ways. But unless you're a device manufacturer, or in some other way involved in the development or deployment of these devices, it's only a very small part of the story. So let's talk about the rest of the story for a minute.
The Market: Devices vs. Data
Yes, there will be billions and billions of devices connected to the internet over the next 5 to 15 years. They will be embedded in everything from kitchen toasters to automobiles to fenceposts. In truth, they already are embedded in many of these objects. But the point of each and every one of these devices is to generate data. Data that can be gathered, analyzed, and acted upon. (Remember my mantra: IoT data must be actionable!)
The overall size of the IoT market is tremendous.
The statistic shows the market size of the global Internet of Things (IoT) market from 2009 to 2014, with forecasts from 2015 to 2019. In 2013, the global IoT market had a size of 485.6 billion U.S. dollars.
That's $1.7 trillion by next year. In technical terms, that's a truckload of cash. We're going to be bringing billions of new devices online. While it's a hardware problem at many levels, it is going to be more of a data problem than anything else. Remember, each one of those billions of sensors is generating data and sending it somewhere to be stored, analyzed, and acted upon. That's going to be a LOT of data. A tsunami of data. And it's not going to subside. Ever. In fact, it is only going to grow over time.
It's All About the Data
So the real question becomes how much data, and what are you going to do with it? How are you going to manage the ingestion, analysis and action of all that data?
IDC forecasts that by 2025 the global datasphere will grow to 163 zettabytes (that is a trillion gigabytes). That's ten times the 16.1ZB of data generated in 2016. All this data will unlock unique user experiences and a new world of business opportunities.
That's a lot of data. From the same report:
By 2025, more than a quarter of data created in the global datasphere will be real time in nature, and real-time IoT data will make up more than 95% of this.
Roughly a quarter of 163 zettabytes will be real-time IoT data. A tsunami. Just think about that for a minute. About ¼ of all the data will be from a market segment that, just a few short years ago, didn't even exist. And the nature of that data is changing rapidly.
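To make the scale concrete, here is a back-of-the-envelope calculation using only the figures quoted above. Both shares are stated as "more than," so the results should be read as lower bounds. The variable and function names are mine, chosen for illustration.

```python
# Figures quoted above from the IDC forecast.
TOTAL_2025_ZB = 163.0          # forecast global datasphere in 2025
TOTAL_2016_ZB = 16.1           # data generated in 2016
REAL_TIME_SHARE = 0.25         # "more than a quarter" is real time
IOT_SHARE_OF_REAL_TIME = 0.95  # "more than 95%" of real-time data is IoT
GB_PER_ZB = 10**12             # one zettabyte is a trillion gigabytes


def real_time_iot_zb(total_zb, real_time_share, iot_share):
    """Lower-bound estimate of real-time IoT data, in zettabytes."""
    return total_zb * real_time_share * iot_share


iot_zb = real_time_iot_zb(TOTAL_2025_ZB, REAL_TIME_SHARE, IOT_SHARE_OF_REAL_TIME)
growth = TOTAL_2025_ZB / TOTAL_2016_ZB

print(f"Real-time IoT data in 2025: at least {iot_zb:.1f} ZB")
print(f"That is about {iot_zb * GB_PER_ZB:.2e} GB")
print(f"Share of the whole datasphere: {iot_zb / TOTAL_2025_ZB:.1%}")
print(f"Growth of the datasphere, 2016 to 2025: {growth:.1f}x")
```

So "a quarter" is a slight round-up of the quoted minimums: 25% of the datasphere is real time, and 95% of that slice is IoT.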
|Data Type|CAGR 2015 to 2025|
|---|---|
|All data. Includes all data in the global datasphere.|30%|
|Potentially critical. Data that may be necessary for the continued, convenient operation of users’ daily lives.|37%|
|Critical. Data known to be necessary for the expected continuity of users’ daily lives.|39%|
|Hypercritical. Data with direct and immediate impact on the health and well-being of users. (Examples include commercial air travel, medical applications, control systems, and telemetry. This category is heavy in metadata and data from embedded systems.)|54%|
A 54% compound annual growth rate for "hypercritical" data. That's the IoT data that must be acted on immediately or Bad Things™ will happen. I was just meeting with a customer last week who kept saying "If X doesn't happen within 2 minutes, the thing will catch fire." And by "catch fire" they meant something along the lines of the BP oil spill in the Gulf of Mexico. IoT data is going to be that critical. It already is in many industries.
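A compound annual growth rate is easy to underestimate, so here is a small sketch of what these rates imply over the ten years from 2015 to 2025. It assumes the standard compounding formula, growth factor = (1 + rate)^years. The 54% figure is the hypercritical rate cited above; the other three rates are taken from the table.

```python
def growth_factor(cagr, years):
    """Total multiplier after compounding `cagr` annually for `years` years."""
    return (1.0 + cagr) ** years


YEARS = 10  # 2015 to 2025

# CAGR figures from the table and the paragraph above.
rates = {
    "All data": 0.30,
    "Potentially critical": 0.37,
    "Critical": 0.39,
    "Hypercritical": 0.54,
}

factors = {name: growth_factor(rate, YEARS) for name, rate in rates.items()}

for name, factor in factors.items():
    print(f"{name}: {rates[name]:.0%} CAGR -> about {factor:.0f}x over {YEARS} years")
```

The gap between 30% and 54% looks modest on paper, but compounded over a decade it is the difference between roughly 14x and roughly 75x.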
The Global Internet of Things (IoT) Data Management Market size is expected to reach $69.7 billion by 2023, rising at a market growth of 17% CAGR during the forecast period.
Clearly there is a lot of money to be made in IoT data. I personally think the market for IoT data is being undervalued and underestimated, but that may just be a personal bias. But the underlying message here is that the really interesting growth associated with the IoT is not the devices: it's the data. After all, the devices only exist to emit data.
Only 35% of the spend will be on IoT hardware. Guess where the rest of it will be spent. Software. And the vast majority of that will be spent on data management software. And that slide is 3 years old.
The major factors driving the growth of the IoT data management market include the modernization of data warehouse architectures and rise in need for data traffic management. The increasing adoption of data encryption for IoT device security and growing data intrusion threats are also some of the factors that are driving the market growth.
The data integration segment is expected to have the second largest market size in the IoT data management market during the forecast period.
IoT data integration solutions enable organizations to securely connect, manage, analyze, and integrate real-time IoT data between connected devices and enterprise applications. Companies are focused on improving routine procedures such as outage management, predicting asset performance, and transforming energy data into new services with the help of IoT data integration. Additionally, data integration solutions integrate data for machine driven use cases, such as data mining, machine learning, and deep learning.
IoT data is special. It is time series data. And traditional data tools just aren't good enough because they weren't designed to do the job they are now being asked to do. Sure, you can make some database technology do time series data. You can even put a time series frontend on a traditional RDBMS like PostgreSQL. But that doesn't make it a time series database, and that won't make it perform like one either.
The major vendors in the IoT data management market include International Business Machines (IBM) Corporation (US), PTC Inc. (US), Teradata Corporation (US), Dell Technologies, Inc. (US), Cisco Systems, Inc. (US), SAS Institute Inc. (US), Hewlett Packard Enterprise (HPE) Company (US), Fujitsu Limited (Japan), Oracle Corporation (US), Google Inc. (US), SAP SE (Germany), LogMeIn, Inc. (US), Striim, Inc. (US), Zebra Technologies Corporation (US), LogFuze Inc. (US), InfluxData, Inc. (US), Trustwave Holdings, Inc. (US), and MuleSoft, Inc. (US).
Again, IoT data is different. Standard RDBMS systems are designed to handle the standard CRUD data model (Create, Read, Update, Delete). But that's not how IoT data works. IoT sensor data isn't updated. You don't go back and update the temperature reading from last Tuesday. You're not doing data transactions. You're streaming vast amounts of data into the system, and then handling that data. You're analyzing that data. You're displaying that data in real time. You're reacting to that data in real time. You're triggering actions based on that data.
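The difference in access pattern can be sketched in a few lines. The class below is a hypothetical, in-memory illustration (not how any particular time series database is implemented): points are only ever appended, reads are time-window queries, and triggers fire as data arrives. The class name, the 80-degree threshold, and the sample readings are all assumptions made for the example. Real systems also handle out-of-order writes, which this sketch simply rejects.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class AppendOnlySeries:
    """Hypothetical in-memory series: points are appended, never updated."""
    timestamps: List[float] = field(default_factory=list)
    values: List[float] = field(default_factory=list)
    triggers: List[Callable[[float, float], None]] = field(default_factory=list)

    def append(self, ts: float, value: float) -> None:
        # Simplification: keep the series sorted by rejecting late points.
        if self.timestamps and ts < self.timestamps[-1]:
            raise ValueError("timestamps must be non-decreasing")
        self.timestamps.append(ts)
        self.values.append(value)
        # React to each point as it streams in.
        for trigger in self.triggers:
            trigger(ts, value)

    def window(self, start: float, end: float) -> List[Tuple[float, float]]:
        # Binary search works because timestamps are always sorted.
        lo = bisect_left(self.timestamps, start)
        hi = bisect_right(self.timestamps, end)
        return list(zip(self.timestamps[lo:hi], self.values[lo:hi]))

    def mean(self, start: float, end: float) -> Optional[float]:
        points = self.window(start, end)
        if not points:
            return None
        return sum(v for _, v in points) / len(points)


alerts = []


def too_hot(ts: float, value: float) -> None:
    # Assumed threshold, for illustration only.
    if value > 80.0:
        alerts.append((ts, value))


series = AppendOnlySeries()
series.triggers.append(too_hot)
for ts, temp in [(0.0, 71.5), (1.0, 72.0), (2.0, 85.2), (3.0, 73.1)]:
    series.append(ts, temp)

print(series.window(1.0, 2.0))
print(series.mean(0.0, 1.0))
print(alerts)
```

Notice there is no update or delete method at all. The whole interface is append, query by time range, and react.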
Traditionally, most transactional systems handle data every few minutes, but with IoT it's a totally different scenario. A device or sensor that monitors humidity, temperature or any other variables, generates data that is required to be handled every millisecond by backend systems. The convergence of IoT platforms, data management, cloud, and solutions is enabling the further advancement in data analytics which signifies various intangible and tangible benefits from IoT data. The capability to sort data in one format, store it in a different format, and successively release it for further analytics, is of vital importance for industry verticals.
What the IoT desperately needs is a data layer. A data platform that is designed for high volume, high efficiency ingestion of sensor data. A data platform that incorporates the ability to display and analyze IoT sensor data in near real time. And a data platform that can be deployed across the entire architecture in order to push collection and analytics as close as possible to the point of data generation.
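As one illustration of pushing analytics toward the point of data generation, here is a minimal sketch of edge-side downsampling: raw readings are summarized into fixed time windows before anything is forwarded upstream. The function name, the window size, and the sample readings are assumptions for the example, not part of any specific product.

```python
def downsample(points, window_seconds):
    """Summarize (timestamp, value) pairs into fixed-size time windows."""
    buckets = {}
    for ts, value in points:
        # Align each timestamp to the start of its window.
        start = (ts // window_seconds) * window_seconds
        buckets.setdefault(start, []).append(value)
    return [
        {
            "window_start": start,
            "count": len(vals),
            "min": min(vals),
            "max": max(vals),
            "mean": sum(vals) / len(vals),
        }
        for start, vals in sorted(buckets.items())
    ]


# Hypothetical readings: (seconds since start, temperature).
raw = [(0, 20.0), (1, 22.0), (4, 24.0), (5, 30.0), (9, 32.0)]
summary = downsample(raw, window_seconds=5)

for row in summary:
    print(row)
```

Five raw points become two summary rows here. At sensor rates of many readings per second, that kind of reduction at the edge is what keeps the upstream ingestion volume manageable while still preserving minimum, maximum, and mean for each window.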
Published at DZone with permission of David G. Simmons, DZone MVB. See the original article here.