The combination of IoT data, streaming analytics, machine learning, and distributed computing has become more powerful and less expensive, making it possible to store and analyze more data, and more kinds of data, far faster than before.
- Health care: continuous monitoring of chronic diseases
- Smart Cities: traffic patterns and congestion management
- Manufacturing: optimization and predictive maintenance
- Transportation: optimizing routes and fuel consumption
- Automobile: smart cars
- Telecom: anomaly detection
- Retail: location-based advertising
To understand why combining IoT, streaming data, and machine learning would benefit health care, it’s important to note that chronic diseases—such as heart disease—are the major causes of sickness and health care costs in the nation. The biggest areas of spending and concern are coordination of care and prevention of hospital admissions for people with chronic conditions. Cheaper sensors that can monitor vital signs, combined with machine learning, are making it possible for doctors to rapidly apply smart medicine to their patients’ cases, and this combination has the potential to enable scalable chronic disease management with better care at lower cost.
A team of researchers at Stanford University has shown that a machine-learning model can identify heart arrhythmias from an electrocardiogram (ECG) better than an expert.
As explained by Michael Chui of the McKinsey Global Institute, “Sensors placed on the patient can now monitor vital signs remotely and continuously, giving practitioners early warning of conditions that would otherwise lead to unplanned hospitalizations and expensive emergency care. Better management of congestive heart failure alone could reduce hospitalization and treatment costs by a billion dollars annually in the United States.”
Data from monitors can be analyzed in real time, and alerts can be sent to care providers so they know instantly about changes in a patient’s condition. However, anomaly detection must find real problems while keeping false alarms low: a patient at UCSF barely survived a 39-fold overdose after alerts were ignored. The alert for the 39-fold overdose looked exactly the same as the alert for a 1% overdose, and the doctors and pharmacists, receiving too many alerts, had learned not to pay attention.
This post will discuss a streaming machine-learning application that detects anomalies in data from a heart monitor, demonstrating an example of how new digital connected health technologies could be used. We will also go over a technique that allows the alert rate to be controlled, keeping false alarms low. This application was presented at Strata San Jose by Joe Blue and Carol McDonald and is based on an example from the ebook Practical Machine Learning: A New Look At Anomaly Detection by Ted Dunning and Ellen Friedman. (Links to download the complete code are provided at the end of this post.)
What Is Machine Learning?
Machine learning uses algorithms to find patterns in data, and then uses a model that recognizes those patterns to make predictions on new data.
What Is Anomaly Detection?
Anomaly detection is an example of an unsupervised machine-learning approach.
Unsupervised algorithms do not have a label or target outcome provided in advance. These algorithms find similarities or regularities in the input data; for example, grouping similar customers based on purchase data.
Anomaly detection first establishes what normal behavior is, then compares it to the observed behavior and generates an alert if significant deviations from normal are identified. In this case, we do not start with a known set of heart conditions that we will try to classify. Instead, we look for deviations from a typical reading, and we apply that evaluation in near real time.
Building the Model With Clustering
Cardiologists have defined the waves of a normal EKG pattern; we use this repeating pattern to train a model on previous heartbeat activity and then compare subsequent observations to this model in order to evaluate anomalous behavior.
To build a model of typical heartbeat activity, we process an EKG (based on a specific patient or a group of many patients), break it into overlapping pieces that are about 1/3 second long, and then apply a clustering algorithm to group similar shapes.
Clustering algorithms discover groupings that occur in collections of data. In clustering, an algorithm classifies inputs into categories by analyzing similarities between input examples. The k-means algorithm groups observations into k clusters, assigning each observation to the cluster whose mean (center) is nearest.
In the Apache Spark code below, we:
- Parse the EKG data into a vector.
- Create a k-means object and set the parameters to define the number of clusters and the maximum number of iterations to determine the clusters.
- Train the model on the input data.
- Save the model to use later.
This results in a catalog of shapes, which can be used for reconstructing what an EKG should look like:
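The original Spark MLlib code is not reproduced here, but the steps above can be sketched in plain Python. This is a minimal stand-in, not the application's actual code: the window width (32 samples), hop size, number of clusters, and iteration count are all hypothetical values chosen for illustration.

```python
import math
import pickle
import random

def parse_windows(samples, width=32, hop=16):
    """Break the EKG signal into overlapping fixed-width vectors."""
    return [samples[i:i + width]
            for i in range(0, len(samples) - width + 1, hop)]

def kmeans(vectors, k, max_iterations=20, seed=42):
    """Minimal k-means: repeatedly assign each vector to the nearest
    mean, then recompute each mean from its assigned vectors."""
    rng = random.Random(seed)
    centers = rng.sample(vectors, k)
    for _ in range(max_iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k),
                          key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(v, centers[c])))
            clusters[nearest].append(v)
        centers = [[sum(col) / len(cluster) for col in zip(*cluster)]
                   if cluster else centers[c]
                   for c, cluster in enumerate(clusters)]
    return centers

# Train on a synthetic repeating "heartbeat" and save the shape catalog.
signal = [math.sin(2 * math.pi * t / 32) for t in range(32 * 50)]
model = kmeans(parse_windows(signal), k=4)
with open("ekg_model.pkl", "wb") as f:
    pickle.dump(model, f)
```

The saved list of cluster centers is the catalog of shapes; in the real application, Spark's KMeans plays the role of the `kmeans` function here.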
Using the Model of Normal With Streaming Data
To compare the actual EKG to the model of normal behavior, as the signal arrives, the overlapping sequence of shapes, shown in green, is matched against the catalog of shapes, shown in red, and the matches are added together to get a real-time reconstruction of what the typical EKG should look like. (To reconstruct from overlapping pieces, we multiply each piece by a sine-based windowing function.)
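The sine-based windowing works because squared-sine windows at 50% overlap sum exactly to one, so adding the weighted pieces recovers the signal. A minimal sketch (window width and hop are illustrative values, not the application's actual parameters):

```python
import math

def sine_window(width):
    """Squared-sine window; at 50% overlap these weights sum to 1."""
    return [math.sin(math.pi * (i + 0.5) / width) ** 2 for i in range(width)]

def overlap_add(pieces, width, hop):
    """Weight each reconstructed piece by the window and add them up."""
    w = sine_window(width)
    out = [0.0] * (hop * (len(pieces) - 1) + width)
    for n, piece in enumerate(pieces):
        for i, x in enumerate(piece):
            out[n * hop + i] += x * w[i]
    return out

# Overlapping pieces of a signal reconstruct its interior exactly.
signal = [math.sin(t / 5.0) for t in range(64)]
width, hop = 32, 16
pieces = [signal[i:i + width] for i in range(0, len(signal) - width + 1, hop)]
rebuilt = overlap_add(pieces, width, hop)
```

In the application, the pieces being added are not the raw signal but the nearest catalog shapes, so the overlap-add output is the reconstruction of a "typical" EKG.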
In the Apache Spark code below, we:
- Use the DStream foreachRDD method to apply processing to each RDD in this DStream.
- Parse the EKG data into a vector.
- Use the model of clustered EKG window shapes to get the cluster for this window.
- Create a message with the cluster id, the 32 actual EKG data points, and the 32 reconstructed EKG data points.
- Send the enriched message to another MapR-ES topic.
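Leaving aside the Spark Streaming and MapR-ES plumbing, the per-window logic of the steps above can be sketched in plain Python. The catalog shapes and window width below are toy values, and publishing to a topic is omitted:

```python
import json

def nearest_cluster(window, centers):
    """Return the index of the closest catalog shape (cluster center)."""
    return min(range(len(centers)),
               key=lambda c: sum((a - b) ** 2
                                 for a, b in zip(window, centers[c])))

def enrich(window, centers):
    """Build the enriched message: cluster id, the actual EKG points,
    and the reconstructed (catalog) points for this window."""
    c = nearest_cluster(window, centers)
    return json.dumps({"cluster": c,
                       "actual": window,
                       "reconstructed": centers[c]})

# Two toy catalog shapes; each incoming window is matched and enriched.
catalog = [[0.0] * 4, [1.0] * 4]
message = enrich([0.9, 1.1, 1.0, 0.8], catalog)
```

In the real application this logic runs inside `foreachRDD`, and `message` would be sent to the downstream MapR-ES topic rather than returned.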
Displaying the Actual EKG and Reconstructed Normal EKG in a Real-Time Dashboard
A real-time web application displays the actual EKG and reconstructed normal EKG, using Vert.x, a toolkit for building reactive event-driven microservices. In the web application:
- A Vert.x Kafka client consumes the enriched EKG messages from the MapR-ES topic and publishes the messages on a Vert.x event bus.
The difference between the observed and expected EKG (the green minus the red) is the reconstruction error, or residual (shown in yellow). If the residual is high, then there could be an anomaly.
The goal of anomaly detection is to find real problems while keeping false alarms low; the challenge is to know what size reconstruction error should trigger an alert.
T-digest is a technique for assessing the size of the reconstruction error as a quantile, based on the distribution of the data set. The algorithm can be added to the anomaly detection workflow so that you can set the number of alarms as a percentage of the total observations. T-digest estimates a distribution very accurately with a modest number of samples, especially at the tails (which are typically the most interesting parts). By estimating these accurately, you can set the threshold for generating an alert. For example, setting the threshold at the 99th percentile will result in approximately one alert for every hundred reconstructions, which is still a relatively large number of alerts (anomalies, by definition, should be rare). At 99.9%, an alert would be generated for roughly every one thousand reconstructions.
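The thresholding idea can be illustrated without the t-digest data structure itself. T-digest maintains a quantile estimate online with bounded memory and high accuracy at the tails; as a stand-in that shows only the alert-rate calculation, here is an exact empirical quantile over a batch of residuals:

```python
import random

def quantile_threshold(residuals, q):
    """Exact empirical quantile over a batch. T-digest approximates
    this online, with bounded memory and good accuracy at the tails."""
    ordered = sorted(residuals)
    index = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[index]

# Synthetic reconstruction errors; a 0.99 threshold yields roughly
# one alert per hundred windows.
rng = random.Random(0)
residuals = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
threshold = quantile_threshold(residuals, 0.99)
alerts = sum(1 for r in residuals if r > threshold)
```

Raising `q` to 0.999 would cut the alert rate to roughly one per thousand windows, which is the knob the post describes for keeping false alarms manageable.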
This post walked you through a streaming system to detect anomalies in data from a heart monitor, demonstrating how data from the monitor flows to a clustering-based reconstruction model that compares signals against recent history to detect irregular heartbeats in near real time. This is an example of how combining IoT, streaming data, and machine learning with visualization and alerting could enable health care professionals to improve outcomes and reduce costs.
The IoT landscape requires businesses to collect and aggregate data, and to learn across a whole population of devices, in order to understand events and situations. At the same time, according to MapR’s Jack Norris, businesses need to inject intelligence at the edge so they can react to those events very quickly. A common data fabric can help handle all of the data in the same way, control access to it, and apply intelligence with high performance and low latency.
References and More Information
- Practical Machine Learning: A New Look At Anomaly Detection
- Better Anomaly Detection with the T-Digest – Whiteboard Walkthrough
- How t-digest works and why
- Code for t-digest
- EKG anomaly detection example code
- Code for streaming application with real time web EKG display
- Applying Machine Learning to Live Patient Data: Strata Presentation
- Anomaly Detection in Telecommunications Using Complex Streaming Data | Whiteboard Walkthrough
- 14 Benefits and Forces That Are Driving The Internet of Things