Anomaly Detection in Mobile Sensor Data Using ML

This fascinating look at anomaly detection uses IoT sensors to generate data and machine learning to find unusual patterns in that data.

This post is an excerpt from our solution tutorial – Gather, visualize, and analyze IoT data. The tutorial walks you through setting up an IoT device, gathering mobile sensor data in the Watson IoT Platform, exploring the data and creating visualizations, and then using advanced machine learning services to analyze the data and detect anomalies in the historical data.

So, What Is Anomaly Detection?

Anomaly detection is a technique for identifying unusual patterns, called outliers, that do not conform to expected behavior. It has many applications, from intrusion detection (identifying strange patterns in network traffic that could signal a hack) to health monitoring (spotting a malignant tumor in an MRI scan), and from fraud detection in credit card transactions to fault detection in operating environments.

In our day-to-day life, knowingly or unknowingly, we carry an IoT device: our mobile phone, whose built-in sensors provide accelerometer and gyroscope data. How about saving this sensor data somewhere and detecting anomalies in it?

That sounds like a cool idea. How can we achieve this? Do I need to code an app and ask users to download it from the store? Not required. A simple Node.js application running on a mobile browser will provide us with the sensor data.


This tutorial uses the following IBM Cloud products:

Here’s the flow, shown as an architecture diagram.

[Figure: Architecture diagram]

So, you will create a Node.js application, run it in a browser, and store the accelerometer and gyroscope data in Cloudant NoSQL DB. So, how do you detect anomalies?

Here’s where IBM Data Science Experience comes in handy. You will use the Jupyter Notebook that is available in the IBM Data Science Experience service to load your historical data and detect anomalies using z-score. You will start by creating a new project and then import the Jupyter notebook (.ipynb) through a URL.

Anomaly detection will be performed using z-score. A z-score is a standard score that indicates how many standard deviations an element is from the mean. It can be calculated with the following formula: z = (X - µ) / σ, where z is the z-score, X is the value of the element, µ is the population mean, and σ is the population standard deviation.
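As a rough illustration of the formula above, here is a minimal sketch in plain Python. It is not the notebook's code: the function names, the sample readings, and the threshold of 2 standard deviations are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return the z-score of each value: (X - mean) / population std dev."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All readings are identical, so nothing deviates from the mean.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


def find_anomalies(values, threshold=2.0):
    """Return (index, value, z) for readings whose |z| exceeds the threshold.

    The threshold is a hypothetical default, not a value from the tutorial.
    """
    scored = zip(values, z_scores(values))
    return [(i, x, z) for i, (x, z) in enumerate(scored) if abs(z) > threshold]


# Made-up sensor readings with one obvious spike at index 6.
readings = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 9.5, 0.2, 0.3, 0.1]
anomalies = find_anomalies(readings)

for index, value, z in anomalies:
    print(f"reading {index} = {value} (z = {z:.2f})")
```

With a small sample, a single extreme value also inflates the standard deviation, so a lower threshold may be needed than the 3 that is often used on large data sets.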

Create a New Project

  1. Go to the IBM Cloud Catalog and select Data Science Experience.
  2. Create the service and launch its dashboard by clicking Get Started.
  3. Create a New Project and enter Detect Anomaly as the Name.
  4. Create and select Object Storage and Spark services. Refresh.
  5. Click Create.

Connection to Cloudant DB for Data

  1. Click on Assets > + Add to Project > Connection.
  2. Select the iot-db Cloudant DB where the device data is stored.
  3. Check the Credentials, then click Create.

Create a Jupyter (.ipynb) Notebook

  1. Click New notebook > From URL.
  2. Enter Anomaly-detection-sample for the Name.
  3. Enter https://raw.githubusercontent.com/IBM-Cloud/iot-device-phone-simulator/master/anomaly-detection/Anomaly-detection-DSX.ipynb in the URL.
  4. Click Create Notebook. Check that the notebook is created with metadata and code. The recommended version for this notebook is Python 2 with Spark 2.1. To update it, select Kernel > Change kernel. To trust the notebook, select File > Trust Notebook.

Run the Notebook and Detect Anomalies

  1. Select the cell that starts with !pip install --upgrade pixiedust, and then click Run or press Ctrl + Enter to execute the code.
  2. When the installation is complete, restart the Spark kernel by clicking the Restart Kernel icon.
  3. In the next code cell, import your Cloudant credentials by completing the following steps:
    • Click the data icon.
    • Select the Connections tab.
    • Click Insert to code. A dictionary called credentials_1 is created with your Cloudant credentials. If the dictionary has a different name, rename it to credentials_1, because that is the name the remaining cells require for the notebook code to run.
  4. In the cell with the database name (dbName), enter the name of the Cloudant database that is the source of data, for example, iotp_yourWatsonIoTPorgId_DBName_Year-month-day. To visualize data of different devices, change the values of deviceId and deviceType accordingly. You can find the exact database name by navigating to the iot-db Cloudant DB instance you created earlier > Launch Dashboard.
  5. Save the notebook and execute each code cell one after another, or run all (Cell > Run All). By the end of the notebook, you should see anomalies for the device movement data (oa, ob, and og). To change the time interval of interest to a desired time of day, look for the start and end values.
  6. Along with anomaly detection, the key takeaways from this section are:
    • Using Spark to prepare the data for visualization.
    • Using Pandas for data visualization.
    • Bar charts and histograms for device data.
    • Correlation between two sensors through a correlation matrix.
    • A box plot for each device sensor, produced with the Pandas plot function.
    • Density plots through kernel density estimation (KDE).
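
To make step 5 more concrete, the sketch below restricts records to a start and end window and then applies the z-score check to each movement field (oa, ob, and og). It is a plain-Python approximation under stated assumptions, not the notebook's Spark code: the timestamp field name, its ISO format, the sample records, and the threshold of 2 are all hypothetical.

```python
from datetime import datetime
from statistics import mean, pstdev

AXES = ("oa", "ob", "og")  # movement fields named in the tutorial


def in_window(record, start, end):
    """True if the record's (assumed) ISO timestamp falls within [start, end]."""
    ts = datetime.fromisoformat(record["timestamp"])
    return start <= ts <= end


def axis_anomalies(records, axis, threshold=2.0):
    """Return the records whose value on `axis` has |z-score| above threshold."""
    values = [r[axis] for r in records]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # constant signal, nothing to flag
    return [r for r in records if abs((r[axis] - mu) / sigma) > threshold]


# Made-up readings, one per minute, with a sudden tilt on ob at 09:04.
oa_values = [10.0, 10.5, 9.8, 10.2, 10.1, 9.9, 10.3, 10.0, 9.7, 10.4]
records = [
    {"timestamp": f"2018-03-01T09:{minute:02d}:00", "oa": oa, "ob": 1.0, "og": 0.5}
    for minute, oa in enumerate(oa_values)
]
records[4]["ob"] = 45.0
# A reading outside the time interval of interest; the window drops it.
records.append({"timestamp": "2018-03-01T18:00:00", "oa": 300.0, "ob": 1.0, "og": 0.5})

start = datetime(2018, 3, 1, 9, 0)
end = datetime(2018, 3, 1, 10, 0)
window = [r for r in records if in_window(r, start, end)]

results = {axis: axis_anomalies(window, axis) for axis in AXES}
for axis, flagged in results.items():
    print(axis, [r["timestamp"] for r in flagged])
```

Filtering by time first matters here: the out-of-window reading would otherwise shift the mean and standard deviation for oa and hide or distort what counts as unusual within the chosen interval.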

Published at DZone with permission of
