Monitoring Real-Time Uber Data Using Apache APIs, Part 3: Real-Time Dashboard Using Vert.x

The third post in this series on real-time analysis and visualization of Uber data discusses building a real-time dashboard to visualize the cluster data on a Google map.

This is Part 3 of a 4-part series. Be sure to check out Part 1 and Part 2 first! 

According to Gartner, smart cities will be using about 1.39 billion connected cars, IoT sensors, and devices by 2020. The analysis of location and behavior patterns within cities will allow optimization of traffic, better planning decisions, and smarter advertising. Improving cities is one of the 10 major areas in which big data is currently being used to excellent advantage. For example, the analysis of GPS car data can allow cities to optimize traffic flows based on real-time traffic information. Telecom companies are using mobile phone location data to provide insights by identifying and predicting the location activity trends and patterns of a population in a large metropolitan area. Machine learning is being applied to geolocation data in telecom, travel, marketing, and manufacturing to identify patterns and trends for services such as recommendations, anomaly detection, and fraud detection.

This is the third in a series of blogs discussing the architecture of an end-to-end application that combines streaming data with machine learning to do real-time analysis and visualization of where and when Uber cars are clustered in order to predict and visualize the most popular Uber locations.

Picture 1

Handling huge amounts of real-time data puts high demands on application architecture. Uber and others have moved from a monolithic to an event-driven microservices architecture because they needed to scale. In this post, we will go over the implementation of a real-time web application using Vert.x, a toolkit for building reactive event-driven microservices.

The first part of this series discusses creating a machine learning model using the Apache Spark K-means algorithm to cluster Uber data by location.

Picture 2

Clustering algorithms group items into categories by analyzing similarities between input examples and discovering groupings that occur in collections of data. Clustering algorithms can be used for:

  • Customer segmentation.
  • Finding trends and detecting anomalies.
  • Grouping search results or similar articles.

The K-means algorithm groups observations into K clusters in which each observation belongs to the cluster whose center (the mean of its members) is nearest. Below, the cluster centers output by the model, returned from the analysis of the Uber data (with K=10), are displayed on a Google map:

Picture 3
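
To make the "nearest cluster center" rule concrete, here is a minimal Java sketch of the assignment step only. It is not the Spark code from Part 1; the class name, the method name, and the second center in the demo are hypothetical, and plain squared Euclidean distance on latitude and longitude is assumed.

```java
// Illustrative sketch only, not the series' Spark K-means code.
// Assumes squared Euclidean distance on raw (lat, lon) values.
class NearestCenter {

    /** Returns the index of the center closest to the point (lat, lon). */
    static int nearest(double lat, double lon, double[][] centers) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centers.length; i++) {
            double dLat = lat - centers[i][0];
            double dLon = lon - centers[i][1];
            double dist = dLat * dLat + dLon * dLon;
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical centers as {lat, lon} pairs, for illustration only.
        double[][] centers = { {40.6746, -73.9867}, {40.7580, -73.9855} };
        System.out.println(nearest(40.6858, -73.9923, centers));
    }
}
```

K-means training alternates this assignment step with recomputing each center as the mean of its assigned points; only the assignment is shown here.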

The second post discusses using the saved K-means model with streaming data to do real-time analysis of where and when Uber cars are clustered.

Picture 4

This third post discusses building a real-time dashboard to visualize the cluster data on a Google map. The following figure depicts the data pipeline:

  • Uber trip data is published to a MapR Streams topic using the Kafka API.
  • A Spark streaming application, subscribed to the first topic, enriches the event with the cluster location and publishes the results in JSON format to another topic.
  • A Vert.x web application, subscribed to the second topic, displays the Uber trip clusters in a heat map.

Picture 5

The Vert.x Toolkit and Web Application Architecture

The Vert.x toolkit is event-driven, using an event bus to distribute events to work handler services called verticles. Vert.x, similar to Node.js, employs a non-blocking event-loop model; each verticle's handlers are called on a single event-loop thread. The Vert.x SockJS event bus bridge allows web applications to communicate bi-directionally with the Vert.x event bus using WebSockets, which allows you to build real-time web applications with server push functionality.

Picture 6

Looking in more detail at the Uber dashboard application architecture:

  • A Vert.x Kafka client verticle consumes messages from the MapR Streams topic and publishes the messages on a Vert.x event bus.
  • A JavaScript browser client subscribes to the Vert.x event bus using SockJS and displays the Uber trip locations on a Google Heatmap.

Picture 7

The Dashboard Vert.x Service

In the Vert.x service code snippet below, we:

  • Create a vertx instance, which provides access to the Vert.x core API.
  • Create a Router object, which routes HTTP request URLs to handlers.
  • Create a BridgeOptions object and specify that messages with the address “dashboard” should pass through the event bus bridge.
  • Route paths that match /eventbus/* to be associated with an event bus bridge SockJSHandler, which extends the server-side Vert.x event bus into client-side JavaScript.
  • Create an HttpServer object, an HTTP server implementation.
  • Tell the server to listen on the configured port for incoming requests.

Picture 8

In the code snippet below, messages are consumed from the MapR Streams Uber topic and published to the Vert.x event bus address “dashboard.” Messages will be delivered to all handlers subscribed to this address.

Picture 9
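
The publish semantics described above (every handler registered on an address receives each published message) can be illustrated with a toy in-memory bus. This is a sketch in plain Java and is not the Vert.x EventBus API; the class and method names are hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Toy stand-in for publish/subscribe semantics; NOT the Vert.x EventBus API.
class ToyEventBus {
    private final Map<String, List<Consumer<String>>> handlers = new HashMap<>();

    /** Registers a handler for the given address. */
    void consumer(String address, Consumer<String> handler) {
        handlers.computeIfAbsent(address, k -> new ArrayList<>()).add(handler);
    }

    /** Delivers the message to every handler on the address; returns how many received it. */
    int publish(String address, String message) {
        List<Consumer<String>> subscribed = handlers.getOrDefault(address, new ArrayList<>());
        for (Consumer<String> handler : subscribed) {
            handler.accept(message);
        }
        return subscribed.size();
    }

    public static void main(String[] args) {
        ToyEventBus bus = new ToyEventBus();
        List<String> received = new ArrayList<>();
        bus.consumer("dashboard", m -> received.add("first:" + m));
        bus.consumer("dashboard", m -> received.add("second:" + m));
        bus.publish("dashboard", "{\"cid\":18}");
        System.out.println(received);
    }
}
```

In the real application, the registered handlers are browser clients connected through the SockJS bridge rather than in-process callbacks.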

The Dashboard Vert.x HTML5 JavaScript Client

The client uses a Google Maps Heatmap Layer to visually depict the intensity of the Uber trip cluster locations on a Manhattan Google map. With the Google Heatmap, areas of higher intensity will be colored red, and areas of lower intensity will appear green. The dashboard app uses Google Maps markers to mark cluster centers.

Picture 10

For learning purposes, this example lives entirely in a single index.html file. The necessary JavaScript includes for Vert.x, SockJS, jQuery, and Google Maps are shown below; note that for Google Maps, you will need your own API key.

Picture 11

Creating the Map

For the map to display on the web page, we first reserve a spot for it by creating a named div element with div id="map". Then, in the initMap function, which is called when the page is loaded, we create a Google Maps instance, specifying a reference to the div element via the document.getElementById() method. Next, we create a HeatmapLayer object with empty geographic data in the form of an array. Later, we will update this data with geographic locations from the server.

Picture 12

Creating the Event Bus

Below, we create an instance of the vertx.EventBus object, specifying the URI location to connect to. Then, we add an onopen listener, which registers an event bus handler for the address “dashboard”. This handler will receive all messages published to the “dashboard” address.

Picture 13

The messages received from the server application are in JSON format and contain the following for each trip location: the cluster center ID (cid), the date and time (dt), the latitude and longitude of the trip (lat, lon), the base for the trip (base), and the latitude and longitude of the cluster center (clat, clon). An example is shown below:

{"cid":18, "dt":"2014-08-01 08:51:00", "lat":40.6858, "lon":-73.9923, "base":"B02682", "clat":40.67462874550765, "clon":-73.98667466026531}
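
As a rough illustration of reading the numeric fields out of a flat message like this one, the following Java sketch uses a regular expression from the standard library. It is an assumption-laden shortcut for illustration, not a general JSON parser and not how the dashboard client (which is JavaScript) parses messages; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: extracts numeric fields from a flat, single-level message.
// A real application should use a proper JSON library instead.
class TripFields {

    /** Returns the numeric value stored under the given key, or throws if it is absent. */
    static double number(String json, String key) {
        Pattern p = Pattern.compile(
            "\"" + Pattern.quote(key) + "\"\\s*:\\s*(-?\\d+(?:\\.\\d+)?)");
        Matcher m = p.matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("missing numeric field: " + key);
        }
        return Double.parseDouble(m.group(1));
    }

    public static void main(String[] args) {
        String msg = "{\"cid\":18, \"dt\":\"2014-08-01 08:51:00\", \"lat\":40.6858, "
            + "\"lon\":-73.9923, \"base\":\"B02682\", "
            + "\"clat\":40.67462874550765, \"clon\":-73.98667466026531}";
        System.out.println(number(msg, "lat") + ", " + number(msg, "lon"));
    }
}
```

Because the key is matched together with its surrounding quotes, looking up lat does not accidentally match clat.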

In the event bus handler code below, we:

  • Parse the JSON message.
  • Add the trip’s longitude and latitude points to the points array, and then set this data on the Google Heatmap Layer object.
  • Add a marker to the map for this cluster center location if one has not already been added.
  • Increment the count of points received for this cluster center.

Picture 14
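
The bookkeeping in the steps above can be sketched independently of the browser. The Java class below is a hypothetical model of the handler's state, assuming the message has already been parsed; the actual client is JavaScript and updates Google Maps objects rather than plain collections.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the handler's bookkeeping only; names are hypothetical.
class DashboardState {
    final List<double[]> points = new ArrayList<>();        // heat map points as {lat, lon}
    final Map<Integer, double[]> markers = new HashMap<>(); // cluster id -> {clat, clon}
    final Map<Integer, Integer> counts = new HashMap<>();   // cluster id -> trips received

    /** Applies one already-parsed trip message to the dashboard state. */
    void onTrip(int cid, double lat, double lon, double clat, double clon) {
        points.add(new double[] {lat, lon});                  // new heat map point
        markers.putIfAbsent(cid, new double[] {clat, clon});  // marker added once per cluster
        counts.merge(cid, 1, Integer::sum);                   // increment the cluster's count
    }

    public static void main(String[] args) {
        DashboardState state = new DashboardState();
        state.onTrip(18, 40.6858, -73.9923, 40.67462874550765, -73.98667466026531);
        state.onTrip(18, 40.6900, -73.9900, 40.67462874550765, -73.98667466026531);
        System.out.println(state.points.size() + " points, "
            + state.markers.size() + " marker(s), count for cluster 18 = "
            + state.counts.get(18));
    }
}
```

Keeping markers in a map keyed by cluster ID is one simple way to ensure a marker is added only the first time a cluster is seen.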

All of the components of the end-to-end application architecture discussed in this blog series can run on the same cluster with the MapR Converged Data Platform.

Downloading and Running the Example

Vert.x does not require an application server; it’s easy to run as a regular Java application with a fat JAR file containing the dependencies, as shown below:

$ java -jar ./target/mapr-streams-vertx-uberdashboard-1.0-SNAPSHOT-fat.jar web 8080 /apps/iot_stream:uberp

Published at DZone with permission of Carol McDonald , DZone MVB. See the original article here.
