IoT: Augmenting GPS Data with Weather (Part 2)

You can add weather data to GPS streams by geocoding latitude and longitude values to City and State variables and retrieving reports from Weather Underground.

If you haven't already, take a look at Part 1, where we set up our system, before you continue. With Hortonworks DataFlow's flow processing, I can augment any stream of data in motion. The GPS data stream from my Raspberry Pi Zero W (RPiZW) returns latitude and longitude, which are valuable inputs to many different APIs. One that seemed relevant to me was the current weather conditions at the device's location. You could also use predictive analytics to estimate where you will be, then check the current conditions or the forecast for that place, depending on how long it will take you to get there.

One use case that works for a moving device is checking the current weather conditions where you are. Weather Underground has a REST API call that converts latitude and longitude to city and state (other APIs do this as well; feel free to post them). Once I have the city and state, I use them to get the current weather conditions where the device is.
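As a rough illustration of this two-step lookup outside NiFi, the Python sketch below extracts the city and state from a geolookup response and builds the follow-up conditions URL. The base URL, the path layout, the API key placeholder, and the sample response are all assumptions made for illustration; check the weather service's documentation for the real values. The sketch builds URLs only and makes no network calls.

```python
import json
from urllib.parse import quote

# Assumed base URL and path layout; verify against the weather service's docs.
BASE_URL = "http://api.wunderground.com/api"


def geolookup_url(api_key, latitude, longitude):
    """Build the URL that resolves a coordinate pair to a location."""
    return f"{BASE_URL}/{api_key}/geolookup/q/{latitude},{longitude}.json"


def city_and_state(geolookup_body):
    """Read the fields the flow extracts with $.location.city and $.location.state."""
    location = json.loads(geolookup_body)["location"]
    return location["city"], location["state"]


def conditions_url(api_key, city, state):
    """Build the current-conditions URL, percent-encoding spaces and other unsafe characters."""
    return (
        f"{BASE_URL}/{api_key}/conditions/q/"
        f"{quote(state, safe='')}/{quote(city, safe='')}.json"
    )


# Hypothetical response body, trimmed to the two fields used here.
sample = '{"location": {"city": "East Windsor", "state": "NJ"}}'
city, state = city_and_state(sample)
print(conditions_url("YOUR_API_KEY", city, state))
```

Percent-encoding the city matters because names such as "East Windsor" contain a space, which cannot appear as-is in a GET URL.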

Apache NiFi Flow

Augment Live Data

EvaluateJsonPath: Create two variables (latitude and longitude) by extracting the JSON fields $.latitude and $.longitude from the flow file sent via MQTT.

EvaluateJsonPath: Create two variables (City and State) by extracting the JSON fields $.location.city and $.location.state from the flow file returned by the REST call.

InvokeHTTP: Call the weather API with City and State. The city can contain spaces and perhaps other characters that are unfriendly to an HTTP GET, so it must be URL-encoded before it goes into the request URL.

EvaluateJsonPath: Extract all the weather fields I am interested in, such as $.current_observation.dewpoint_string.

UpdateAttribute: Avro schemas do not accept a few of the field names. For example, I rename ${'Last-Modified'} to lastModified.

AttributesToJSON: Convert just the fields I want to a new flow file. Set the Destination property to flowfile-content.

InferAvroSchema: Let NiFi decide what this Avro record should look like. I will add a Part 3 using the schema registry and Apache NiFi 1.2, which will use a schema lookup.

ConvertJSONtoAvro: Convert the JSON to Avro, with Snappy compression.

MergeContent: Merge the records, using Avro as the merge format.

ConvertAvroToORC: ORC is the optimal choice for Hive LLAP and Spark SQL to query the data.

PutHDFS: Point the Hadoop Configuration Resources property to a configuration file on the NiFi host's file system, such as /etc/hadoop/conf/core-site.xml. I like to set conflict resolution to replace. Set your target directory, and make sure that you have created it and that NiFi has permission to write there.

The quickest way to create the directory is to log in to a machine with an HDFS client and run:

su hdfs
hdfs dfs -mkdir -p /rpwz/gpsweather   
hdfs dfs -chmod -R 777 /rpwz
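The UpdateAttribute and AttributesToJSON steps in the flow above can also be sketched outside NiFi. Avro names must start with a letter or underscore and contain only letters, digits, and underscores, which is why Last-Modified has to be renamed. The helper names below (to_avro_name, attributes_to_json) and the sample attribute values are hypothetical, and the camel-case rule is just one possible convention, not the exact one used in the flow.

```python
import json
import re

# Avro names: a letter or underscore first, then letters, digits, or underscores.
AVRO_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def to_avro_name(name):
    """Turn a name such as 'Last-Modified' into an Avro-safe 'lastModified'."""
    parts = [part for part in re.split(r"[^A-Za-z0-9_]+", name) if part]
    if not parts:
        raise ValueError(f"no usable characters in {name!r}")
    head = parts[0][0].lower() + parts[0][1:]
    tail = "".join(part[0].upper() + part[1:] for part in parts[1:])
    candidate = head + tail
    # A leading digit is still invalid, so prefix an underscore in that case.
    return candidate if AVRO_NAME.match(candidate) else "_" + candidate


def attributes_to_json(attributes, wanted):
    """Keep only the wanted attributes, rename them, and serialize to JSON."""
    record = {to_avro_name(key): value
              for key, value in attributes.items() if key in wanted}
    return json.dumps(record, sort_keys=True)


# Made-up attribute values for illustration.
attributes = {"Last-Modified": "Mon, 01 May 2017", "temp_f": "61.2", "mqtt.topic": "gps"}
print(attributes_to_json(attributes, {"Last-Modified", "temp_f"}))
```

Names that are already valid, such as temp_f, pass through unchanged, so the renaming only touches the handful of problem fields.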

Enhancement Ideas

Limit the number of calls to Weather Underground, and try other weather services such as NOAA and Weather Source. You may want to try other nations' weather APIs as well.
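One possible way to limit calls is to reuse a result for nearby points until it is a few minutes old. The sketch below is only an illustration of that idea: WeatherCache, fake_fetch, the two-decimal rounding (roughly a kilometre of latitude), and the 15-minute lifetime are all assumptions rather than part of the original flow.

```python
import time


class WeatherCache:
    """Reuse a weather result for nearby points until it is older than ttl_seconds."""

    def __init__(self, fetch, ttl_seconds=900, precision=2, clock=time.monotonic):
        self.fetch = fetch              # hypothetical callable: (lat, lon) -> weather dict
        self.ttl_seconds = ttl_seconds  # 900 seconds = 15 minutes
        self.precision = precision      # decimal places kept when grouping points
        self.clock = clock              # injectable so tests can control time
        self._entries = {}

    def get(self, latitude, longitude):
        key = (round(latitude, self.precision), round(longitude, self.precision))
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]             # still fresh: no API call is made
        value = self.fetch(*key)
        self._entries[key] = (now, value)
        return value


calls = []


def fake_fetch(latitude, longitude):
    # Stand-in for the real weather request; records each call it receives.
    calls.append((latitude, longitude))
    return {"weather": "Clear"}


cache = WeatherCache(fake_fetch)
cache.get(40.2712, -74.5301)
cache.get(40.2714, -74.5303)   # rounds to the same key, so it is served from the cache
print(len(calls))              # prints 1
```

A coarser precision or a longer lifetime trades weather freshness for fewer API calls.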

Create an External Hive Table and Display the Data (Zeppelin Can Do That)

Zeppelin lets me create the table with the DDL produced by the stream.

Then I can query it easily.

CREATE EXTERNAL TABLE IF NOT EXISTS gpsweather (observation_time STRING, dewpoint_string STRING, city STRING, windchill_f STRING, windchill_c STRING, latitude STRING, precip_today_string STRING, temp_c STRING, temp_f STRING, windchill_string STRING, wind_mph STRING, wind_degrees STRING, temperature_string STRING, weather STRING, feelslike_string STRING, wind_string STRING, heat_index_string STRING, state STRING, lastModified STRING, relative_humidity STRING, pressure_mb STRING, visibility_mi STRING, longitude STRING) 
STORED AS ORC LOCATION '/rpwz/gpsweather';

select * from gpsweather;

Note: Weather Underground has a free API that allows only a limited number of calls. You will easily blast past that limit if you are tracking live objects, even if you decide to enrich only when points change or at 15-minute intervals. For testing, you may want to store Weather Underground results as JSON files in a directory and use GetFile to stand in for those calls. The same goes for data from your devices: download the files from Data Provenance and use them as stand-ins for live data in integration testing, especially when you are on a plane or somewhere else where you cannot reach your device.
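The same stand-in idea works outside NiFi. As a minimal sketch, assuming the saved responses are .json files in one directory, the hypothetical helper replay_saved_responses below reads them back in file-name order in place of live calls. The file name and the trimmed sample response are made up for the demo.

```python
import json
import tempfile
from pathlib import Path


def replay_saved_responses(directory):
    """Yield (file name, parsed JSON) for each saved response, in file-name order."""
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            yield path.name, json.load(handle)


# Demo with a throwaway directory and a made-up, trimmed response.
with tempfile.TemporaryDirectory() as folder:
    saved = {"current_observation": {"dewpoint_string": "54 F (12 C)"}}
    (Path(folder) / "weather-001.json").write_text(json.dumps(saved), encoding="utf-8")
    for name, body in replay_saved_responses(folder):
        print(name, body["current_observation"]["dewpoint_string"])
```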

Published at DZone with permission of
