Using AWS Elastic MapReduce Results with Mobile BI Analytics
So far we have covered the server-side/cloud components: how to process data with MapReduce running in the cloud or on our own Hadoop cluster. This time the focus is on the client side.
If you have a look at Mary Meeker’s latest brilliant presentation about Internet trends, one of the key messages is the significant increase in mobile 3G subscriptions and the mind-boggling sales figures for tablets (read: iPad) and smartphones (read: iPhone and Android).
The Internet is going mobile and applications are following the trend. Mobile business intelligence is no exception and has gained significant momentum recently. People are on the move with mobile devices whose performance is similar to that of a notebook from a few years ago (see the Geekbench results). It is time to put this power to work for business intelligence, too. The tools to analyse big data and publish the results to mobile devices are already available.
Amazon Elastic MapReduce
In the March post we covered Amazon Elastic MapReduce. Having talked about mobile internet subscriptions and the enormous growth in that area, this time we will analyse mobile subscriptions data from the World Bank. This data covers subscriptions to a public mobile telephone service using cellular technology, with both postpaid and prepaid subscriptions included.
Creating an AWS Elastic MapReduce job requires three steps: upload the input data to an S3 bucket/folder, run an EMR job (e.g. Hive, Pig, custom Java), and download the output from an S3 folder.
For our test, the S3 storage is laid out as follows: there is a mobilesubscriptions bucket containing two folders, one for hive-scripts and one for the mobilesubs data. The mobilesubs folder contains an input folder, where we upload the mobile_subscriptions.csv file. The output will be written as a delimited text file under the s3://mobilesubscriptions/mobilesubs/output folder.
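The folder layout above can be written down as a small helper, which makes it harder to mistype a location when the same run is repeated for another dataset. This is only a sketch: the JobLayout class and its field names are hypothetical, and it assumes the scripts folder is literally named hive-scripts. It builds URI strings only and does not talk to AWS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobLayout:
    """S3 locations for one EMR run (hypothetical helper, not an AWS API)."""

    bucket: str
    dataset: str
    scripts_folder: str = "hive-scripts"  # assumed folder name

    def _uri(self, *parts: str) -> str:
        # Join the bucket and folder names into an s3:// folder URI.
        return "s3://" + "/".join((self.bucket,) + parts) + "/"

    @property
    def scripts_uri(self) -> str:
        return self._uri(self.scripts_folder)

    @property
    def input_uri(self) -> str:
        return self._uri(self.dataset, "input")

    @property
    def output_uri(self) -> str:
        return self._uri(self.dataset, "output")


layout = JobLayout(bucket="mobilesubscriptions", dataset="mobilesubs")
print(layout.scripts_uri)
print(layout.input_uri)
print(layout.output_uri)
```

The input and output URIs printed here are the same locations that the Hive script refers to in its LOCATION and INSERT OVERWRITE DIRECTORY clauses.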
The input file looks like this:
Country Name,Country Code,2010
Afghanistan,AFG,37.80718336
Albania,ALB,141.8972543
Algeria,DZA,92.42180275
American Samoa,ASM,
Andorra,AND,77.17642345
Angola,AGO,46.68902631
....
(2010 is the most recent year for which we had data.)
The Hive script that we use for data processing is shown below. It selects the top 100 countries with the highest subscription figures:
-- External table over the uploaded CSV input.
-- Note: a header line left in the input file is read as a data row.
CREATE EXTERNAL TABLE mobilesubs (
  country_name STRING,
  country_code STRING,
  subscriptions FLOAT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION 's3://mobilesubscriptions/mobilesubs/input/';

-- Intermediate table holding the top 100 rows.
CREATE TABLE top100_mobilesubs (
  country_name STRING,
  country_code STRING,
  subscriptions FLOAT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

-- Rank by subscriptions, highest first, and keep 100 rows.
INSERT OVERWRITE TABLE top100_mobilesubs
SELECT country_name, country_code, subscriptions
FROM mobilesubs
ORDER BY subscriptions DESC
LIMIT 100;

-- Write the result to the S3 output folder.
INSERT OVERWRITE DIRECTORY 's3://mobilesubscriptions/mobilesubs/output/'
SELECT * FROM top100_mobilesubs;
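Before running the job in the cloud, it can be useful to cross-check the expected ranking locally on a small sample. The following is a minimal sketch of the same "order by subscriptions, keep the top N" logic in plain Python, not a replacement for the Hive job. The top_n function is a hypothetical name, and unlike the Hive script it skips the header line and drops rows with no 2010 value (Hive would keep those rows with a NULL value).

```python
import csv
import io


def top_n(csv_text: str, n: int = 100):
    """Return the n rows with the largest subscription value, highest first."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row (assumed to be present)
    rows = []
    for record in reader:
        # Skip blank lines and countries without a value, e.g. "American Samoa,ASM,".
        if len(record) < 3 or not record[2].strip():
            continue
        rows.append((record[0], record[1], float(record[2])))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:n]


# Sample rows copied from the input listing above.
sample = """Country Name,Country Code,2010
Afghanistan,AFG,37.80718336
Albania,ALB,141.8972543
Algeria,DZA,92.42180275
American Samoa,ASM,
Andorra,AND,77.17642345
Angola,AGO,46.68902631
"""

for name, code, value in top_n(sample, n=3):
    print(name, code, value)
```

On the six sample rows, the top three are Albania, Algeria and Andorra, in that order.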
The job that will process the data using AWS EMR is configured as follows:
Once we run the job, it will create a 000000_0 file under the s3://mobilesubscriptions/mobilesubs/output directory.
This output file needs to be downloaded and processed to replace the SOH characters (Ctrl-A, Hive's default field delimiter) with commas (,) so that it can be published with Roambi Mobile BI analytics. This can be done with any text processing tool (e.g. Notepad++).
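If you prefer to script this step rather than use a text editor, the replacement can be sketched in a few lines of Python. The soh_to_csv function is a hypothetical name, and the sample rows simply reuse values from the input listing. The sketch writes the result through the csv module instead of doing a plain search-and-replace, so that a field that itself contains a comma is quoted rather than silently split into two columns.

```python
import csv
import io

SOH = "\x01"  # Ctrl-A, the field delimiter found in the Hive output file


def soh_to_csv(text: str) -> str:
    """Convert SOH-delimited lines into comma-separated lines."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        if line:  # ignore empty lines
            writer.writerow(line.split(SOH))
    return out.getvalue()


# Two sample rows in the SOH-delimited layout.
raw = "Albania\x01ALB\x01141.8972543\nAlgeria\x01DZA\x0192.42180275\n"
print(soh_to_csv(raw), end="")
```

To apply it to the real output, read the downloaded 000000_0 file into a string, pass it to soh_to_csv, and write the returned text to the CSV file that you import into Roambi.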
Roambi Analytics has a cloud-based publishing service and a mobile BI visualizer tool available for iPad and iPhone. The application can be installed on mobile devices from the Apple App Store for free.
The Roambi publisher has three versions: Roambi Lite, which is free and has limited functionality (support for CSV, Excel and HTML formats), Roambi Pro (with additional Google Docs and salesforce.com support) and Roambi Enterprise (with support for Oracle, SAP BusinessObjects, SAS, Microsoft, IBM Cognos, etc.).
This demo is based on Roambi Lite. First you need to create an account or log in using a Google Account (OpenID) at https://secure.roambi.com:
Then click on Publish:
Select the appropriate view (e.g. CataList) and import the data. This will be the mobilesubs_result.csv file, that is, the output that we downloaded from the s3://mobilesubscriptions/mobilesubs/output folder and prepared for Roambi Analytics as described above.
You can refine the data if you wish and then publish it:
The file will be pushed to the mobile devices (iPad or iPhone). With Roambi Lite, for example, you can push it to your own device.
Roambi Analytics Visualizer
On the device you can retrieve the result using Roambi Analytics Visualizer. You can email the report or take a screenshot of it, add it to your favorites, etc.
Email sent from Roambi Analytics Visualizer:
As you can see, mobile BI and big data in the cloud can free users from being desktop slaves: no need for datacenter infrastructure and no need for a traditional desktop, just the joy of mobility spiced with the power of cloud computing.