
Crime Analysis Using H2O Autoencoders (Part 2)


Learn how to deploy a predictive analytics machine learning model by converting it into POJO/MOJO objects with the help of H2O functions.


This is the second part of a two-part series on crime analysis using H2O autoencoders. In Part 1, we discussed building the analytical pipeline and applying deep learning to predict the arrest status of crimes happening in Los Angeles, California. Our machine learning model can be deployed as a JAR file using POJO and MOJO objects. H2O-generated POJO and MOJO models can be easily embedded into a Java environment using the autogenerated h2o-genmodel.jar file.

In this article, we discuss deploying the H2O autoencoder model into a real-time production environment by converting it into a POJO object using H2O functions. As autoencoders don't support MOJO models, the POJO model is used in this article.
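Once downloaded, the POJO is used from Java through the h2o-genmodel easy-predict API. Below is a minimal sketch of embedding it in a plain Java application; it assumes the generated model class is named crime_model_auto (the name used later in this article), and the column names in the input row are illustrative placeholders rather than the actual dataset columns:

import hex.genmodel.GenModel;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.RowData;

public class EmbedPojoSketch {
    public static void main(String[] args) throws Exception {
        // Load the generated POJO class by name and wrap it with the easy-predict API
        // shipped in h2o-genmodel.jar.
        GenModel rawModel = (GenModel) Class.forName("crime_model_auto")
                .getDeclaredConstructor().newInstance();
        EasyPredictModelWrapper model = new EasyPredictModelWrapper(rawModel);

        // Build one input row; the column names are illustrative and must match
        // the columns the model was trained on.
        RowData row = new RowData();
        row.put("area_name", "Hollywood");
        row.put("crime_code_description", "VEHICLE - STOLEN");

        // The wrapper exposes typed predict methods (predictBinomial, predictAutoEncoder, ...)
        // that are chosen according to the model category reported here.
        System.out.println("Model category: " + rawModel.getModelCategory());
    }
}

Compiling such a class requires only the generated Java file and h2o-genmodel.jar on the classpath, which is the same setup used in the sections below.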

Dataset Description

A Los Angeles crime dataset from 2016-2017 with 224K records and 27 attributes is used as the source file. For a detailed description, refer to our previous article.

Sample deployment model:

[Image: sample deployment model]

Use Case

Deploy the H2O autoencoders model into the production environment.

Synopsis:

  • Generate JAR file for H2O autoencoder model.
  • Run model.
  • Deploy model into production environment.
  • Implement machine learning model (Java Spring):
    • Set up model execution project.
    • Set up model deployment project.
  • Perform overall production deployment.

Generating JAR File for H2O Autoencoder Model

The autoencoders model created from our previous analysis is as follows:

[Image: summary of the autoencoder model from the previous analysis]

To generate the JAR file, perform the following:

    • Download the autoencoders model using the h2o.download_pojo() function in the H2O package.
    • Execute the below syntax to create a Java file along with the JAR file:

[Image: h2o.download_pojo() syntax]

  • Download the Java file along with the JAR file and open them using a Java Decompiler, as shown in the below diagram:

[Image: downloaded Java file and JAR file opened in a Java Decompiler]


Note: If the downloaded dependency JAR file does not contain logic to implement the autoencoder model, an UnsupportedOperationException error will be thrown similar to the one shown in the below diagram:

[Image: UnsupportedOperationException thrown for the autoencoder model]

The error can be viewed in the PredictCsv.java file, as shown in the below diagram:

[Image: PredictCsv.java showing the unsupported autoencoder case]

Similarly, you can view the other prediction types, such as BinomialModelPrediction, MultinomialModelPrediction, and so on.
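These are the prediction result classes of the easy-predict API in h2o-genmodel: a supervised binomial classifier returns a BinomialModelPrediction with a label and class probabilities, while an autoencoder returns an AutoEncoderModelPrediction holding the reconstructed row. The sketch below assumes that API; field names may differ slightly between h2o-genmodel versions:

import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.prediction.AutoEncoderModelPrediction;
import hex.genmodel.easy.prediction.BinomialModelPrediction;

import java.util.Arrays;

public class PredictionTypesSketch {

    // Supervised binomial model, e.g. predicting the arrest status (yes/no).
    static void scoreBinomial(EasyPredictModelWrapper model, RowData row) throws Exception {
        BinomialModelPrediction p = model.predictBinomial(row);
        System.out.println("Predicted label: " + p.label);
        System.out.println("Class probabilities: " + Arrays.toString(p.classProbabilities));
    }

    // Autoencoder model: the result is a reconstruction of the input row, from which
    // a per-row reconstruction error (MSE) can be derived.
    static void scoreAutoEncoder(EasyPredictModelWrapper model, RowData row) throws Exception {
        AutoEncoderModelPrediction p = model.predictAutoEncoder(row);
        double mse = 0.0;
        for (int i = 0; i < p.reconstructed.length; i++) {
            double diff = p.original[i] - p.reconstructed[i];
            mse += diff * diff;
        }
        mse /= p.reconstructed.length;
        System.out.println("Reconstruction MSE: " + mse);
    }
}

Older h2o-genmodel builds have no branch for the AutoEncoder model category in PredictCsv, which is why the exception above is thrown.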

To overcome this exception, perform the following:

    • Download the latest version of the h2o-genmodel.jar file with all its dependencies.
    • Place all the dependency files, along with the input file, in the same folder as the JAR file, as shown in the below diagram:

[Image: dependency files and input file placed alongside the JAR file]

  • Verify that the new JAR file downloaded from the external site contains the logic for the autoencoders, as shown in the below diagram:

[Image: new h2o-genmodel.jar containing the autoencoder logic]

Running Model

Running the model requires the Java file generated from the POJO object, an input file, and the h2o-genmodel.jar file with its dependencies.

To run the model, perform the following:

  • Use test_input.csv as an input file and output.csv as an output file.
  • Run the model with all the dependencies using the below commands:
javac -cp h2o-genmodel.jar -J-Xmx2g crime_model_auto.java
java -cp .;* hex.genmodel.tools.PredictCsv --header --model crime_model_auto --input test_input.csv --output output.csv

Note: As the autoencoder returns reconstruction MSE (error) values for all columns of each record, the arrest status of the crimes cannot be predicted directly from it.

  • Download the supervised classification model (trained on top of the pre-trained autoencoder) as a POJO object to predict the arrest status.
  • Create a separate folder named “pre-trained” for this process.
  • Add all the JAR files to this folder.
  • Copy and paste the dependency JAR files and input files into this folder.
  • Compile and run the Java file using the below commands (the same javac and java commands as above, with the pre-trained model's class name):

[Image: compile and run commands for the pre-trained model]

  • Obtain the output of our prediction model. The output looks similar to the one shown below:

[Image: prediction output]

From the above results, it is evident that our model works fine as a standalone Java application. Let's now package this model into a JAR file and move it into the production environment along with the h2o-genmodel.jar and input files.
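For reference, the standalone run above does roughly what the sketch below does: it loads the pre-trained classification POJO (referred to as crime_pretrained later in this article), scores every row of test_input.csv, and writes the predicted arrest status to output.csv. The CSV handling is deliberately naive; hex.genmodel.tools.PredictCsv does the same job with proper parsing:

import hex.genmodel.GenModel;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.prediction.BinomialModelPrediction;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.PrintWriter;

public class CrimeBatchPredict {
    public static void main(String[] args) throws Exception {
        // Load the pre-trained classification POJO and wrap it for easy prediction.
        GenModel rawModel = (GenModel) Class.forName("crime_pretrained")
                .getDeclaredConstructor().newInstance();
        EasyPredictModelWrapper model = new EasyPredictModelWrapper(rawModel);

        try (BufferedReader in = new BufferedReader(new FileReader("test_input.csv"));
             PrintWriter out = new PrintWriter("output.csv")) {
            String[] header = in.readLine().split(",");  // naive split: no quoted fields
            out.println("predict");
            String line;
            while ((line = in.readLine()) != null) {
                String[] values = line.split(",");
                RowData row = new RowData();
                for (int i = 0; i < header.length && i < values.length; i++) {
                    row.put(header[i], values[i]);
                }
                // Predicted arrest status for this record.
                BinomialModelPrediction p = model.predictBinomial(row);
                out.println(p.label);
            }
        }
    }
}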

Deploying Model Into Production Environment

To deploy the model into the production environment, perform the following:

  • Convert the model into the JAR file with all the class files using the below command:
jar cf crime_model.jar *.class

[Image: generated crime_model.jar]

  • Place the above setup on any server and run the JAR file using the below command:
java -cp .;* hex.genmodel.tools.PredictCsv --header --model crime_pretrained --input test_input.csv --output output.csv

Implementing Machine Learning Model (Java Spring)

To implement the POJO model in the Java environment using the Spring Framework, set up a simple Spring web service project and pass the input as a JSON payload through a POST call.

Setting Up Model Execution Project

To set up a model execution project, perform the following:

  • Parse an input CSV file and convert it into required Java collection objects.
  • Convert the collection objects into a JSON string to pass as the payload of the POST call.
  • Create a function that turns the JSON string into a valid request for our API call and creates all the necessary connection objects.

Project setup:

[Image: model execution project structure]

A few class files in the project setup are:

  • CrimeModelExecution.java: Makes all the required function calls and converts the input file string into a valid JSON string. It is the core file for our project.
  • CSVParser.java: Parses a CSV file and converts it into required Java collections.
  • URLExecution.java: Contains functions to turn the JSON string into a valid request for our API call and to create all the necessary connection objects.
  • StringUtil.java: Contains all the utility functions used by the project.
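Putting those pieces together, a minimal sketch of the execution flow could look like the following; the endpoint URL and the hand-rolled JSON building are illustrative only, and a real project would use proper CSV and JSON libraries:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ModelExecutionSketch {
    public static void main(String[] args) throws Exception {
        // 1. Parse the input CSV (the role of CSVParser.java); only the first data
        //    row is used here for brevity.
        try (BufferedReader in = new BufferedReader(new FileReader("test_input.csv"))) {
            String[] header = in.readLine().split(",");
            String[] values = in.readLine().split(",");

            // 2. Convert the row into a JSON string (the role of CrimeModelExecution.java).
            //    No escaping is done here, so this is only a sketch.
            StringBuilder json = new StringBuilder("{");
            for (int i = 0; i < header.length && i < values.length; i++) {
                if (i > 0) json.append(",");
                json.append("\"").append(header[i]).append("\":\"").append(values[i]).append("\"");
            }
            json.append("}");

            // 3. Send the JSON payload to the deployed API in a POST call
            //    (the role of URLExecution.java). The URL is an assumed example.
            HttpURLConnection conn = (HttpURLConnection)
                    new URL("http://localhost:8080/crime/predict").openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "application/json");
            conn.setDoOutput(true);
            try (OutputStream os = conn.getOutputStream()) {
                os.write(json.toString().getBytes(StandardCharsets.UTF_8));
            }
            System.out.println("Response code: " + conn.getResponseCode());
        }
    }
}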

Setting Up Model Deployment Project

To set up the model deployment project, perform the following:

  • Convert the execution project into the JAR file with all its dependencies.
  • Initiate a server to run all APIs containing necessary logic to apply prediction on the dataset.
  • Set up the project in a server environment and pass the required input files as parameters.

Project setup:

[Image: model deployment project structure]

A few class files in the project setup are:

  • CrimeController.java: Contains all the APIs required to apply the model prediction to the datasets; the input can be passed in the POST call either as a JSON payload or as a file.
  • UtilHelper.java: Performs basic string datatype conversions.

The project is implemented based on the dependencies present in the h2o-genmodel.jar file (PredictCsv.java). Add this JAR to the classpath during implementation.
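As an illustration, a stripped-down version of such a controller could look like the sketch below; it assumes a Spring Boot project with h2o-genmodel.jar and the generated model classes on the classpath, and the class name, request mapping, and input handling are illustrative rather than the exact project code:

import hex.genmodel.GenModel;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.prediction.BinomialModelPrediction;

import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.util.Map;

@RestController
@RequestMapping("/crime")
public class CrimeControllerSketch {

    private final EasyPredictModelWrapper model;

    public CrimeControllerSketch() throws Exception {
        // Load the pre-trained classification POJO once, when the controller is created.
        GenModel rawModel = (GenModel) Class.forName("crime_pretrained")
                .getDeclaredConstructor().newInstance();
        this.model = new EasyPredictModelWrapper(rawModel);
    }

    // Accepts one record as a JSON payload in a POST call and returns the
    // predicted arrest status. Values are passed through as-is; RowData expects
    // strings or doubles.
    @PostMapping("/predict")
    public String predict(@RequestBody Map<String, Object> input) throws Exception {
        RowData row = new RowData();
        input.forEach(row::put);
        BinomialModelPrediction p = model.predictBinomial(row);
        return p.label;
    }
}

Starting the embedded server from a standard Spring Boot application class then exposes this endpoint, which matches the JSON POST call made by the execution project described above.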

Performing Overall Production Deployment

The overall production deployment involves analyzing the input, implementing a model using R scripts, downloading the model into required Java objects, and implementing these objects in the production environment.

The flow of moving the machine learning models into the production environment is as follows:

[Image: flow of moving machine learning models into production]

To deploy the model, perform the following:

  • Upload all the code to a specified location.
  • Create separate batch files (in a Windows environment) for running the R scripts.
  • Build the project execution JAR.
  • Deploy the model in the production environment as shown in the below diagram:

[Image: model deployed in the production environment]

Conclusion

In this article, we discussed setting up a simple Spring web service project in a Java environment and deploying a machine learning model in a real-time production environment using the command prompt and the POJO model. In our use case, the setup was performed on Windows, but the same steps can be followed in any real-time server setup. The h2o-genmodel.jar file contains all the dependencies and default functionality required to run the model from Java.

To learn more about building the analytical pipeline and applying deep learning to predict the arrest status of crimes happening in Los Angeles, refer to our previous article.
