Crime Analysis Using H2O Autoencoders (Part 2)
Learn how to deploy a predictive analytics machine learning model by converting it into POJO/MOJO objects with the help of H2O functions.
This is the second part of a two-part series on crime analysis using H2O autoencoders. In Part 1, we discussed building the analytical pipeline and applying deep learning to predict the arrest status of crimes in Los Angeles, California. Our machine learning model can be deployed as a JAR file using POJO and MOJO objects. H2O-generated POJO and MOJO models can be easily embedded in a Java environment based on the autogenerated
In this article, let's discuss deploying the H2O autoencoders model into a real-time production environment by converting it into POJO objects using H2O functions. As autoencoders don't support MOJO models, the POJO model is used in this article.
A crime dataset of Los Angeles from 2016-2017 with 224K records and 27 attributes is used as the source file. For more description, refer to our previous article.
Sample deployment model:
Deploy the H2O autoencoders model into the production environment.
- Generate JAR file for H2O autoencoder model.
- Run model.
- Deploy model into production environment.
- Implement machine learning model (Java Spring):
- Set up model execution project.
- Set up model deployment project.
- Perform overall production deployment.
Generating JAR File for H2O Autoencoder Model
The autoencoders model created from our previous analysis is as follows:
To generate the JAR file, perform the following:
- Download the autoencoders model using the h2o.download_pojo() function in the H2O package.
- Execute the below syntax to create a Java file along with the JAR file:
- Download the Java file along with the JAR file using a Java Decompiler as shown in the below diagram:
Note: If the downloaded dependency JAR file does not contain the logic to implement the autoencoder model, an UnsupportedOperationException error will be thrown, similar to the one shown in the below diagram:
The error can be viewed in the PredictCsv.java file, as shown in the below diagram:
Similarly, you can view other models such as MultinomialModelPrediction, and so on.
To overcome this exception error, perform the following:
- Download the latest version of the h2o-genmodel.jar file with all its dependencies.
- Place all the dependency files, along with the input file, in the same folder as the JAR file is placed in, as shown in the below diagram:
- View the new JAR file downloaded from the external site containing logic for the autoencoders, as shown in the below diagram:
You need the Java file generated from the POJO object, an input file, and the h2o-genmodel.jar file with its dependencies to run the model.
To run the model, perform the following:
- Use test_input.csv as the input file and output.csv as the output file.
- Run the model with all the dependencies using the below commands:
javac -cp h2o-genmodel.jar -J-Xmx2g crime_model_auto.java
java -cp .;* hex.genmodel.tools.PredictCsv --header --model crime_model_auto --input test_input.csv --output output.csv
Note: As the autoencoders return reconstruction MSE error values for all columns for each class, the arrest status of the crimes cannot be predicted.
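To make the idea of a reconstruction error concrete, here is a minimal sketch of how a mean squared error between one input row and its reconstruction can be computed. It is a generic illustration, not H2O's internal code; the class name, method name, and sample values are hypothetical.

```java
final class ReconstructionMse {

    /** Mean squared error between one input row and its reconstruction. */
    static double rowMse(double[] original, double[] reconstructed) {
        if (original.length == 0 || original.length != reconstructed.length) {
            throw new IllegalArgumentException("Rows must be non-empty and of equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < original.length; i++) {
            double diff = original[i] - reconstructed[i];
            sum += diff * diff; // squared error for column i
        }
        return sum / original.length; // average over all columns
    }

    public static void main(String[] args) {
        double[] original = {1.0, 0.0, 0.5};      // hypothetical encoded input row
        double[] reconstructed = {0.5, 0.0, 1.0}; // hypothetical autoencoder output
        System.out.println("Row MSE: " + rowMse(original, reconstructed));
    }
}
```

A value like this measures how well a row was reconstructed. It carries no class label, which is why a supervised model is needed to predict the arrest status.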
- Download the supervised classification model, which was already trained using the pre-trained autoencoder model, as a POJO object to predict the values.
- Create a separate folder named “pre-trained” for this process.
- Append all the JAR files into this folder.
- Copy and paste the dependency JAR files and inputs into this folder.
- Compile and run the Java file using the below commands:
- Obtain the output of our prediction model. The output looks similar to the one shown below:
From the above results, it is evident that our model works fine as a standalone Java file. Let's convert this model into a JAR file and move it into the production environment along with the h2o-genmodel.jar and input files.
Deploying Model Into Production Environment
To deploy the model into the production environment, perform the following:
- Convert the model into the JAR file with all the class files using the below command:
jar cf crime_model.jar *.class
- Place the above setup on any server and run the JAR file using the below command:
java -cp .;* hex.genmodel.tools.PredictCsv --header --model crime_pretrained --input test_input.csv --output output.csv
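The commands above use `;` as the classpath separator, which is the Windows convention; Unix-like servers use `:`. As a small sketch of one way to keep the call portable, the command line can be assembled with `File.pathSeparator`. The class and method names here are hypothetical, and the sketch only builds the argument list without launching anything.

```java
import java.io.File;
import java.util.List;

final class PredictCsvCommand {

    /** Builds the PredictCsv command line with a platform-appropriate classpath. */
    static List<String> build(String modelName, String inputCsv, String outputCsv) {
        // "." plus every JAR in the current folder; ';' on Windows, ':' elsewhere
        String classpath = "." + File.pathSeparator + "*";
        return List.of(
                "java", "-cp", classpath,
                "hex.genmodel.tools.PredictCsv",
                "--header",
                "--model", modelName,
                "--input", inputCsv,
                "--output", outputCsv);
    }

    public static void main(String[] args) {
        List<String> command = build("crime_pretrained", "test_input.csv", "output.csv");
        System.out.println(String.join(" ", command));
        // To launch it, the list could be passed to new ProcessBuilder(command).
    }
}
```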
Implementing Machine Learning Model (Java Spring)
To implement the POJO model in the Java environment using Spring Framework, set up a simple Spring WebService project, and pass the input as JSON payload through a
Setting Up Model Execution Project
To set up a model execution project, perform the following:
- Parse an input CSV file and convert it into required Java collection objects.
- Convert the collection objects into JSON string to pass it as a JSON payload in the
- Create a function to make the JSON string as a valid request for our API call and to make all necessary connection objects within it.
A few class files in the project setup are:
CrimeModelExecution.java: Makes all the required function calls and converts the input file string into a valid JSON string. It is the core file for our project.
CSVParser.java: Parses a CSV file and converts it into required Java collections.
URLExecution.java: Contains functions to turn the JSON string into a valid request for our API call. It creates all the necessary connection objects.
StringUtil.java: Contains all the utility functions.
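The steps above can be sketched roughly as follows, using only the JDK (Java 11 or later). This is not the project's actual code: the class name, the column names, and the endpoint URL are hypothetical, the CSV parsing assumes no quoted commas, and the JSON escaping covers only quotes and backslashes.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class CrimePayloadSketch {

    /** Parses simple CSV text (header row first, no quoted commas) into one map per record. */
    static List<Map<String, String>> parseCsv(String csv) {
        String[] lines = csv.strip().split("\\R");
        String[] header = lines[0].split(",", -1);
        List<Map<String, String>> rows = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            String[] cells = lines[i].split(",", -1);
            Map<String, String> row = new LinkedHashMap<>(); // keeps column order
            for (int c = 0; c < header.length; c++) {
                row.put(header[c].trim(), c < cells.length ? cells[c].trim() : "");
            }
            rows.add(row);
        }
        return rows;
    }

    /** Wraps a value in double quotes, escaping backslashes and quotes only. */
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Serializes the records as a JSON array of objects with string values. */
    static String toJson(List<Map<String, String>> rows) {
        return rows.stream()
                .map(row -> row.entrySet().stream()
                        .map(e -> quote(e.getKey()) + ":" + quote(e.getValue()))
                        .collect(Collectors.joining(",", "{", "}")))
                .collect(Collectors.joining(",", "[", "]"));
    }

    /** Builds (but does not send) a POST request carrying the JSON payload. */
    static HttpRequest buildPost(String url, String json) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        String csv = "area,code\nCentral,510\n";              // hypothetical columns
        String json = toJson(parseCsv(csv));
        HttpRequest request = buildPost("http://localhost:8080/predict", json); // hypothetical URL
        System.out.println(request.method() + " " + request.uri());
        System.out.println(json);
    }
}
```

In a real project, a JSON library and a CSV library would be safer choices than hand-rolled parsing and escaping.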
Setting Up Model Deployment Project
To set up the model deployment project, perform the following:
- Convert the execution project into the JAR file with all its dependencies.
- Initiate a server to run all APIs containing necessary logic to apply prediction on the dataset.
- Set up the project in a server environment and pass the required input files as parameters.
A few class files in the project setup are:
CrimeController.java: Contains all the APIs required to apply model prediction to the datasets. The input can be passed through a POST call either as a JSON payload or as a file.
UtilHelper.java: Performs basic string datatype conversions.
The project is implemented based on dependencies present in the
PredictCSV.java) file. Add this JAR to our classpath during implementation.
Performing Overall Production Deployment
The overall production deployment involves analyzing the input, implementing a model using R scripts, downloading the model into required Java objects, and implementing these objects in the production environment.
To deploy the model, perform the following:
- Upload all the code to a specified location.
- Create separate batch files (in Windows environment) for implementing the R script.
- Make the project execution JAR.
- Deploy the model in the production environment as shown in the below diagram:
In this article, we discussed setting up a simple Spring webservice project in a Java environment and deploying a machine learning model in the real-time production environment using the command prompt and the POJO model. In our use case, the setup was performed on Windows, but it can be followed in any real-time server setup. The h2o-genmodel.jar file contains all the dependencies and default functionalities required to build the model using Java.
To learn more about building the analytical pipeline and applying deep learning to predict the arrest status of the crimes happening in Los Angeles, consider our previous article.
Published at DZone with permission of Rathnadevi Manivannan . See the original article here.