
Getting Categorical Values for Predictors in H2O POJO and MOJO Models

Whether you prefer a MOJO or a POJO model with your H2O.ai projects, here's how you can fetch the categorical values for both.

Here is a Java/Scala code snippet that shows how you can get the categorical values for each enum/factor predictor from H2O POJO and MOJO Models.

To get the list of all column names in your POJO/MOJO model, you can try the following:

Imports

import java.io.*;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.prediction.*;
import hex.genmodel.MojoModel;
import java.util.Arrays;


POJO

// First use the POJO model class as below:
private static String modelClassName = "gbm_prostate_binomial";

//Then you can use the GenModel class to get info you are looking for as below:
hex.genmodel.GenModel rawModel;
rawModel = (hex.genmodel.GenModel) Class.forName(modelClassName).newInstance();

//Now you can get the results as below:
System.out.println("isSupervised : " + rawModel.isSupervised());
System.out.println("Column Names : " + Arrays.toString(rawModel.getNames()));
System.out.println("Response ID : " + rawModel.getResponseIdx());
System.out.println("Number of columns : " + rawModel.getNumCols());
System.out.println("Response Name : " + rawModel.getResponseName());

//Printing all categorical values for each predictor
for (int i = 0; i < rawModel.getNumCols(); i++) 
{
    String[] domainValues = rawModel.getDomainValues(i);
    System.out.println(Arrays.toString(domainValues));
}


Output Results

isSupervised : true
Column Names : [ID, AGE, RACE, DPROS, DCAPS, PSA, VOL, GLEASON]
Response ID : 8
Number of columns : 8
null
null
[0, 1, 2]
null
null
null
null
null


Note: A null entry means the predictor at that position is numeric. For each enum/factor predictor, all of its categorical values are listed.
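
The positional output above is easier to read when each list of levels is paired with its column name. The following is a minimal sketch in plain Java, not part of the H2O API: `categoricalLevels` is a hypothetical helper, and the two arrays are stand-in data copied from the output above. With a real model they would presumably be filled from `getNames()` and `getDomainValues(i)`, assuming the domain array has at least one entry per name.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CategoricalLevels {

    // Hypothetical helper: pairs each column name with its levels and
    // skips numeric columns, which are represented by a null domain.
    static Map<String, List<String>> categoricalLevels(String[] names, String[][] domains) {
        Map<String, List<String>> levels = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            if (domains[i] != null) {
                levels.put(names[i], Arrays.asList(domains[i]));
            }
        }
        return levels;
    }

    public static void main(String[] args) {
        // Stand-in data copied from the output above (assumption: a real model
        // would supply these via getNames() and getDomainValues(i)).
        String[] names = {"ID", "AGE", "RACE", "DPROS", "DCAPS", "PSA", "VOL", "GLEASON"};
        String[][] domains = {null, null, {"0", "1", "2"}, null, null, null, null, null};

        System.out.println(categoricalLevels(names, domains)); // prints {RACE=[0, 1, 2]}
    }
}
```

A `LinkedHashMap` keeps the predictors in model order, so the result lines up with the column list printed earlier.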

MOJO

// Let's assume you have a MOJO model named gbm_prostate_binomial.zip
// You would need to load your model as below (MojoModel.load can throw an IOException):
hex.genmodel.GenModel mojo = MojoModel.load("gbm_prostate_binomial.zip");

// Now you can get the list of predictors as below:
System.out.println("isSupervised : " + mojo.isSupervised());
System.out.println("Column Names : " + Arrays.toString(mojo.getNames()));
System.out.println("Response ID : " + mojo.getResponseIdx());
System.out.println("Number of columns : " + mojo.getNumCols());
System.out.println("Response Name : " + mojo.getResponseName());

// Printing all categorical values for each predictor
for (int i = 0; i < mojo.getNumCols(); i++) {
    String[] domainValues = mojo.getDomainValues(i);
    System.out.println(Arrays.toString(domainValues));
}


Output Results

isSupervised : true
Column Names : [ID, AGE, RACE, DPROS, DCAPS, PSA, VOL, GLEASON]
Response ID : 8
Number of columns : 8
null
null
[0, 1, 2]
null
null
null
null
null


Note: A null entry means the predictor at that position is numeric. For each enum/factor predictor, all of its categorical values are listed.
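
One possible use for these domain values is checking an input value against a predictor's known levels before handing a row to the model. The sketch below is plain Java and does not call H2O: `isKnownLevel` is a hypothetical helper, and the domain array is stand-in data copied from the output above for the one categorical predictor.

```java
import java.util.Arrays;

class LevelCheck {

    // Hypothetical helper: true only if the column is categorical
    // (non-null domain) and the value is one of its listed levels.
    static boolean isKnownLevel(String[] domain, String value) {
        return domain != null && Arrays.asList(domain).contains(value);
    }

    public static void main(String[] args) {
        // Stand-in for the non-null domain printed above (assumption: in a
        // real program this would come from mojo.getDomainValues(i)).
        String[] raceDomain = {"0", "1", "2"};

        System.out.println(isKnownLevel(raceDomain, "1")); // prints true
        System.out.println(isKnownLevel(raceDomain, "3")); // prints false
        System.out.println(isKnownLevel(null, "1"));       // prints false (numeric column)
    }
}
```

Note that the levels are compared as strings, which matches how the domain values are printed above.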

To get help on using MOJO and POJO models, see the H2O documentation.

That’s it, enjoy!!

Published at DZone with permission of
