Using H2O AutoML for Safe Driver Prediction [Code Snippet]

Kaggle is currently hosting a Porto Seguro Safe Driver Prediction Competition. This H2O AutoML Python script is what I used!

If you are into competitive machine learning, you probably visit Kaggle regularly. Right now, you can compete there for cash and recognition in Porto Seguro’s Safe Driver Prediction competition.

I ran H2O AutoML on the provided training dataset as is, with no transformations. The run took about five hours and put me at roughly 280th place on the leaderboard. If you transform the dataset properly before running H2O AutoML, you may be able to rank even higher.
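A minimal pandas sketch of what one such transformation might look like. It rests on two assumptions that are not stated in this post and should be checked against the competition's data description: that -1 marks a missing value, and that column names ending in _cat hold categorical codes. The helper name prepare_features and the tiny demo frame are hypothetical, and this is not the setup that produced the result above.

```python
import numpy as np
import pandas as pd


def prepare_features(df: pd.DataFrame, missing_marker: int = -1) -> pd.DataFrame:
    """Return a copy with missing markers as NaN and *_cat columns as categoricals.

    Assumption: `missing_marker` (-1 by default) encodes "missing", and columns
    whose names end in "_cat" are categorical codes rather than quantities.
    """
    # replace() returns a new frame, so the caller's frame is left untouched.
    out = df.replace(missing_marker, np.nan)
    for col in out.columns:
        if col.endswith("_cat"):
            out[col] = out[col].astype("category")
    return out


# Tiny made-up frame that only mimics the assumed column naming.
demo = pd.DataFrame({
    "id": [1, 2, 3],
    "ps_car_01_cat": [4, -1, 7],
    "ps_reg_03": [0.5, -1.0, 0.9],
    "target": [0, 1, 0],
})
prepared = prepare_features(demo)
print(prepared.isna().sum())
```

A frame prepared this way could then be handed to H2O with h2o.H2OFrame(prepared) instead of importing the raw CSV directly. Whether this helps the leaderboard score is something you would have to measure.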

Below is a very simple H2O AutoML Python script that you can try as well. (Note: Set run_automl_for_seconds to the number of seconds you want the experiment to run.)

import h2o
import pandas as pd
from h2o.automl import H2OAutoML

h2o.init()
train = h2o.import_file('/data/avkash/PortoSeguro/PortoSeguroTrain.csv')
test = h2o.import_file('/data/avkash/PortoSeguro/PortoSeguroTest.csv')
sub_data = h2o.import_file('/data/avkash/PortoSeguro/PortoSeguroSample_submission.csv')

y = 'target'
x = train.columns
x.remove(y)

# Time budget for the AutoML run, in seconds (18000 s = 5 hours)
run_automl_for_seconds = 18000

# Hold out 10% of the training data as a validation frame
train_final, valid = train.split_frame(ratios=[0.9])

# Run AutoML for at most the configured time
aml = H2OAutoML(max_runtime_secs=run_automl_for_seconds)
aml.train(x=x, y=y, training_frame=train_final, validation_frame=valid)

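# Score the test set with the best model on the AutoML leaderboard
leader_model = aml.leader
pred = leader_model.predict(test_data=test)

# Convert to pandas and copy the predictions into the sample submission
pred_pd = pred.as_data_frame()
sub = sub_data.as_data_frame()

# Select the 'predict' column explicitly rather than assigning the whole frame
sub['target'] = pred_pd['predict']
sub.to_csv('/data/avkash/PortoSeguro/PortoSeguroResult.csv', header=True, index=False)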
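Before uploading the result file, a quick sanity check can catch a malformed submission. The sketch below uses only the standard library. It assumes the submission has exactly two columns named id and target (only target appears in the script above; id is an assumption about the sample submission), and the function name check_submission is hypothetical.

```python
import csv
import io


def check_submission(csv_text, expected_rows, id_col="id", target_col="target"):
    """Return a list of problems found in a submission CSV (empty list means OK)."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Assumed layout: exactly the id column followed by the target column.
    if reader.fieldnames != [id_col, target_col]:
        problems.append(f"unexpected header: {reader.fieldnames}")
        return problems
    rows = list(reader)
    if len(rows) != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {len(rows)}")
    # Data starts on line 2 of the file, after the header.
    for line_no, row in enumerate(rows, start=2):
        try:
            float(row[target_col])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: non-numeric target {row[target_col]!r}")
    return problems


good = "id,target\n0,0.031\n1,0.027\n"
bad = "id,target\n0,0.031\n1,\n"
print(check_submission(good, expected_rows=2))
print(check_submission(bad, expected_rows=2))
```

For the real file, you would pass the contents of the written CSV as csv_text and the number of rows in the test set as expected_rows.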

That’s it; enjoy!

Topics:
machine learning, h2o, python, ai, predictive analytics

Published at DZone with permission of
