Using H2O AutoML for Safe Driver Prediction [Code Snippet]

Kaggle is currently hosting a Porto Seguro Safe Driver Prediction Competition. This H2O AutoML Python script is what I used!

If you are into competitive machine learning, you probably visit Kaggle routinely. Right now, you can compete there for cash and recognition in Porto Seguro’s Safe Driver Prediction competition.

I tried H2O AutoML on the provided training dataset as is. It ran for about five hours, and the result got me into the top 280 on the leaderboard. If you transform the dataset properly before running H2O AutoML, you may be able to rank even higher.
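
As one illustration of such a transformation, the sketch below blanks out missing values before the file is handed to H2O. It assumes, based on the competition's data description, that a missing value is encoded as -1, and that H2O parses empty CSV cells as missing. The helper name `blank_missing` is hypothetical and the sample rows are invented; only cells that are exactly the string `-1` are changed.

```python
import csv
import io

MISSING = "-1"  # assumption: the competition data marks missing values as -1


def blank_missing(src, dst):
    """Copy a CSV from src to dst, replacing cells equal to -1 with empty cells."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(next(reader))  # copy the header row unchanged
    for row in reader:
        writer.writerow(["" if cell == MISSING else cell for cell in row])


# Toy input with invented values, standing in for the real training file
sample = io.StringIO("id,target,ps_ind_01,ps_car_03_cat\n7,0,2,-1\n9,1,-1,1\n")
cleaned = io.StringIO()
blank_missing(sample, cleaned)
print(cleaned.getvalue())
```

For the real files, `src` and `dst` would be file objects opened with `newline=''`, and the cleaned file would be the one passed to `h2o.import_file`.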

The following is a very simple H2O AutoML Python script that you can try as well. (Note: Make sure to change run_automl_for_seconds to the length of time, in seconds, that you want the experiment to run.)
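
Before the script itself, one small aid for that note: because the budget is expressed in seconds, a tiny conversion helper can make the chosen value easier to read. This is only an illustrative sketch; `hours_to_seconds` is a hypothetical name and not part of H2O.

```python
def hours_to_seconds(hours):
    """Convert a runtime budget in hours to whole seconds."""
    return int(round(hours * 3600))


# A five-hour budget, matching the run described above
run_automl_for_seconds = hours_to_seconds(5)
print(run_automl_for_seconds)
```

The resulting value can be used in place of the hard-coded number in the script.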

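import h2o
import pandas as pd  # needed by as_data_frame(), which returns pandas DataFrames
from h2o.automl import H2OAutoML

## Start (or connect to) a local H2O cluster before importing any data
h2o.init()

train = h2o.import_file('/data/avkash/PortoSeguro/PortoSeguroTrain.csv')
test = h2o.import_file('/data/avkash/PortoSeguro/PortoSeguroTest.csv')
sub_data = h2o.import_file('/data/avkash/PortoSeguro/PortoSeguroSample_submission.csv')

## 'target' is read as a numeric 0/1 column, so AutoML treats this as
## regression and the predictions act as risk scores
y = 'target'
## Use every column except the response as a predictor
x = [col for col in train.columns if col != y]

## Time to run the experiment
run_automl_for_seconds = 18000
## Running AML for 5 hours (18000 seconds)
aml = H2OAutoML(max_runtime_secs=run_automl_for_seconds)
## Hold out 10% of the training rows for validation
train_final, valid = train.split_frame(ratios=[0.9])
aml.train(x=x, y=y, training_frame=train_final, validation_frame=valid)

## Score the test set with the best model on the AutoML leaderboard
leader_model = aml.leader
pred = leader_model.predict(test_data=test)

pred_pd = pred.as_data_frame()
sub = sub_data.as_data_frame()

## Copy the predictions into the sample submission and save it
sub['target'] = pred_pd['predict']
sub.to_csv('/data/avkash/PortoSeguro/PortoSeguroResult.csv', header=True, index=False)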
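
The script holds out 10% of the training rows as `valid`, which makes it possible to estimate a score locally before submitting. The competition ranks submissions by the normalized Gini coefficient, which for 0/1 labels equals 2 * AUC - 1, so a rough check can be sketched in plain Python. `normalized_gini` is a hypothetical helper, not part of H2O, and its tie handling (average ranks) may differ slightly from Kaggle's scorer.

```python
def normalized_gini(actual, predicted):
    """Normalized Gini for 0/1 labels, computed as 2 * AUC - 1."""
    n = len(actual)
    # Sort row indices by predicted score, lowest first
    order = sorted(range(n), key=lambda i: predicted[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i
        j = i
        while j + 1 < n and predicted[order[j + 1]] == predicted[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(1 for a in actual if a == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    # Rank-sum (Mann-Whitney) form of the AUC
    rank_sum = sum(r for r, a in zip(ranks, actual) if a == 1)
    auc = (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
    return 2 * auc - 1


# Toy labels and scores, invented for illustration
score = normalized_gini([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(score)
```

To apply it to the script's output, the held-out labels and the leader model's predictions on `valid` would first need converting to Python lists, for example via `as_data_frame()`.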

That’s it; enjoy!

Published at DZone with permission of