Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Launching H2O Clusters on Different Ports in pysparkling [Code Snippets]

DZone's Guide to

Launching H2O Clusters on Different Ports in pysparkling [Code Snippets]

Learn how to launch an H2O machine learning cluster using the pysparkling package with the approrpiate Python code script.

· AI Zone ·
Free Resource

Bias comes in a variety of forms, all of them potentially damaging to the efficacy of your ML algorithm. Read how Alegion's Chief Data Scientist discusses the source of most headlines about AI failures here.

In this example, we will launch an H2O machine learning cluster using the pysparkling package. You can visit my GitHub and this article to learn more about the code execution explained here.

First, install pysparkling in Python 2.7, set up as below:

> pip install -U h2o_pysparkling_2.1

Now we can launch the pysparkling Shell as below:

SPARK_HOME=/Users/avkashchauhan/tools/spark-2.1.0-bin-hadoop2.6

Launch pysparkling shell:

~/tools/sw2/sparkling-water-2.1.14 $ bin/pysparkling

Here's the Python code script to launch the H2O cluster in pysparkling:

## Importing Libraries
from pysparkling import *
import h2o

## Setting H2O Conf Object
h2oConf = H2OConf(sc)
h2oConf

## Setting H2O Conf for different port
h2oConf.set_client_port_base(54300)
h2oConf.set_node_base_port(54300)

## Gett H2O Conf Object to see the configuration
h2oConf

## Launching H2O Cluster
hc = H2OContext.getOrCreate(spark, h2oConf)

## Getting H2O Cluster status
h2o.cluster_status()

Now, if you verify the sparkling water configuration, you will see that H2O is running on the given IP and port 54300 as configured:

Sparkling Water configuration:
  backend cluster mode : internal
  workers              : None
  cloudName            : Not set yet, it will be set automatically before starting H2OContext.
  flatfile             : true
  clientBasePort       : 54300
  nodeBasePort         : 54300
  cloudTimeout         : 60000
  h2oNodeLog           : INFO
  h2oClientLog         : WARN
  nthreads             : -1
  drddMulFactor        : 10

That's it; enjoy!

Your machine learning project needs enormous amounts of training data to get to a production-ready confidence level. Get a checklist approach to assembling the combination of technology, workforce and project management skills you’ll need to prepare your own training data.

Topics:
h2o ,python ,pysparkling ,machine learning ,ai ,tutorial

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}