
How to Regularize Intercept in GLM


Learn about the parameters that help you regularize the intercept of an H2O GLM model.


Sometimes, you may want to emulate hierarchical modeling to achieve your objective. To do this, you can use beta_constraints, as shown below:

iris = h2o.import_file("http://h2o-public-test-data.s3.amazonaws.com/smalldata/iris/iris_wheader.csv")
bc = h2o.H2OFrame([("Intercept",-1000,1000,3,30)], column_names=["names","lower_bounds","upper_bounds","beta_given","rho"])
glm = H2OGeneralizedLinearEstimator(family="gaussian", beta_constraints=bc)
# The original snippet was truncated here; the response/predictor columns are inferred from the coefficient names in the output below.
glm.train(x=["sepal_wid","petal_len","petal_wid","class"], y="sepal_len", training_frame=iris)
print(glm.coef())

The output will look like this:

{u'Intercept': 3.000933645168297,
 u'class.Iris-setosa': 0.0,
 u'class.Iris-versicolor': 0.0,
 u'class.Iris-virginica': 0.0,
 u'petal_len': 0.4423526962303227,
 u'petal_wid': 0.0,
 u'sepal_wid': 0.37712042938039897}

There’s more information in the GLM booklet, but the short version is: create a new constraints frame with the columns names, lower_bounds, upper_bounds, beta_given, and rho, with one row per feature you want to constrain. You can use “Intercept” as a keyword to constrain the intercept.

names: (mandatory) coefficient names

lower_bounds: (optional) coefficient lower bounds; must be less than or equal to the upper bounds

upper_bounds: (optional) coefficient upper bounds; must be greater than or equal to the lower bounds

beta_given: (optional) specifies the given solution in the proximal operator interface

rho: (mandatory if beta_given is specified, otherwise ignored) specifies the per-column L2 penalties on the distance from the given solution
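As a sketch, a constraints frame with more than one row would look like the following. The petal_len row and its bounds are hypothetical, added purely for illustration; the h2o.H2OFrame call is commented out because it needs a running H2O cluster:

```python
# One row per constrained coefficient; "Intercept" is the keyword for
# the intercept term. The petal_len row is hypothetical, for illustration.
column_names = ["names", "lower_bounds", "upper_bounds", "beta_given", "rho"]
rows = [
    ("Intercept", -1000, 1000, 3.0, 30),    # pull the intercept toward 3
    ("petal_len",    0,    10, 0.0, 1e-5),  # mild L2 shrinkage toward 0
]
# bc = h2o.H2OFrame(rows, column_names=column_names)  # needs h2o.init()
```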

What’s happening is that an L2 penalty is applied to the distance between each coefficient and its given value. The proximal penalty is computed as Σᵢ rho[i] * (beta[i] − beta_given[i])². You can confirm this by setting rho to the value you would otherwise use for lambda, setting beta_given to 0, and setting lambda to 0; this gives the same result as having set lambda to that value. In this way, beta constraints let you assign per-feature regularization strength, but only for the L2 penalty. The math is:

sum_i rho[i] * L2norm2(beta[i]-betagiven[i])

So if you set beta_given to zero and set all rho values (except the intercept’s) to 1e-5, it is equivalent to running without beta constraints, just with alpha = 0 and lambda = 1e-5.
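To see this equivalence numerically, here is a small NumPy sketch (not using H2O itself) that evaluates the proximal penalty with beta_given = 0 and uniform rho against the plain ridge penalty for the same lambda:

```python
import numpy as np

rng = np.random.default_rng(0)
beta = rng.normal(size=4)  # some non-intercept coefficients
lam = 1e-5

# Proximal penalty: sum_i rho[i] * (beta[i] - beta_given[i])**2
beta_given = np.zeros_like(beta)
rho = np.full(beta.shape, lam)
proximal = np.sum(rho * (beta - beta_given) ** 2)

# Plain L2 (ridge) penalty with alpha = 0, lambda = lam
ridge = lam * np.sum(beta ** 2)

print(np.isclose(proximal, ridge))  # prints True
```

With beta_given = 0 and a uniform rho, the proximal term reduces algebraically to lambda times the squared L2 norm of beta, which is exactly the ridge penalty.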


Published at DZone with permission of
