
Convex Regression Model


In this post, a data scientist walks us through a bit of complex math and the corresponding R code that we need to make our models.


This morning, during the lecture on nonlinear regression, I mentioned (very) briefly the case of convex regression. Since I forgot to mention the R code, I will publish it here. Assume that $y_i=m(\mathbf{x}_i)+\varepsilon_i$, where $m:\mathbb{R}^d\rightarrow\mathbb{R}$ is some convex function.

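Then $m$ is convex if and only if, for all $\mathbf{x}_1,\mathbf{x}_2\in\mathbb{R}^d$ and all $t\in[0,1]$,

$$m(t\mathbf{x}_1+(1-t)\mathbf{x}_2)\leq t\,m(\mathbf{x}_1)+(1-t)\,m(\mathbf{x}_2)$$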

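Hildreth (1954) proved that if

$$m^\star=\underset{m\text{ convex}}{\text{argmin}}\left\lbrace\sum_{i=1}^n\big(y_i-m(\mathbf{x}_i)\big)^2\right\rbrace$$

then $\boldsymbol{\theta}^\star=(m^\star(\mathbf{x}_1),\cdots,m^\star(\mathbf{x}_n))$ is unique.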

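Let $\mathbf{y}=\boldsymbol{\theta}+\boldsymbol{\varepsilon}$, then

$$\boldsymbol{\theta}^\star=\underset{\boldsymbol{\theta}\in\mathcal{K}}{\text{argmin}}\left\lbrace\sum_{i=1}^n\big(y_i-\theta_i\big)^2\right\rbrace$$

where

$$\mathcal{K}=\big\lbrace\boldsymbol{\theta}\in\mathbb{R}^n:\exists\, m\text{ convex},\ m(\mathbf{x}_i)=\theta_i\big\rbrace$$

I.e., $\boldsymbol{\theta}^\star$ is the projection of $\mathbf{y}$ onto the (closed) convex cone $\mathcal{K}$. The projection theorem gives existence and uniqueness.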

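For convenience, in the application, we will consider the real-valued case, $m:\mathbb{R}\rightarrow\mathbb{R}$, i.e. $y_i=m(x_i)+\varepsilon_i$. Assume that the observations are ordered and distinct, $x_1<x_2<\cdots<x_n$. Here

$$\mathcal{K}=\left\lbrace\boldsymbol{\theta}\in\mathbb{R}^n:\frac{\theta_2-\theta_1}{x_2-x_1}\leq\frac{\theta_3-\theta_2}{x_3-x_2}\leq\cdots\leq\frac{\theta_n-\theta_{n-1}}{x_n-x_{n-1}}\right\rbrace$$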

Hence, this is a quadratic program with $n-2$ linear constraints.

$m^\star$ is a piecewise linear function (the interpolation of consecutive pairs $(x_i,\theta_i^\star)$).
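The quadratic program above can be sketched directly in R. This is only an illustration under stated assumptions, not the method used in the rest of the post: it assumes the quadprog package is installed (the post itself does not use it), and, because the cars data contain tied speeds, it first collapses the data to one weighted point per distinct speed. The names xs, ybar, w and Amat are introduced here for the example.

```r
# Assumption: the 'quadprog' package is installed (it is not used in the post).
library(quadprog)

# Collapse tied speeds: one point per distinct x, weighted by its count.
xs   <- sort(unique(cars$speed))
ybar <- as.numeric(tapply(cars$dist, cars$speed, mean))
w    <- as.numeric(table(cars$speed))
K    <- length(xs)

# K - 2 convexity constraints: (slope right of x_k) - (slope left of x_k) >= 0.
Amat <- matrix(0, K, K - 2)
for (k in 2:(K - 1)) {
  hl <- xs[k] - xs[k - 1]      # gap to the left neighbour
  hr <- xs[k + 1] - xs[k]      # gap to the right neighbour
  Amat[k - 1, k - 1] <- 1 / hl
  Amat[k,     k - 1] <- -1 / hl - 1 / hr
  Amat[k + 1, k - 1] <- 1 / hr
}

# solve.QP minimises -d'b + (1/2) b'Db subject to t(Amat) %*% b >= bvec,
# so D = diag(w) and d = w * ybar give weighted least squares in theta.
qp     <- solve.QP(Dmat = diag(w), dvec = w * ybar,
                   Amat = Amat, bvec = rep(0, K - 2))
theta  <- qp$solution             # fitted values at the distinct speeds
slopes <- diff(theta) / diff(xs)  # should be nondecreasing
```

Calling plot(cars) followed by lines(xs, theta, col = 4) draws the resulting piecewise linear fit.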

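If $m$ is differentiable, $m$ is convex if, for all $\mathbf{x},\mathbf{z}\in\mathbb{R}^d$,

$$m(\mathbf{x})+\nabla m(\mathbf{x})^{\top}(\mathbf{z}-\mathbf{x})\leq m(\mathbf{z})$$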

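More generally, if $m$ is convex, then for every $\mathbf{x}$ there exists $\xi_{\mathbf{x}}\in\mathbb{R}^d$ such that, for all $\mathbf{z}\in\mathbb{R}^d$,

$$m(\mathbf{x})+\xi_{\mathbf{x}}^{\top}(\mathbf{z}-\mathbf{x})\leq m(\mathbf{z})$$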

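$\xi_{\mathbf{x}}$ is a subgradient of $m$ at $\mathbf{x}$. And then the subdifferential of $m$ at $\mathbf{x}$ is

$$\partial m(\mathbf{x})=\big\lbrace\xi\in\mathbb{R}^d:m(\mathbf{x})+\xi^{\top}(\mathbf{z}-\mathbf{x})\leq m(\mathbf{z}),\ \forall\mathbf{z}\in\mathbb{R}^d\big\rbrace$$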
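The subgradient inequality is easy to check numerically on a small example. The sketch below uses an illustrative function that is not from the post, m(x) = |x|, which is convex but not differentiable at 0, with sign(x) as one valid choice of subgradient; the names m, xi, grid and gap are introduced here only for the example.

```r
# Illustrative choice (not from the post): m(x) = |x|, with a kink at 0.
m  <- function(x) abs(x)
xi <- function(x) sign(x)   # one valid subgradient at each x (0 at the kink)

grid <- seq(-2, 2, by = 0.25)

# gap[i, j] = m(z_j) - [ m(x_i) + xi(x_i) * (z_j - x_i) ];
# the subgradient inequality says every entry is nonnegative.
gap <- outer(grid, grid, function(x, z) m(z) - (m(x) + xi(x) * (z - x)))
all(gap >= -1e-12)
```

At x = 0, any value in [-1, 1] would work in place of sign(0) = 0, which is why the subgradient is in general a set rather than a single vector.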

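Hence, $\boldsymbol{\theta}^\star$ is the solution of

$$\underset{\boldsymbol{\theta}}{\text{argmin}}\big\lbrace\|\mathbf{y}-\boldsymbol{\theta}\|^2\big\rbrace\quad\text{subject to}\quad\theta_i+\xi_i^{\top}(\mathbf{x}_j-\mathbf{x}_i)\leq\theta_j,\ \forall i,j$$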

and $\xi_1,\cdots,\xi_n\in\mathbb{R}^d$. Now, to do it for real, use the cobs package for constrained (B-)spline regression,

library(cobs)

To get a convex regression, use

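plot(cars)                        # scatterplot of stopping distance against speed
x = cars$speed
y = cars$dist
rc = conreg(x, y, convex = TRUE)  # least-squares convex fit (conreg is in cobs)
lines(rc, col = 2)                # overlay the fitted convex curve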

Here we can get the values of the knots by printing the fitted object

rc

Call:  conreg(x = x, y = y, convex = TRUE) 
Convex regression: From 19 separated x-values, using 5 inner knots,
     7,    8,    9,   20,   23.
RSS =  1356; R^2 = 0.8766;
 needed (5,0) iterations

and actually, if we use them in a linear-spline regression, we get the same output here

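library(splines)  # provides bs()
reg = lm(dist~bs(speed,degree=1,knots=c(7,8,9,20,23)),data=cars)  # inner knots only; 4 and 25 are the boundary knots
u = seq(4,25,by=.1)
v = predict(reg,newdata=data.frame(speed=u))
lines(u,v,col=3)  # overlay the linear-spline fit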

Let us add vertical lines for the knots
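The code for this step did not survive on this page, so here is a minimal sketch of one way to do it, using base graphics only. The knot values are copied from the conreg output printed earlier; the variable name inner_knots and the line styles are choices made for this example.

```r
# Inner knots as reported by the conreg output above.
inner_knots <- c(7, 8, 9, 20, 23)

plot(cars)
# Dashed grey lines at the inner knots.
abline(v = inner_knots, col = "grey", lty = 2)
# Dotted grey lines at the boundary of the data (smallest and largest speed).
abline(v = range(cars$speed), col = "grey", lty = 3)
```

Each kink of the fitted convex curve should sit on one of the dashed lines.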

