# Linear Regression Using NumPy


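A linear regression line has the form $w_1 x + w_2 = y$, and it is the line that minimizes the sum of the squares of the vertical distances from each data point to the line. So, given $n$ pairs of data $(x_i, y_i)$, the parameters that we are looking for are $w_1$ and $w_2$, which minimize the error

$$E = \sum_{i=1}^{n} \left( y_i - (w_1 x_i + w_2) \right)^2$$

and we can compute the parameter vector $\mathbf{w} = (w_1, w_2)^T$ as the least-squares solution of the following over-determined system

$$\begin{pmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}$$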
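As a brief sketch of why a least-squares solution minimizes this error, assume the system is written as $A\mathbf{w} = \mathbf{y}$, where $A$ has one row $(x_i, 1)$ per data point and $\mathbf{y}$ stacks the $y_i$ (the names $A$, $\mathbf{y}$ and $E$ are notation introduced here for illustration). Setting the gradient of the squared error to zero gives the normal equations:

$$
\begin{aligned}
E(\mathbf{w}) &= \lVert \mathbf{y} - A\mathbf{w} \rVert^2 = \sum_{i=1}^{n} \left( y_i - (w_1 x_i + w_2) \right)^2 \\
\nabla E(\mathbf{w}) &= -2\,A^T (\mathbf{y} - A\mathbf{w}) = 0 \\
A^T A\,\mathbf{w} &= A^T \mathbf{y}
\end{aligned}
$$

When the $x_i$ are not all equal, $A^T A$ is invertible and the minimizer is unique.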

Let's use NumPy to compute the regression line:

    from numpy import arange, array, ones, linalg
    from pylab import plot, show

    xi = arange(0, 9)
    A = array([xi, ones(9)])
    # linearly generated sequence
    y = [19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24]

    # obtaining the parameters (rcond=None selects NumPy's current default cutoff)
    w = linalg.lstsq(A.T, y, rcond=None)[0]

    # plotting the line
    line = w[0] * xi + w[1]  # regression line
    plot(xi, line, 'r-', xi, y, 'o')
    show()

The resulting plot shows the data points as circles and the regression line in red.
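As a cross-check on the fit above, the following sketch recomputes the parameters in two other ways: by solving the normal equations $A^T A\,\mathbf{w} = A^T \mathbf{y}$ directly, and with `numpy.polyfit` at degree 1. The variable names (`w_lstsq`, `w_normal`, `w_polyfit`, `sse`) are illustrative and not part of the original post; the data are the same nine points.

```python
import numpy as np

# Same data as in the example above
x = np.arange(0, 9)
y = np.array([19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24])

# Design matrix with one row (x_i, 1) per data point
A = np.column_stack([x, np.ones(len(x))])

# Least-squares solution, as in the example above
w_lstsq = np.linalg.lstsq(A, y, rcond=None)[0]

# Solve the normal equations (A^T A) w = A^T y directly
w_normal = np.linalg.solve(A.T @ A, A.T @ y)

# A degree-1 polynomial fit returns (slope, intercept)
w_polyfit = np.polyfit(x, y, 1)

# Sum of squared residuals of the fitted line
sse = np.sum((y - A @ w_lstsq) ** 2)

print("lstsq:            ", w_lstsq)
print("normal equations: ", w_normal)
print("polyfit:          ", w_polyfit)
print("sum of squared residuals:", sse)
```

All three approaches should agree up to floating-point rounding. For larger or ill-conditioned problems, `lstsq` is generally the safer choice, since forming $A^T A$ squares the condition number of the problem.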

