DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • A Beginner’s Guide to Hyperparameter Tuning: From Theory to Practice
  • Essential Python Libraries: Introduction to NumPy and Pandas
  • Smart Routing Using AI for Efficient Logistics and Green Solutions
  • Predicting Traffic Volume With Artificial Intelligence and Machine Learning

Trending

  • How to Build an Agentic AI SRE Co-Pilot for Incident Response
  • Amazon Quick: AWS's Agentic Workspace, Explained for Engineers
  • Compliance Automated Standard Solution (COMPASS), Part 11: Compliance as Code, the OSCAL MCP Server Way
  • Why Round-Robin Won't Save You: Load Balancing Challenges in Data Streaming Services With Heterogeneous Traffic

Linear Regression Using Numpy

By 
Giuseppe Vettigli user avatar
Giuseppe Vettigli
·
Mar. 26, 12 · Interview
Likes (0)
Comment
Save
Tweet
Share
14.0K Views

Join the DZone community and get the full member experience.

Join For Free
A few posts ago, we saw how to use the function numpy.linalg.lstsq(...) to solve an over-determined system. This time, we'll use it to estimate the parameters of a regression line.

A linear regression line is of the form w1x+w2=y and it is the line that minimizes the sum of the squares of the distance from each data point to the line. So, given n pairs of data (xi, yi), the parameters that we are looking for are w1 and w2 which minimize the error



and we can compute the parameter vector w = (w1 , w2)T as the least-squares solution of the following over-determined system



Let's use numpy to compute the regression line:
from numpy import arange,array,ones,random,linalg
from pylab import plot,show

xi = arange(0,9)
A = array([ xi, ones(9)])
# linearly generated sequence
y = [19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24]
w = linalg.lstsq(A.T,y)[0] # obtaining the parameters

# plotting the line
line = w[0]*xi+w[1] # regression line
plot(xi,line,'r-',xi,y,'o')
show()
We can see the result in the plot below.



You can find more about data fitting using numpy in the following posts:

  • Polynomial curve fitting
  • Curve fitting using fmin
Linear regression NumPy

Published at DZone with permission of Giuseppe Vettigli. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • A Beginner’s Guide to Hyperparameter Tuning: From Theory to Practice
  • Essential Python Libraries: Introduction to NumPy and Pandas
  • Smart Routing Using AI for Efficient Logistics and Green Solutions
  • Predicting Traffic Volume With Artificial Intelligence and Machine Learning

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook