
Machine Learning Algorithms: Mathematics Behind Linear Regression

In this article, we explain the mathematics behind the linear regression algorithm in machine learning.

By Shardul Bhatt · Jul. 29, 20 · Tutorial

Several machine learning algorithms can produce the desired outputs by processing input data. One of the most widely used is linear regression.

Linear regression is a supervised learning algorithm whose output is a continuous value rather than a category. With linear regression, we predict values that lie along a line with a constant slope.

What Is Linear Regression Used For?

The most popular uses of linear regression in a machine learning system are predictive analytics and modeling. With the help of linear regression, we can quantify the relationship between a predictor variable and an output variable.

For example, we can quantify the impact of advertising on sales in a business, demographics on location tracking, age on height, and many more.

Linear regression is used in machine learning solutions to predict future values. With more than one predictor variable it is called multiple regression, and it is commonly fitted using ordinary least squares (OLS).

Many blogs explain how to perform linear regression on various datasets. However, only a few articles explain the mathematical formulae used behind the scenes when we use the LinearRegression estimator of sklearn (a Python library) or other libraries.

Here we will dive deep into the mathematics of linear regression.

Supervised Learning:

We take a simple example: a dataset that lists the prices of some houses by their areas.

In this dataset we have values for the areas and their respective prices. Let's say our input parameter x is Area and our output parameter y is Price.

When every training example pairs an input parameter x with a known output parameter y, this type of learning is called supervised learning.

So, for a given training dataset, our goal is to learn a function h: x → y so that h(x) is a good prediction of the corresponding value of y.

The function h is known as the hypothesis.
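As a minimal sketch of what a hypothesis looks like in code, assume a single input feature and a straight-line form. The names `hypothesis`, `theta0`, and `theta1`, and all numbers below, are made up for illustration and do not come from the dataset above.

```python
def hypothesis(theta0, theta1, x):
    """Predict y for input x with a straight line: h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


# Hypothetical parameters: a base price of 50.0 plus 0.25 per unit of area.
areas = [1000, 1500, 2000]
predictions = [hypothesis(50.0, 0.25, area) for area in areas]
print(predictions)  # [300.0, 425.0, 550.0]
```

Learning then amounts to choosing `theta0` and `theta1` so that these predictions come close to the observed prices.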

Regression Problem:

When the target variable that we are trying to predict is continuous, as in our housing price prediction example, we call the learning problem a regression problem.

Classification Problem:

When y can take only a small number of discrete values, we call it a classification problem.

Linear Regression:

Linear regression is a supervised learning algorithm used in machine learning solutions when the target (dependent) variable is continuous and real-valued.

It establishes a relationship between the dependent variable y and one or more independent variables x using the best-fit line. It works on the principle of ordinary least squares (OLS) / mean squared error (MSE).

In statistics, OLS is a method to estimate the unknown parameters of a linear regression function. Its goal is to minimize the sum of squared differences between the observed values of the dependent variable in the given dataset and those predicted by the linear regression function.
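For a single input variable, the least-squares line has a well-known closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the means. The sketch below assumes that one-variable case; the helper name `fit_ols` and the data are hypothetical.

```python
def fit_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data that lies exactly on the line y = 1 + 2x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_ols(xs, ys)
print(intercept, slope)  # 1.0 2.0
```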

We have three methods to draw the best fit line for linear regression.

  1. Batch Gradient descent
  2. Stochastic Gradient descent
  3. Normal equation
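The three methods listed above can be sketched side by side with NumPy. This is an illustrative sketch under stated assumptions, not the article's own code: the function names, learning rates, iteration counts, and data are all hypothetical, and the design matrix `X` is assumed to carry a leading column of ones for the intercept.

```python
import numpy as np


def normal_equation(X, y):
    """Solve (X^T X) theta = X^T y directly, with no iterations."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def batch_gradient_descent(X, y, lr=0.05, n_iters=5000):
    """Update theta using the gradient averaged over the whole dataset."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iters):
        gradient = X.T @ (X @ theta - y) / m
        theta -= lr * gradient
    return theta


def stochastic_gradient_descent(X, y, lr=0.02, n_epochs=2000, seed=0):
    """Update theta after each single example, visited in random order."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_epochs):
        for i in rng.permutation(m):
            error = X[i] @ theta - y[i]
            theta -= lr * error * X[i]
    return theta


# Made-up data that lies exactly on the line y = 1 + 2x
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])
X = np.column_stack([np.ones_like(x), x])  # column of ones models the intercept

print(normal_equation(X, y))
print(batch_gradient_descent(X, y))
print(stochastic_gradient_descent(X, y))
```

On this small, noise-free dataset all three should land close to an intercept of 1 and a slope of 2. The normal equation gets there in one step, while the two gradient-based methods approach it gradually and depend on the chosen learning rate.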

Let's make our dataset a little richer to understand the concept more broadly.

According to our hypothesis our equation will be:
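As a hedged illustration, a linear hypothesis over n input features is commonly written as below. The symbols θ_j for the weights and the convention x_0 = 1 are assumptions made here for illustration; the number of features in the richer dataset is not fixed by the text.

```latex
h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n
            = \sum_{j=0}^{n} \theta_j x_j, \qquad x_0 = 1
```

With a single feature this reduces to a straight line with intercept θ_0 and slope θ_1.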

Now, before jumping into the example and the cost function, let us define some notation.

Let’s take some random values of x and y to train our model.

(Embedded gist with the x and y values: https://gist.github.com/pranavbtc/1b4c1be1c8ebba96d844919afd7ac15a)

The scatter plot below shows the relationship between x and y.

To decide whether our line is the best fit or not, we will define a cost function.

The errors between the predicted values and the observed values are called residuals, and our cost function is simply the sum of the squares of the residuals.
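To make the definition concrete, here is a small sketch that computes the residuals and the sum-of-squares cost for a candidate line. It assumes a one-variable line with intercept `theta0` and slope `theta1`; these names and the data are hypothetical.

```python
def residuals(theta0, theta1, xs, ys):
    """Observed value minus predicted value for every training example."""
    return [y - (theta0 + theta1 * x) for x, y in zip(xs, ys)]


def cost(theta0, theta1, xs, ys):
    """Sum of squared residuals for the line theta0 + theta1 * x."""
    return sum(r ** 2 for r in residuals(theta0, theta1, xs, ys))


# Made-up data that lies exactly on the line y = 1 + 2x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

print(cost(1.0, 2.0, xs, ys))  # 0.0, the line passes through every point
print(cost(0.0, 2.0, xs, ys))  # 4.0, every residual is 1
```

A lower cost means the line sits closer to the data, so fitting the model means searching for the parameters with the smallest cost.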

The cost function is denoted by:
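One common way to write a sum-of-squared-residuals cost is shown below. The notation is an assumption made for illustration: h_θ is the hypothesis with weights θ, m is the number of training examples, and (x^(i), y^(i)) is the i-th example.

```latex
J(\theta) = \sum_{i=1}^{m} \left( h_\theta\big(x^{(i)}\big) - y^{(i)} \right)^2
```

Many texts scale this sum by a constant such as 1/2 or 1/(2m). The scaling changes the value of the cost but not the weights that minimize it.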

When it is fitted iteratively, for example with gradient descent, the linear regression algorithm can pass through thousands of iterations before arriving at the set of weights used to make predictions. These iterations train the model so that its output is close to the desired value whenever we feed a predictor variable into the equation.
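The iterative process can be sketched with plain batch gradient descent. This is a minimal illustration, assuming a one-variable line, a fixed learning rate, and made-up data; the function name `gradient_descent` and its defaults are hypothetical.

```python
def gradient_descent(xs, ys, lr=0.05, n_iters=2000):
    """Fit intercept and slope iteratively, recording the cost at each step."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    history = []
    for _ in range(n_iters):
        # Prediction error for every example under the current weights
        errors = [(theta0 + theta1 * x) - y for x, y in zip(xs, ys)]
        history.append(sum(e ** 2 for e in errors))
        # Gradient of the squared-error cost scaled by 1/(2m)
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        theta0 -= lr * grad0
        theta1 -= lr * grad1
    return theta0, theta1, history


# Made-up data that lies exactly on the line y = 1 + 2x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
theta0, theta1, history = gradient_descent(xs, ys)
print(round(theta0, 3), round(theta1, 3))
print(history[0], history[-1])
```

The recorded cost starts at 164.0 with both weights at zero and shrinks toward zero as the weights approach an intercept of 1 and a slope of 2. A learning rate that is too large would make the cost grow instead, which is why it usually needs tuning.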

Conclusion: Mathematics for Machine Learning

Linear regression is one of the most basic machine learning algorithms and is used to model the relationship between two variables. The factor being predicted is called the dependent variable because its value depends on the input variable.

One widely popular use case of linear regression is forecasting the sales of a company. Companies whose sales have increased or decreased steadily over the past few months can predict the future trend using this machine learning algorithm.

Today, more and more businesses are trying to reduce uncertainty by using machine learning models that make near-accurate predictions. Machine learning as a service is widely used by enterprises of all kinds and industries to forecast demand, supply, market trends, income, expenses, and even overall growth.

In the next blog, we will do hands-on training using our first method, gradient descent, with the help of the NumPy library. We will also look at the result of the scikit-learn library on the same dataset.

Enjoy learning with machines!


Published at DZone with permission of Shardul Bhatt. See the original article here.
