# Simplifying Logistic Regression

# Simplifying Logistic Regression

Learn how to use linear regression for use cases like predictive analytics, classification, probability, and more.

Join the DZone community and get the full member experience.

Join For Free**Bias comes in a variety of forms, all of them potentially damaging to the efficacy of your ML algorithm. Read how Alegion's Chief Data Scientist discusses the source of most headlines about AI failures here.**

Logistic regression is one of the most popular machine learning algorithms for binary classification. This is because it is a simple algorithm that performs very well on a wide range of problems. It is used when you know that the data is linearly separable/classifiable and the outcome is *binary* or *dichotomous* but it can be extended when the dependent has more than two categories. Logistic regression is used when the outcome is a discrete variable, such as trying to figure out who will win the election, whether a student will pass an exam or not, or whether an email is a spam. This is commonly called as a classification problem because we are trying to determine which class the dataset best fits.

## Linear Regression vs. Logistic Regression

In linear regression, the outcome is continuous. It can have any one of an infinite number of possible values while In logistic regression, the outcome has only a limited number of possible values, generally 0 or 1.

Linear regression needs to establish the linear relationship between dependent and independent variable whereas it is not necessary for logistic regression.

In linear regression, the independent variables can be correlated with each other. On the contrary, in logistic regression, the variable must not be correlated with each other.

## Terminology Used for Logistic Regression

**Probability**: Probability is the measure of the likelihood that an event will occur.Probability is quantified as a number between 0 and 1.**Odds**: Odds is the ratio of the probability of occurring of an event and probability of not occurring such as*odds = P(Occurring)/P(Not occurring)*where*P*=*Probability*.**Odds ratio**: Odds ratio for a variable in logistic regression that represents how the odds change with 1 unit increase in that variable holding all the other variables as constant. It can be defined as the ratio of two odds.**Logit**:*ln(odds) = ln(p/1-p) = logit(p)*.

In logistic regression, we estimate an unknown probability for any given linear combination of independent variables.

Regression coefficients for logistic regression are calculated using maximum likelihood estimation (MLE).

So, in the above equation, *x* is the independent variable and the rest of them are regression coefficients. This equation is called estimated regression.

## Applications of Logistic Regression

Logistic regression can be applied anywhere that the outcome is binary, such as 0 or 1. Some of the applications of logistic regression are:

- Predicting whether a student will pass or not on the basis of hours of study or any relevant information.
- Predicting the approval of loan on the basis of credit score.
- Predicting the failure of a firm.

*This article was first published on the Knoldus blog.*

**Your machine learning project needs enormous amounts of training data to get to a production-ready confidence level. Get a checklist approach to assembling the combination of technology, workforce and project management skills you’ll need to prepare your own training data.**

Published at DZone with permission of Himanshu Rajput , DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

## {{ parent.title || parent.header.title}}

{{ parent.tldr }}

## {{ parent.linkDescription }}

{{ parent.urlSource.name }}