Logistic Regression in Machine Learning-
Logistic regression is a supervised machine learning algorithm mainly used for
classification tasks where the goal is to predict the probability that an instance
belongs to a given class or not. It is a kind of statistical algorithm, which analyze
the relationship between a set of independent variables and the dependent
binary variables. It is a powerful tool for decision-making. For example email spam
or not.
Logistic Regression-
Logistic regression is a supervised machine learning algorithm mainly used for
binary classification where we use a logistic function, also known as a sigmoid
function that takes input as independent variables and produces a probability
value between 0 and 1. For example, we have two classes Class 0 and Class 1 if
the value of the logistic function for an input is greater than 0.5 (threshold
value) then it belongs to Class 1 it belongs to Class 0. It’s referred to as
regression because it is the extension of linear regression but is mainly used for
classification problems. The difference between linear regression and logistic
regression is that linear regression output is the continuous value that can be
anything while logistic regression predicts the probability that an instance
belongs to a given class or not.
Understanding Logistic Regression-
Logistic regression predicts the output of a categorical dependent variable.
Therefore the outcome must be a categorical or discrete value.
It can be either Yes or No, 0 or 1, true or False, etc. but instead of giving the
exact value as 0 and 1, it gives the probabilistic values which lie between 0
and 1.
Logistic Regression is much similar to the Linear Regression except that
how they are used. Linear Regression is used for solving Regression
problems, whereas Logistic regression is used for solving the classification
problems.
In Logistic regression, instead of fitting a regression line, we fit an “S”
shaped logistic function, which predicts two maximum values (0 or 1).
The curve from the logistic function indicates the likelihood of something
such as whether the cells are cancerous or not, a mouse is obese or not
based on its weight, etc.
Logistic Regression is a significant machine learning algorithm because it
has the ability to provide probabilities and classify new data using
continuous and discrete datasets.
Logistic Regression can be used to classify the observations using different
types of data and can easily determine the most effective variables used for
the classification.
Logistic Function (Sigmoid Function):-
The sigmoid function is a mathematical function used to map the predicted
values to probabilities.
It maps any real value into another value within a range of 0 and 1. The
value of the logistic regression must be between 0 and 1, which cannot go
beyond this limit, so it forms a curve like the “S” form.
The S-form curve is called the Sigmoid function or the logistic function.
In logistic regression, we use the concept of the threshold value, which
defines the probability of either 0 or 1. Such as values above the threshold
value tends to 1, and a value below the threshold values tends to 0.
Differences b/w Linear and Logistic Regression
Sr.No Linear Regresssion Logistic Regression
Linear regression is used Logistic regression is used
to predict the continuous to predict the categorical
dependent variable using dependent variable using
a given set of a given set of independent
1 independent variables. variables.
Linear regression is used
It is used for solving
for solving Regression
classification problems.
2 problem.
3 In this we predict the In this we predict values of
value of continuous categorical varibles
Sr.No Linear Regresssion Logistic Regression
variables
4 In this we find best fit line. In this we find S-Curve .
Least square estimation Maximum likelihood
method is used for estimation method is used
5 estimation of accuracy. for Estimation of accuracy.
The output must be Output is must be
continuous value,such as categorical value such as
6 price,age,etc. 0 or 1, Yes or no, etc.
It required linear
relationship between It not required linear
dependent and relationship.
7 independent variables.
There may be collinearity There should not be
between the independent collinearity between
8 variables. independent varible.
Sigmoid Function
Now we use the sigmoid function where the input will be z and we find the
probability between 0 and 1. i.e predicted y.
Sigmoid function
As shown above, the figure sigmoid function converts the continuous variable
data into the probability i.e. between 0 and 1.