Logistic Regression Tutorial

Logistic regression is an algorithm for binary classification that learns coefficients to predict the probability of an output being 1 or 0 based on the input features. It works by iteratively updating the coefficients to minimize error using stochastic gradient descent. On a sample dataset, a logistic regression model achieved 100% accuracy after 10 epochs of training.

Uploaded by

LUV ARORA

Logistic Regression

• Logistic regression is one of the most popular machine learning algorithms for binary classification. This is because it is a simple algorithm that performs very well on a wide range of problems.
This tutorial covers:
• How to calculate the logistic function.
• How to learn the coefficients for a logistic regression model using stochastic gradient descent.
• How to make predictions using a logistic regression model.
Raw Dataset
This dataset has two input variables (X1 and X2) and one output variable (Y).
The input variables are real-valued random numbers drawn from a Gaussian
distribution. The output variable has two values, making the problem a
binary classification problem.

X1           X2             Y
2.7810836    2.550537003    0
1.4654893    2.362125076    0
3.3965616    4.400293529    0
1.3880701    1.850220317    0
3.0640723    3.005305973    0
7.6275312    2.759262235    1
5.3324412    2.088626775    1
6.9225967    1.77106367     1
8.6754186    -0.242068654   1
7.6737564    3.508563011    1
• Below is a plot of the dataset. You can see that
it is completely contrived and that we can
easily draw a line to separate the classes.
• This is exactly what we are going to do with
the logistic regression model.
[Figure: plot of the dataset]
Logistic Function

• The logistic function is defined as:
• transformed = 1 / (1 + e^-x)
• Where e is the numerical constant Euler’s number and x is an input we plug into the function.
• Let’s plug in a series of numbers from -5 to +5 and see how the logistic function transforms them:

X     Transformed
-5    0.006692850924
-4    0.01798620996
-3    0.04742587318
-2    0.119202922
-1    0.2689414214
0     0.5
1     0.7310585786
2     0.880797078
3     0.9525741268
4     0.98201379
5     0.9933071491

• All of the inputs have been transformed into the range [0, 1].
• The smallest negative numbers resulted in values close to zero.
• The larger positive numbers resulted in values close to one.
• 0 transformed to 0.5, the midpoint of the new range.
• As long as our mean value is zero, we can plug positive and negative values into the function and always get a consistent transform into the new range.
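
The transform is easy to check in a few lines of Python. This is only a minimal sketch; the function name logistic is chosen here for illustration and does not come from the slides.

```python
import math

def logistic(x):
    """Map any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Transform the integers from -5 to +5, as in the table above.
for x in range(-5, 6):
    print(x, logistic(x))
```

Each printed value should agree with the Transformed column to the digits shown.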
Logistic Regression Model

• The logistic regression model takes real-valued inputs and makes a prediction as to the probability of the input belonging to class 1.
• If the probability is >= 0.5 we can take the output as a prediction for class 1, otherwise the prediction is for the other class (class 0).
• For this dataset, the logistic regression has three coefficients just like linear regression, for example:
• output = b0 + b1*x1 + b2*x2
• The job of the learning algorithm will be to discover the best values for the coefficients (b0, b1 and b2) based on the training data.
• Unlike linear regression, the output is transformed into a probability using the logistic function:
• p(class=1) = 1 / (1 + e^(-output))
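
The two equations above can be combined into one small function. The sketch below assumes the coefficients are held in a list with the intercept b0 first; the name predict and that layout are illustrative choices, not part of the original slides.

```python
import math

def predict(row, coefficients):
    """Return the model's probability output for one row of inputs.

    coefficients[0] is the intercept b0; the rest pair up with the inputs.
    """
    output = coefficients[0]
    for b, x in zip(coefficients[1:], row):
        output += b * x
    # Squash the linear output into (0, 1) with the logistic function.
    return 1.0 / (1.0 + math.exp(-output))

# With every coefficient at 0.0 the linear output is 0, so the result is 0.5.
print(predict([2.7810836, 2.550537003], [0.0, 0.0, 0.0]))
```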
Logistic Regression by Stochastic Gradient Descent

• We can estimate the values of the coefficients using stochastic gradient descent.
• It works by using the model to calculate a prediction for each instance in the training set and calculating the error for each prediction.
• Given each training instance:
1. Calculate a prediction using the current values of the coefficients.
2. Calculate new coefficient values based on the error in the prediction.
• The process is repeated until the model is accurate enough (e.g. error drops to some desirable level) or for a fixed number of iterations.
• We continue to update the model on training instances and correct errors until the model is accurate enough or cannot be made any more accurate.
• We may randomize the order of the training instances shown to the model to mix up the corrections made.
Calculate Prediction

• Let’s start off by assigning 0.0 to each coefficient and calculating the prediction for the first training instance, which belongs to class 0.
• b0 = 0.0
• b1 = 0.0
• b2 = 0.0
• The first training instance is: x1=2.7810836, x2=2.550537003, Y=0
• Using the above equation we can plug in all of these numbers and calculate a prediction:
• prediction = 1 / (1 + e^(-(b0 + b1*x1 + b2*x2)))
• prediction = 1 / (1 + e^(-(0.0 + 0.0*2.7810836 + 0.0*2.550537003)))
• prediction = 0.5
Calculate New Coefficients

• We can calculate the new coefficient values using a simple update equation:
• b = b + alpha * (y - prediction) * prediction * (1 - prediction) * x
• Here b is the coefficient being updated, prediction is the output of the model, and x is the input value for that coefficient. The intercept b0 has no input, so its x is taken to be 1.0.
• alpha is the learning rate and controls how much the coefficients (and therefore the model) change or learn each time they are updated.
• Good values might be in the range 0.1 to 0.3.
• alpha = 0.3
• Let’s update the coefficients using the prediction (0.5) and coefficient values (0.0) from the previous section.
• b0 = b0 + 0.3 * (0 - 0.5) * 0.5 * (1 - 0.5) * 1.0
• b1 = b1 + 0.3 * (0 - 0.5) * 0.5 * (1 - 0.5) * 2.7810836
• b2 = b2 + 0.3 * (0 - 0.5) * 0.5 * (1 - 0.5) * 2.550537003

b0 = -0.0375
b1 = -0.104290635
b2 = -0.09564513761
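
The single update worked through above can be reproduced with a short sketch. The helper names predict and update, and the list layout with b0 first, are assumptions made for illustration; the arithmetic follows the update equation exactly as it is written in this section.

```python
import math

def predict(row, b):
    """Logistic of b0 + b1*x1 + b2*x2 + ..."""
    output = b[0] + sum(bi * xi for bi, xi in zip(b[1:], row))
    return 1.0 / (1.0 + math.exp(-output))

def update(row, y, b, alpha):
    """Apply one stochastic gradient descent update and return new coefficients."""
    p = predict(row, b)
    step = alpha * (y - p) * p * (1.0 - p)
    inputs = [1.0] + list(row)  # the intercept b0 uses a constant input of 1.0
    return [bi + step * xi for bi, xi in zip(b, inputs)]

# First training instance, all coefficients starting at 0.0, alpha = 0.3.
b = update([2.7810836, 2.550537003], 0, [0.0, 0.0, 0.0], alpha=0.3)
print(b)
```

The three printed values should match b0, b1 and b2 above to the digits shown.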
Repeat the Process

• We can repeat this process and update the model for each training instance in the dataset.
• A single iteration through the training dataset is called an epoch. It is common to repeat the stochastic gradient descent procedure for a fixed number of epochs.
• At the end of each epoch you can calculate error values for the model. Because this is a classification problem, it would be nice to get an idea of how accurate the model is at each iteration.
• The graph below shows a plot of the accuracy of the model over 10 epochs.
[Figure: accuracy of the model over 10 epochs]
• The model very quickly achieves 100% accuracy on the training dataset.
• The coefficients calculated after 10 epochs of stochastic gradient descent are:
• b0 = -0.4066054641
• b1 = 0.8525733164
• b2 = -1.104746259
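
The whole training procedure might be sketched as below. Several details are assumptions rather than facts from the slides: the instances are visited in dataset order with no shuffling, accuracy is measured at the end of each epoch using the updated coefficients, and the names DATASET, predict and train are hypothetical.

```python
import math

# Training data from the Raw Dataset section: ([x1, x2], y).
DATASET = [
    ([2.7810836, 2.550537003], 0),
    ([1.4654893, 2.362125076], 0),
    ([3.3965616, 4.400293529], 0),
    ([1.3880701, 1.850220317], 0),
    ([3.0640723, 3.005305973], 0),
    ([7.6275312, 2.759262235], 1),
    ([5.3324412, 2.088626775], 1),
    ([6.9225967, 1.77106367], 1),
    ([8.6754186, -0.242068654], 1),
    ([7.6737564, 3.508563011], 1),
]

def predict(row, b):
    """Logistic of b0 + b1*x1 + b2*x2."""
    output = b[0] + sum(bi * xi for bi, xi in zip(b[1:], row))
    return 1.0 / (1.0 + math.exp(-output))

def train(dataset, alpha, epochs):
    """Run stochastic gradient descent; return coefficients and accuracy per epoch."""
    b = [0.0] * (len(dataset[0][0]) + 1)
    history = []
    for _ in range(epochs):
        for row, y in dataset:
            p = predict(row, b)
            step = alpha * (y - p) * p * (1.0 - p)
            b[0] += step  # intercept: input is 1.0
            for i, x in enumerate(row):
                b[i + 1] += step * x
        # Accuracy on the training data after this epoch's updates.
        correct = sum((0 if predict(row, b) < 0.5 else 1) == y for row, y in dataset)
        history.append(100.0 * correct / len(dataset))
    return b, history

coefficients, accuracy_per_epoch = train(DATASET, alpha=0.3, epochs=10)
print(coefficients)
print(accuracy_per_epoch)
```

If the assumptions above match how the original numbers were produced, the printed coefficients should be close to the values listed in this section.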
Make Predictions

• Using the coefficients above learned after 10 epochs, we can calculate output values for each training instance:

Instance   Output
1          0.2987569857
2          0.145951056
3          0.08533326531
4          0.2197373144
5          0.2470590002
6          0.9547021348
7          0.8620341908
8          0.9717729051
9          0.9992954521
10         0.905489323

• We can convert these into crisp class values using:
prediction = IF (output < 0.5) Then 0 Else 1

Instance   Prediction
1          0
2          0
3          0
4          0
5          0
6          1
7          1
8          1
9          1
10         1
• Finally, we can calculate the accuracy for the model on the training dataset:
• accuracy = (correct predictions / num predictions made) * 100
• accuracy = (10 / 10) * 100
• accuracy = 100%
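
The prediction and accuracy steps can be checked against the numbers in this section with a short sketch. It plugs in the coefficients reported after 10 epochs; the names ROWS, LABELS, B and predict are illustrative assumptions.

```python
import math

# Inputs and labels from the Raw Dataset section.
ROWS = [
    [2.7810836, 2.550537003], [1.4654893, 2.362125076],
    [3.3965616, 4.400293529], [1.3880701, 1.850220317],
    [3.0640723, 3.005305973], [7.6275312, 2.759262235],
    [5.3324412, 2.088626775], [6.9225967, 1.77106367],
    [8.6754186, -0.242068654], [7.6737564, 3.508563011],
]
LABELS = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Coefficients reported after 10 epochs: b0, b1, b2.
B = [-0.4066054641, 0.8525733164, -1.104746259]

def predict(row, b):
    """Logistic of b0 + b1*x1 + b2*x2."""
    output = b[0] + sum(bi * xi for bi, xi in zip(b[1:], row))
    return 1.0 / (1.0 + math.exp(-output))

outputs = [predict(row, B) for row in ROWS]
crisp = [0 if out < 0.5 else 1 for out in outputs]  # threshold at 0.5
correct = sum(1 for c, y in zip(crisp, LABELS) if c == y)
accuracy = correct / len(LABELS) * 100

print(crisp)     # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(accuracy)  # 100.0
```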
