
Logistic Regression

Logistic regression is a statistical method used for binary classification tasks, where the target
variable has only two possible outcomes (e.g., yes/no, true/false, diabetic/not diabetic). Unlike linear
regression, which predicts continuous outcomes, logistic regression models the probability that a
given input belongs to a particular class using a logistic (sigmoid) function. This function maps any
real-valued input to a value between 0 and 1, representing the probability of the positive class (e.g.,
diabetic).

Why Binary Classification Works Well:

In healthcare data analysis, binary classification is often preferred for predicting outcomes with two
possible states, such as disease presence or absence. Logistic regression excels in this context
because it provides interpretable probabilities, allowing clinicians to assess the likelihood of a patient
having a particular condition based on their features.

Example: Predicting Diabetes with Logistic Regression

Suppose we have a dataset containing patient information, including features like blood pressure
(BP), body mass index (BMI), number of pregnancies, and the target variable indicating whether the
patient is diabetic (1) or not diabetic (0). Let's say we have eight features (BP, BMI, pregnancies,
and others), and the ninth column is the target variable indicating diabetes status.

The logistic regression model will estimate the probability of a patient being diabetic based on their
feature values. By fitting the model to the training data, it learns the relationship between the
features and the probability of diabetes, represented by a best-fit sigmoid curve and a corresponding
decision boundary in feature space.
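
As a rough sketch, the snippet below shows how such a model could be fit with scikit-learn. The data
here is randomly generated as a stand-in for a real patient dataset; only the shape (eight feature
columns, one binary target) mirrors the example above, and everything else is hypothetical.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-in data: 100 patients, 8 features (BP, BMI, pregnancies, ...),
# binary target (1 = diabetic, 0 = not diabetic).
rng = np.random.default_rng(0)
X = rng.random((100, 8))
y = rng.integers(0, 2, 100)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Interpretable probabilities of the positive (diabetic) class.
probs = model.predict_proba(X_test)[:, 1]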
In logistic regression, we aim to model the probability that a given input belongs to a particular class
(e.g., diabetic or not diabetic). Unlike linear regression, where the outcome is continuous, logistic
regression outputs probabilities bounded between 0 and 1. To achieve this, we use the sigmoid
function, also known as the logistic function.

The sigmoid function forms an S-shaped graph: as x approaches infinity, the output approaches 1,
and as x approaches negative infinity, the output approaches 0. The model sets a threshold that
decides what range of probability is mapped to which binary outcome. Suppose we have two possible
outcomes, true and false, and have set the threshold at 0.5. A probability less than 0.5 is mapped
to the outcome false, and a probability greater than or equal to 0.5 is mapped to the outcome true.

[Figure: Sigmoid function graph]


The formula of the sigmoid function is:

f(x) = 1/(1 + e^(-x))

and since the linear model gives y = mx + c, the probability of class 1 is:

σ(y) = 1/(1 + e^(-y))

σ(y) = 1/(1 + e^(-(mx + c)))

σ(y) = e^(mx + c)/(1 + e^(mx + c))

Probability of class 0:

1 - σ(y) = 1/(1 + e^(mx + c))

Now,

σ(y)/(1 - σ(y)) = e^(mx + c)

log_e(σ(y)/(1 - σ(y))) = (mx + c) log_e(e)

log_e(σ(y)/(1 - σ(y))) = mx + c

In other words, the log-odds (logit) of the positive class is a linear function of the input.
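
A quick numerical check of these identities in Python (assuming NumPy is available): the sigmoid
saturates toward 0 and 1 at the extremes, and taking the log-odds of its output recovers the linear
term exactly.

import numpy as np

def sigmoid(z):
    # Maps any real value into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(np.array([-10.0, 0.0, 10.0])))  # ~[0.0000454, 0.5, 0.9999546]

# log(p / (1 - p)) undoes the sigmoid and returns the linear term.
z = 1.7
p = sigmoid(z)
print(np.log(p / (1 - p)))  # 1.7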

Using the Logistic Regression Equation to Make Predictions:

Once we have trained the logistic regression model and obtained the optimal parameters θ, we can
use the model to make predictions on new data.

 We compute the linear combination y of the features and model parameters.

 We then pass y through the sigmoid function to obtain the predicted probability.

 If the predicted probability is less than 0.5, we predict the sample belongs to the negative
class (0); otherwise, we predict it belongs to the positive class (1), as sketched below.
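
A minimal sketch of these three steps (the parameter vector theta and the feature rows below are
illustrative values, not learned from real data):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(X, theta, threshold=0.5):
    z = X @ theta                            # linear combination of features and parameters
    probs = sigmoid(z)                       # predicted probability of the positive class
    return (probs >= threshold).astype(int)  # apply the decision threshold

theta = np.array([0.8, -1.2])                # hypothetical learned parameters
X_new = np.array([[1.5, 0.3],
                  [0.2, 2.0]])
print(predict(X_new, theta))  # [1 0]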

Visualizing the Best-Fit Line:

To visualize the best-fit line in logistic regression, we can plot the sigmoid function against the input
feature(s). This will show how the predicted probabilities change as the feature values vary. The
decision boundary, where the predicted probability is 0.5, separates the two classes.
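
One possible way to draw this picture with matplotlib, assuming a hypothetical one-feature model
with slope m = 2 and intercept c = -1 (the decision boundary then sits where mx + c = 0):

import numpy as np
import matplotlib.pyplot as plt

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

m, c = 2.0, -1.0                       # assumed model parameters
x = np.linspace(-5, 5, 200)

plt.plot(x, sigmoid(m * x + c), label="P(class 1)")
plt.axhline(0.5, linestyle="--", label="threshold (0.5)")
plt.axvline(-c / m, linestyle=":", label="decision boundary")
plt.xlabel("feature value")
plt.ylabel("predicted probability")
plt.legend()
plt.show()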

Conclusion:

In summary, logistic regression finds the best-fit line by maximizing the likelihood of observing the
target outcomes given the feature values and model parameters. Through the sigmoid function,
logistic regression outputs probabilities bounded between 0 and 1, allowing for binary classification.
By iteratively optimizing the model parameters, logistic regression provides an effective framework
for predicting binary outcomes and drawing decision boundaries between classes.
Cost Function of Logistic Regression

A cost function is a mathematical function that measures the difference between the actual target
values (ground truth) and the values predicted by the model. It is also referred to as a loss
function or objective function. The objective of a machine learning algorithm is usually to minimize
the output of the cost function.

Log loss and Cost function for Logistic Regression

One of the popular metrics for evaluating classification models that output probabilities is log loss:

F = -(1/M) Σ(i=1 to M) [y_i log(h_θ(x_i)) + (1 - y_i) log(1 - h_θ(x_i))]

For comparison, the squared-error cost function used in linear regression can be written as:

F(θ) = (1/n) Σ(i=1 to n) (1/2)[h_θ(x_i) - y_i]^2

For logistic regression,

h_θ(x) = g(θ^T x)

where g is the sigmoid function. Substituting this into the squared-error cost above leads to a
non-convex function, which makes it unsuitable as a cost function. The cost function for logistic
regression is instead the log loss, summarized below.

cost(h_θ(x), y) = -log(h_θ(x)),     when y = 1

and

cost(h_θ(x), y) = -log(1 - h_θ(x)), when y = 0

where,

 y is the actual value of the target variable,

 h_θ(x) is the predicted probability that y = 1 given input x, parameterized by θ,

 y_i is the actual label for the i-th training example.

This cost function penalizes the model with a higher loss when its prediction diverges from the actual
label. Specifically, it imposes a large penalty when the model confidently predicts the wrong class
(i.e., high probability for the incorrect class).
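
A minimal NumPy sketch of this cost function makes the penalty structure visible; the probability
values below are made up for illustration, and the small eps guards against log(0).

import numpy as np

def log_loss(y_true, y_pred, eps=1e-15):
    # Average log loss over all examples.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1, 0, 1, 1])
confident_right = np.array([0.95, 0.05, 0.90, 0.85])
confident_wrong = np.array([0.05, 0.95, 0.10, 0.15])

print(log_loss(y_true, confident_right))  # small loss (~0.09)
print(log_loss(y_true, confident_wrong))  # large loss (~2.5)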
Why is Mean Squared Error not suitable for Logistic Regression?

Consider the Mean Squared Error (MSE) as a candidate cost function:

MSE = (1/2m) Σ(i=1 to m) (σ(x_i) - y_i)^2

In logistic regression, if we substitute the sigmoid function into the above MSE equation, we get:

MSE = (1/2m) Σ(i=1 to m) (1/(1 + e^(-θ^T x_i)) - y_i)^2

The term 1/(1 + e^(-z)) is a nonlinear transformation, and evaluating it within the Mean Squared
Error formula results in a non-convex cost function. A non-convex function has multiple local minima,
which can make it difficult to optimize using traditional gradient descent algorithms, as described
below.
Imagine you have a function that looks like a series of hills and valleys, with multiple peaks and
troughs scattered throughout. This type of function is called non-convex because it doesn't have a
single, well-defined minimum point; instead, it has multiple local minima (valleys) and potentially
even some local maxima (peaks).

When you're trying to optimize such a function, the goal is to find the lowest point, which
corresponds to the global minimum. However, because of the presence of multiple local minima,
traditional gradient descent algorithms can encounter difficulties.

Why is it challenging?

1. Getting Stuck in Local Minima: Gradient descent algorithms, like the one used in logistic
regression, work by iteratively moving in the direction of the steepest descent of the function.
However, if they start from an initial point that is not the global minimum and there are multiple
local minima, they might get trapped in one of the local minima instead of reaching the global
minimum. Once stuck in a local minimum, the algorithm cannot escape it to find the true minimum.

2. Plateaus and Saddle Points: In addition to local minima, non-convex functions may have plateaus
(flat regions) and saddle points (points where the gradient is zero but not a minimum or maximum).
These features can slow down or stall the convergence of gradient descent algorithms, making
optimization even more challenging.
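
To make this concrete, the sketch below sweeps a single parameter over a toy one-feature dataset
(the values are made up) and plots both cost curves. The squared-error curve flattens into plateaus
at the extremes and is non-convex, while the log-loss curve remains bowl-shaped.

import numpy as np
import matplotlib.pyplot as plt

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy 1-D data with mixed labels (hypothetical values).
x = np.array([-4.0, -2.0, 1.0, 3.0, 5.0])
y = np.array([1, 0, 1, 0, 1])

thetas = np.linspace(-10, 10, 400)
eps = 1e-12
mse, logloss = [], []
for t in thetas:
    p = np.clip(sigmoid(t * x), eps, 1 - eps)
    mse.append(np.mean((p - y) ** 2) / 2)
    logloss.append(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

plt.plot(thetas, mse, label="squared error (non-convex)")
plt.plot(thetas, logloss, label="log loss (convex)")
plt.xlabel("theta")
plt.ylabel("cost")
plt.legend()
plt.show()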
Performance Metrics

In the realm of machine learning, the ability to accurately assess the performance of a model is
paramount. Model evaluation metrics serve as the compass guiding practitioners to understand how
well their models are performing on a given task. One such fundamental tool in model evaluation is
the confusion matrix, which provides insights into the performance of a classification model. Let's
delve into the intricacies of the confusion matrix and explore the key metrics derived from it:
accuracy, precision, recall, specificity, and F1-score.

Understanding the Confusion Matrix:

A confusion matrix is a tabular representation of the performance of a classification model that
categorizes predictions into four categories:

 True Positives (TP): Instances where the model correctly predicts the positive class.

 True Negatives (TN): Instances where the model correctly predicts the negative class.

 False Positives (FP): Instances where the model incorrectly predicts the positive class (Type I
error).

 False Negatives (FN): Instances where the model incorrectly predicts the negative class (Type
II error).
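
As a quick sketch, these four counts can be read directly off scikit-learn's confusion_matrix; the
labels and predictions below are hypothetical.

import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])   # actual classes
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])   # model predictions

# For binary labels {0, 1}, ravel() returns the counts in this order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")  # TP=3, TN=3, FP=1, FN=1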

EXAMPLE
A machine learning model is trained to predict diabetes in patients. The test dataset consists of 100
people.

True Positive (TP) — model correctly predicts the positive class (prediction and actual both are
positive). In the above example, 10 people who have diabetes are predicted positively by the model.

True Negative (TN) — model correctly predicts the negative class (prediction and actual both are
negative). In the above example, 60 people who don’t have diabetes are predicted negatively by the
model.

False Positive (FP) — the model incorrectly predicts the positive class (predicted positive,
actually negative). In the above example, 22 people are predicted as having diabetes, although they
don't have diabetes. FP is also called a TYPE I error.

False Negative (FN) — the model incorrectly predicts the negative class (predicted negative,
actually positive). In the above example, 8 people who have diabetes are predicted as negative. FN
is also called a TYPE II error.

With the help of these four values, we can calculate the True Positive Rate (TPR), False Positive
Rate (FPR), True Negative Rate (TNR), and False Negative Rate (FNR).

Even if the data is imbalanced, these rates tell us whether our model is working well or not. For
that, the values of TPR and TNR should be high, and FPR and FNR should be as low as possible.

With the help of TP, TN, FN, and FP, other performance metrics can be calculated.

Accuracy

Accuracy measures the overall correctness of the model's predictions and is calculated as the ratio of
correct predictions (TP + TN) to the total number of predictions (TP + TN + FP + FN).

Accuracy = (TP + TN)/(TP + TN + FP + FN)


Specificity

Specificity measures the ability of the model to correctly identify negative instances out of all actual
negative instances. It is calculated as the ratio of true negatives (TN) to the total number of actual
negatives (TN + FP).

Specificity= TN / (TN + FP)

Precision:

Precision quantifies the ability of the model to correctly identify positive instances out of all
instances predicted as positive. It is calculated as the ratio of true positives (TP) to the total number
of predicted positives (TP + FP).

Precision = TP/ (TP + FP)

Recall (Sensitivity or True Positive Rate):

Recall measures the ability of the model to correctly identify positive instances out of all actual
positive instances. It is calculated as the ratio of true positives (TP) to the total number of actual
positives (TP + FN).

Recall = TP/(TP+FN)

Comparing Recall and Precision in Diabetic Prediction:

1. High Recall, Low Precision: In this scenario, the model captures a high proportion of diabetic
patients (high recall) but may also incorrectly label many non-diabetic individuals as diabetic (low
precision). While this ensures that diabetic patients are not missed, it may lead to unnecessary tests
or treatments for non-diabetic individuals.

2. High Precision, Low Recall: Conversely, in this scenario, the model correctly identifies diabetic
patients with high precision but may miss a significant number of diabetic patients (low recall). While
this minimizes unnecessary interventions for non-diabetic individuals, it increases the risk of
undiagnosed diabetes and its associated complications.

F1-Score:

The F1-score is the harmonic mean of precision and recall and provides a balanced measure of a
model's performance. It is calculated as:

F1-Score = 2 × (Precision × Recall)/(Precision + Recall)
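
Plugging the counts from the diabetes example above (TP = 10, TN = 60, FP = 22, FN = 8) into these
formulas gives a quick sanity check:

tp, tn, fp, fn = 10, 60, 22, 8

accuracy = (tp + tn) / (tp + tn + fp + fn)            # 0.70
specificity = tn / (tn + fp)                          # ~0.73
precision = tp / (tp + fp)                            # ~0.31
recall = tp / (tp + fn)                               # ~0.56
f1 = 2 * (precision * recall) / (precision + recall)  # 0.40

print(accuracy, specificity, precision, recall, f1)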

Interpreting the Confusion Matrix Metrics:

 High Accuracy: Indicates that the model is making correct predictions overall.

 High Precision: Indicates that when the model predicts positive, it is very likely to be correct.

 High Recall: Indicates that the model is able to identify most of the positive instances.

 High Specificity: Indicates that the model is able to identify most of the negative instances.

 High F1-Score: Indicates a good balance between precision and recall.
