LINEAR REGRESSION
Aakash S 310623104002
Aditya Vaibhav J 310623104006
Ajay S 310623104009
Akash A 310623104010
Bharath D 310623104029
Chezhian V 310623104032
Subject: Foundations of Data Science
Class: II CSE-A
Date: 14/08/2024
Table of Content
01 Introduction
02 Mathematical Foundation of Linear regression
03 Data preprocessing and regression metrics
04 Model training and evaluation
05 Applications and Limitation
Introduction
Linear regression is a statistical method used to model the relationship between a
dependent variable and one or more independent variables. It involves fitting a
straight line through a set of data points to predict the value of one variable based on
the other. Linear regression is also a machine-learning algorithm: specifically, a
supervised learning algorithm that learns from labelled datasets.
Mathematical Foundation
Univariate Linear Regression
Univariate linear regression involves a single independent variable. The relationship between this
single independent variable and the dependent variable is modeled with a linear equation. It’s the
simplest form of linear regression.
Equation: y = β0 + β1x + ε, where β0 is the intercept, β1 is the slope, and ε is the error term.
Multivariate Linear Regression
Multivariate linear regression (more precisely, multiple linear regression) extends univariate
linear regression to multiple independent variables. It models the relationship between two or
more independent variables and the dependent variable.
Equation: y = β0 + β1x1 + β2x2 + ... + βnxn + ε
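As a rough illustration of fitting a model with several independent variables, the sketch below solves the least-squares normal equations in plain Python. The function names (solve_linear_system, fit_multiple_regression) and the toy data are assumptions made for this example, not part of the original slides; in practice a library routine would normally be used instead.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pivot on the largest entry in this column for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: features may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(rows, y):
    """Return [b0, b1, ..., bp] minimising the sum of squared errors."""
    # Prepend a constant 1 to every row so that b0 acts as the intercept.
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical toy data generated exactly from y = 1 + 2*x1 + 3*x2 (no noise).
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 3)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
beta = fit_multiple_regression(rows, y)
print([round(b, 6) for b in beta])
```

Because the toy data contain no noise, the recovered coefficients should match the generating values up to floating-point error.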
DATA PREPROCESSING:
1. Data collection
2. Data integration
3. Data cleaning
i. Handling missing data
ii. Handling outliers
iii. Residual analysis
4. Data transformation
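The cleaning and transformation steps above can be sketched roughly as follows. The specific choices here (median imputation, the 1.5 × IQR outlier rule, and z-score standardisation) are assumptions made for illustration; the slides do not say which techniques were used, and the data values are made up.

```python
import statistics


def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences; points outside them are treated as outliers."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical feature column with two missing entries and one extreme value.
raw = [12.0, 15.0, None, 14.0, 13.0, 95.0, 16.0, None, 11.0]
filled = impute_missing(raw)                          # handle missing data
low, high = iqr_bounds(filled)
cleaned = [v for v in filled if low <= v <= high]     # handle outliers
scaled = standardize(cleaned)                         # data transformation
print(len(raw), len(cleaned))
```

The median is used for imputation here because, unlike the mean, it is not pulled towards the extreme value before that value has been filtered out.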
Model Training and Evaluation
➢ Splitting data
○ Training set (75%)
○ Testing set (25%)
➢ Cross-validation
■ k-Fold Cross-Validation
■ Leave-One-Out Cross-Validation (LOOCV)
➢ Understanding Model Coefficients
Eg: y = β0 + β1x1 + β2x2
➢ Analyzing Performance
○ Overfitting
○ Underfitting
➢ Evaluation metrics
■ Mean absolute error
■ Mean squared error
■ R-squared
■ Root Mean squared error
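The splitting and cross-validation ideas listed above can be sketched in plain Python as below. The helper names (train_test_split, k_fold_indices), the fixed seed, and the integer stand-in data are assumptions for this example only; the 75%/25% ratio follows the slide.

```python
import random


def train_test_split(data, test_fraction=0.25, seed=0):
    """Shuffle a copy of the data and split it into (train, test)."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs; every index is validated exactly once."""
    indices = list(range(n))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


data = list(range(20))                      # stand-in for 20 labelled samples
train, test = train_test_split(data)        # 75% training, 25% testing
folds = list(k_fold_indices(len(train), k=5))
loocv = list(k_fold_indices(len(train), k=len(train)))  # LOOCV: k equals n
print(len(train), len(test), len(folds), len(loocv))
```

Leave-one-out cross-validation falls out as the special case where the number of folds equals the number of training samples.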
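REGRESSION METRICS:
1. Mean absolute error (MAE): MAE = (1/n) Σ |yi − ŷi|
2. Mean squared error (MSE): MSE = (1/n) Σ (yi − ŷi)²
3. R-squared: R² = 1 − Σ (yi − ŷi)² / Σ (yi − ȳ)²
4. Root mean squared error (RMSE): RMSE = √MSE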
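A minimal sketch of how these four metrics can be computed, using their standard definitions. The function name regression_metrics and the sample values are made up for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE and R-squared for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mse = sum(e * e for e in errors) / n           # mean squared error
    rmse = math.sqrt(mse)                          # root mean squared error
    mean_y = sum(y_true) / n
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot                       # R-squared
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical observed values and model predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 10.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

MAE and RMSE are in the same units as the dependent variable, which makes them easier to interpret than MSE; R-squared is unitless.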
Implementing Linear Regression in Python
1. Importing the libraries and dataset
2. Data Preprocessing
Now that we have imported the libraries, we will perform data preprocessing.
3. Finding the coefficient
Now, the task is to find the line that best fits the scatter plot of the data so
that we can predict the response for any new feature value.
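The code from the original slides is not available in this text, so the following is only a sketch of how the coefficients of a simple least-squares line might be estimated. The names estimate_coef and predict, and the sample data, are assumptions for this example; plotting is left out to keep the sketch dependency-free.

```python
def estimate_coef(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-deviations and squared x-deviations about the means.
    ss_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_xx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = ss_xy / ss_xx            # slope
    b0 = mean_y - b1 * mean_x     # intercept
    return b0, b1


def predict(x_new, b):
    """Predict the response for a new feature value from b = (b0, b1)."""
    return b[0] + b[1] * x_new


# Hypothetical toy dataset.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1, 3, 2, 5, 7, 8, 8, 9, 10, 12]
b = estimate_coef(x, y)
print("Estimated coefficients: b0 = {:.4f}, b1 = {:.4f}".format(*b))
print("Prediction at x = 10:", predict(10, b))
```

The closed-form estimates follow from minimising the sum of squared vertical distances between the points and the line.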
Estimating Coefficients Function
Plotting Regression Line Function
Main Function
plot_regression_line(x, y, b)
OUTPUT:
Assumptions of Simple Linear Regression
1. Linear relationship: There exists a linear relationship
between the independent variable, x, and the dependent
variable, y.
2. Independence: The residuals are independent. There is no
correlation between consecutive residuals in time series
data.
3. Homoscedasticity: The residuals have constant variance
at every level of x.
4. Normality: The residuals of the model are normally
distributed.
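Some of these assumptions can be checked numerically from the residuals. The sketch below fits a line and computes the Durbin-Watson statistic as one possible check of the independence assumption; the helper names and the data (a line plus an alternating disturbance) are assumptions chosen for illustration, and the other assumptions would need their own checks.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_xx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    return mean_y - b1 * mean_x, b1


def residuals(x, y, b0, b1):
    """Observed value minus fitted value for each data point."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


def durbin_watson(res):
    """Values near 2 suggest no first-order autocorrelation in the residuals;
    values towards 0 or 4 suggest positive or negative autocorrelation."""
    num = sum((res[i] - res[i - 1]) ** 2 for i in range(1, len(res)))
    den = sum(e * e for e in res)
    return num / den


# Hypothetical data: y = 1 + 2x with a disturbance that alternates in sign,
# so consecutive residuals are negatively correlated by construction.
x = list(range(8))
y = [1 + 2 * xi + (0.5 if xi % 2 == 0 else -0.5) for xi in x]
b0, b1 = fit_line(x, y)
res = residuals(x, y, b0, b1)
dw = durbin_watson(res)
print(round(dw, 3))
```

With an intercept in the model, least-squares residuals sum to zero, so a non-zero sum would indicate a fitting error rather than a violated assumption.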
Applications of Linear Regression
Sales Forecasting
Predict future sales based on historical
sales data, marketing spend, and other
relevant factors.
Financial Modeling
Estimate future stock prices or asset
values using financial indicators and
economic data.
Customer Segmentation
Identify customer groups with similar
spending patterns or preferences.
Behavioral Trends: Examine
trends like response to
promotions to refine segments
and improve engagement.
Applications in Business and Finance
Linear regression is a versatile tool that has numerous applications in business and finance. It is used
to analyze trends, make forecasts, and improve decision-making.
Financial Forecasting
Predict future stock prices, interest
rates, or economic growth.
Marketing Analytics
Analyze the effectiveness of
advertising campaigns and
customer segmentation.
Sales Forecasting
Predict future sales based on
historical data, economic indicators,
and marketing campaigns.
Risk Management
Estimate the probability of
default on loans or other
financial instruments.
Limitations and Considerations
While linear regression is a powerful tool, it has some limitations that
should be considered when applying it.
Non-linear Relationships
Linear regression is not suitable for
modeling non-linear relationships
between variables.
Data Quality
The quality of the data used for regression analysis is crucial:
missing values and measurement errors can distort the fitted coefficients.
Outlier Impact
Outliers can have a significant impact
on the regression results.
Overfitting
A model that is too complex may
overfit the data, leading to poor
predictions on new data.
Conclusion
Linear regression models relationships between variables with simplicity and clarity. It is a
cornerstone of data science and machine learning, providing a straightforward method for predictive
modeling and relationship analysis. Its applications span various fields, including finance and
healthcare, making it invaluable for deriving insights and making informed decisions.
Riddles
1. I draw a line, from low to high,
To predict the points that touch the sky.
Ans: Least Squares
2. I tell you how well your line does fit,
If you should celebrate or throw a fit.
I range from zero to one,
What is my name, under the regression sun?
Ans : R-squared
Find the odd one out:
A) Linear relationship
B) Independence
C) Handling outliers
D) Normality
CREDITS: This presentation template was
created by Slidesgo, and includes icons by
Flaticon, and infographics & images by
Freepik
Thank You
