Linear Regression

What & Why

What is Regression?
The formulation of a functional relationship between a set of independent or explanatory variables (X's) and a dependent or response variable (Y):
Y = f(X)

Why Regression?
Knowledge of Y is crucial for decision making.
• Will he/she buy or not?
• Shall I offer him/her the loan or not?
X is available at the time of decision making and is related to Y, making it possible to predict Y.

Types of Regression

• Continuous response (e.g., sales volume, claim amount, % of sales growth, etc.): Ordinary Least Squares (OLS) Regression
• Binary (0/1) response (e.g., buy/no-buy, survive/not-survive, win/loss, etc.): Logistic Regression
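As a minimal pure-Python sketch (toy data assumed, not the deck's own example), the two cases can be fit as follows: closed-form OLS for a continuous response, and a simple gradient-descent logistic regression for a binary response:

```python
import math

# Hypothetical toy data (assumed for illustration).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_cont = [2.1, 4.2, 5.9, 8.1, 9.8]   # continuous response -> OLS
y_bin = [0, 0, 0, 1, 1]              # binary response -> logistic regression

# --- OLS for a continuous Y: closed-form slope and intercept ---
n = len(x)
xbar, ybar = sum(x) / n, sum(y_cont) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y_cont)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

# --- Logistic regression for a binary Y: gradient ascent on the log-likelihood ---
w0 = w1 = 0.0
for _ in range(5000):
    for xi, yi in zip(x, y_bin):
        p = 1.0 / (1.0 + math.exp(-(w0 + w1 * xi)))  # predicted P(Y=1 | x)
        w0 += 0.1 * (yi - p)        # gradient step on the intercept
        w1 += 0.1 * (yi - p) * xi   # gradient step on the slope
```

OLS models the expected value of a continuous Y directly; logistic regression instead models the probability that a binary Y equals 1, which is why the two target types call for different models.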

Intro to Regression Analysis

• Regression analysis is used to:
  • Predict the value of a dependent variable based on the value of at least one independent variable
  • Explain the impact of changes in an independent variable on the dependent variable
• Dependent variable: the variable we wish to explain, usually denoted by Y.
• Independent variable: the variable used to explain the dependent variable, usually denoted by X.
Regression Example

Predict the fitness of a person based on one or more parameters.
Simple Linear Regression Model

• Only one independent variable, x
• Relationship between x and y is described by a linear function
• Changes in y are assumed to be caused by changes in x
Assumptions for Simple Linear Regression

• The expected value of the error term is zero: E(ε) = 0
Assumptions for Multiple Regression

• The expected value of the error term is zero: E(εi) = 0
• The variance of the error term is constant (homoskedasticity): Var(εi) = σ²
• The error terms are uncorrelated: E(εi εj) = 0, j ≠ i
Equations for Regression

The Simple Linear Regression Model:
Yi = β0 + β1Xi + εi

Beta Zero: β0 is the intercept, the expected value of Y when X = 0.

Beta One: β1 is the slope, the expected change in Y for a 1 unit increase in X.

Error Term / Residual: εi, the part of Yi not explained by the linear component.

Regression Line Equation: the fitted line ŷ = b0 + b1x.

The Multiple Linear Regression Model:
Yi = b0 + b1X1i + b2X2i + ......... + bkXki + εi
Types of Regression Relationships

• Positive linear relationship
• Negative linear relationship
• Relationship not linear
• No relationship
Population & Sample Regression Models

Population: the true relationship Yi = β0 + β1Xi + εi is unknown.
Random sample: a subset of the population, used to estimate that relationship.
Population Linear Regression

[Scatter plot of the population regression line Y = β0 + β1X + u: intercept β0, slope β1; for an individual observation at xi (e.g., one person's marks), ui is the random error between the observed value and the predicted value of Y for xi.]
Population Regression Function

Y = β0 + β1X + u

where Y is the dependent variable, β0 the population y-intercept, β1 the population slope coefficient, X the independent variable, and u the random error term (residual). β0 + β1X is the linear component and u the random error component.

But can we actually get this equation? If yes, what information will we need?
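We never observe the whole population, so in practice β0 and β1 are estimated from a sample. A minimal simulation sketch (hypothetical "true" coefficients and noise assumed) showing sample estimates approximating the population line Y = β0 + β1X + u:

```python
import random

random.seed(42)  # reproducible illustration

beta0, beta1 = 2.0, 0.5   # hypothetical "true" population coefficients
x = [random.uniform(0, 10) for _ in range(500)]
y = [beta0 + beta1 * xi + random.gauss(0, 1) for xi in x]  # add random error u

# Estimate the line from the sample by least squares.
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
```

With a reasonably large sample, b0 and b1 land close to β0 and β1, which is exactly the information the population equation would require but we can never observe directly.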
Sample Regression Function

[Scatter plot of the sample regression line y = b0 + b1x + e: intercept b0, slope b1; ei is the random error between the observed value of y for xi and the predicted value of y for xi.]
Sample Regression Function

yi = b0 + b1xi + ei

where b0 is the estimate of the regression intercept, b1 the estimate of the regression slope, xi the independent variable, and ei the error term.

Notice the similarity with the Population Regression Function.

Can we do something about the error term?
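A minimal sketch of the sample regression function in code (toy data and hypothetical coefficient estimates assumed): predictions come from the fitted line, and each residual is the gap between the observed and predicted value:

```python
# Toy sample (assumed for illustration).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.1, 5.9, 8.2]

# Hypothetical estimates of the intercept and slope.
b0, b1 = 0.0, 2.0

y_hat = [b0 + b1 * xi for xi in x]          # predicted values on the sample line
e = [yi - yh for yi, yh in zip(y, y_hat)]   # residuals e_i = y_i - y_hat_i
```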
The Error Term (Residual)

• Represents the influence of all the variables that we have not accounted for in the equation
• It represents the difference between the actual y values and the y values predicted from the sample regression line
• Wouldn't it be good if we were able to reduce this error term?
• By the way - what are we trying to achieve by sample regression?
How Well a Model Fits the Data

Comparing the Regression Model to a Baseline Model
OLS Regression Properties

• The sum of the residuals from the least squares regression line is zero: Σ(y − ŷ) = 0
• The sum of the squared residuals, Σ(y − ŷ)², is a minimum.
• The simple regression line always passes through the mean of the y variable and the mean of the x variable.
• The least squares coefficients are unbiased estimates of β0 and β1.
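These properties can be checked numerically. A minimal sketch (toy data assumed) that fits the closed-form least squares line and verifies the zero-sum-residuals and mean-point properties:

```python
# Toy data (assumed). Fit OLS in closed form, then check the listed properties.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar                        # forces the line through (xbar, ybar)

resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

assert abs(sum(resid)) < 1e-9                # residuals sum to zero
assert abs((b0 + b1 * xbar) - ybar) < 1e-9   # line passes through the means
```

Note that the mean-point property falls straight out of the intercept formula b0 = ȳ − b1·x̄.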
Limitations of Regression Analysis

• Parameter instability - this happens in situations where correlations change over time. It is very common in financial markets, where economic, tax, regulatory, and political factors change frequently.
• Public knowledge of a specific regression relation may cause a large number of people to react in a similar fashion towards the variables, negating its future usefulness.
• If any of the regression assumptions are violated, predictions of the dependent variable and hypothesis tests will not be valid.
General Multiple Linear Regression Model

• In simple linear regression, the dependent variable was assumed to depend on only one (independent) variable.
• In the general multiple linear regression model, the dependent variable derives its value from two or more variables.
• The general multiple linear regression model takes the following form:
Yi = b0 + b1X1i + b2X2i + ......... + bkXki + εi
where:
Yi = ith observation of dependent variable Y
Xki = ith observation of kth independent variable X
b0 = intercept term
bk = slope coefficient of kth independent variable
εi = error term of ith observation
n = number of observations
k = total number of independent variables
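A pure-Python sketch of fitting this model (toy data assumed, generated without noise from y = 1 + 1·x1 + 2·x2): the coefficients b0, b1, b2 are obtained by solving the normal equations (XᵀX)b = Xᵀy with Gaussian elimination:

```python
# Each row of X: [1 (intercept column), x1, x2]; toy data assumed.
X = [[1.0, 1.0, 2.0],
     [1.0, 2.0, 1.0],
     [1.0, 3.0, 4.0],
     [1.0, 4.0, 3.0],
     [1.0, 5.0, 6.0]]
y = [6.0, 5.0, 12.0, 11.0, 18.0]   # generated from y = 1 + 1*x1 + 2*x2

k = len(X[0])
# Build the normal equations (X'X) b = X'y.
XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]

# Gaussian elimination with partial pivoting, then back-substitution.
A = [row[:] + [v] for row, v in zip(XtX, Xty)]
for i in range(k):
    p = max(range(i, k), key=lambda r: abs(A[r][i]))   # pivot row
    A[i], A[p] = A[p], A[i]
    for r in range(i + 1, k):
        f = A[r][i] / A[i][i]
        A[r] = [a - f * b for a, b in zip(A[r], A[i])]
b = [0.0] * k
for i in range(k - 1, -1, -1):
    b[i] = (A[i][k] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
```

Since the toy data contain no error term, the recovered coefficients are exactly the generating ones, b ≈ [1, 1, 2]; with noisy data they would only approximate them.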
Estimated Regression Equation

• As we calculated the intercept and the slope coefficient in simple linear regression by minimizing the sum of squared errors, we similarly estimate the intercept and slope coefficients in multiple linear regression.
• The sum of squared errors, Σ(i=1..n) εi², is minimized and the slope coefficients are estimated.
• The resultant estimated equation becomes:
Ŷi = b̂0 + b̂1X1i + b̂2X2i + ......... + b̂kXki
• Now the error in the ith observation can be written as:
ε̂i = Yi − Ŷi = Yi − (b̂0 + b̂1X1i + b̂2X2i + ......... + b̂kXki)
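The "SSE is minimized" claim can be illustrated numerically. A minimal sketch (toy data assumed, simple-regression case for brevity) comparing the SSE at the least squares estimates against perturbed coefficients:

```python
# Toy data (assumed). Fit OLS in closed form.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.2, 3.9, 6.1, 8.0, 9.8]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

def sse(a0, a1):
    """Sum of squared errors for the line y = a0 + a1*x."""
    return sum((yi - (a0 + a1 * xi)) ** 2 for xi, yi in zip(x, y))

# Nudging either coefficient away from the OLS estimate increases the SSE.
assert sse(b0, b1) < sse(b0 + 0.1, b1)
assert sse(b0, b1) < sse(b0, b1 - 0.1)
```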
Assumptions of Multiple Regression Model

• There exists a linear relationship between the dependent and independent variables.
• The expected value of the error term, conditional on the independent variables, is zero.
• The error terms are homoskedastic, i.e. the variance of the error terms is constant for all observations.
• The expected value of the product of error terms is always zero, which implies that the error terms are uncorrelated with each other.
• The error term is normally distributed.
• The independent variables don't have any linear relationships between each other.
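The last assumption (no linear relationship between the independent variables, i.e. no perfect multicollinearity) can be screened with pairwise correlations. A minimal sketch (toy predictor columns assumed, with x2 deliberately an exact multiple of x1):

```python
def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    abar, bbar = sum(a) / n, sum(b) / n
    cov = sum((ai - abar) * (bi - bbar) for ai, bi in zip(a, b))
    sa = sum((ai - abar) ** 2 for ai in a) ** 0.5
    sb = sum((bi - bbar) ** 2 for bi in b) ** 0.5
    return cov / (sa * sb)

x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 4.0, 6.0, 8.0, 10.0]   # exactly 2*x1: perfect collinearity
x3 = [5.0, 1.0, 4.0, 2.0, 3.0]    # no linear relationship with x1
```

A correlation of ±1 between two predictors (as for x1 and x2 here) means the model cannot separate their individual effects; such a pair violates the assumption and one of the two should be dropped.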
Thank you!
