Lecture 4

Lecture 4 covers linear regression, including data loading, preprocessing, feature selection, model creation, and evaluation metrics such as MSE, RMSE, and R2. It discusses supervised learning, point estimates, and methods like gradient descent and ordinary least squares (OLS) for model fitting. The lecture also emphasizes the importance of standardizing datasets and cross-validation to prevent memorization in model training.

1. Linear Regression:
a. How to load data
b. Pre-process / clean data
c. How to choose features
d. Create model
e. Model evaluation metrics
i. MSE
ii. RMSE
iii. R2
iv. Adjusted R2 (R2 adj)
v. Value of the coefficients
vi. Standard error (SE) of the coefficients
vii. T-statistic
viii. P-value
ix. Null hypothesis
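A minimal sketch of computing several of these metrics in Python (scikit-learn / NumPy; the names y_true, y_pred and the feature count p are illustrative, not from the lecture). The coefficient values, standard errors, t-statistics, and p-values come from an OLS summary (see the statsmodels sketch near the end of these notes).

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

def regression_metrics(y_true, y_pred, p):
    # p = number of features in the model (needed for adjusted R^2)
    n = len(y_true)
    mse = mean_squared_error(y_true, y_pred)        # mean squared error
    rmse = np.sqrt(mse)                             # root mean squared error
    mae = mean_absolute_error(y_true, y_pred)       # mean absolute error
    r2 = r2_score(y_true, y_pred)                   # R^2 = 1 - RSS/TSS
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)   # adjusted R^2 penalizes extra features
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2, "R2_adj": r2_adj}
```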
2. Supervised:
X → feature vector (x1, x2, …)
Y → label

Use X to predict Y

Learn a function f(X) to predict Y

f(X) = w0 + w1 * x1
True regression functions are never linear

Model:
Y = B0 + B1 * X + E
In Python, use the .fit() method to estimate the model's coefficients; then substitute X into the fitted function f(X) to make predictions.
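A minimal sketch of this .fit()/predict workflow with scikit-learn (the synthetic data here is illustrative, not from the lecture):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative synthetic data: y is roughly 3 + 2*x plus noise (assumed example)
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))                    # single feature x1
y = 3.0 + 2.0 * X[:, 0] + rng.normal(0, 1, size=100)     # label

model = LinearRegression()
model.fit(X, y)                          # estimates B0 (intercept) and B1 (slope)
print(model.intercept_, model.coef_)     # fitted B0 and B1
y_hat = model.predict(X)                 # f(X) = B0 + B1 * X
```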

Y = B0 + B1 * X1
- Analytical: Closed form
Error = ∑ e_i^2 → minimize (find the lowest value)
o Residual sum of squares (RSS) = e1^2 + e2^2 + … + en^2
o Least squares approach
- Numerical:
o Gradient Descent (check below)
▪ Linear time
▪ Local minima
▪ Best weights / coefficients

3. Point Estimates:
- Sales = 10000 + (1.6) * (TV) + (2.9) * (Radio)
4. The OLS coefficient estimates are computed in closed form as (X'X)^-1 X'Y
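A minimal NumPy sketch of this closed-form (normal equation) solution; the data is synthetic and a column of ones is added so the first coefficient plays the role of the intercept:

```python
import numpy as np

# Hypothetical data: 100 samples, 2 features (illustrative only)
rng = np.random.default_rng(1)
X_raw = rng.normal(size=(100, 2))
y = 1.0 + 0.5 * X_raw[:, 0] - 2.0 * X_raw[:, 1] + rng.normal(scale=0.1, size=100)

# Design matrix with an intercept column of ones
X = np.column_stack([np.ones(len(X_raw)), X_raw])

# Normal equation: beta = (X'X)^-1 X'Y
# np.linalg.solve is used instead of an explicit inverse for numerical stability
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)   # [B0, B1, B2]
```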
5. ChatGPT: at that scale (billions of parameters / training examples), the exact closed-form solution is impractical, so numerical methods such as gradient descent are used.
6. Gradient Descent Approach:
- w(0) = initial value (guess)
- w(1) = w(0) – (Learning Rate) * d(error)/dw
- Point Estimate – Best Coefficients
- Standardize the dataset
o N(0 , 1)
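A minimal sketch of this gradient descent update for one-feature linear regression, with the feature standardized to roughly N(0, 1) first; the learning rate and iteration count are illustrative choices:

```python
import numpy as np

def gradient_descent_fit(x, y, lr=0.1, n_iters=1000):
    # Standardize the feature so gradient descent converges smoothly (~N(0, 1))
    x = (x - x.mean()) / x.std()
    w0, w1 = 0.0, 0.0                      # w(0): initial guess
    for _ in range(n_iters):
        y_hat = w0 + w1 * x                # current predictions
        error = y_hat - y                  # residuals
        grad_w0 = 2.0 * error.mean()       # d(MSE)/dw0
        grad_w1 = 2.0 * (error * x).mean() # d(MSE)/dw1
        w0 -= lr * grad_w0                 # w(t+1) = w(t) - (learning rate) * gradient
        w1 -= lr * grad_w1
    return w0, w1                          # coefficients on the standardized scale
```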
7. Exact Solution (Closed Form):
- Point Estimate
o Standard Error
8. Root mean square error (RMSE)
9. MAE (mean absolute error)

----------------------

Ex: House price prediction:

X1: Size

X2: Bathroom

X3: Bedroom

Mean and variance of the price of the house (predicted from x1, x2, x3)

Ex: Temperature

With temperatures, we may predict the mean, but it’s hard to predict the
variance

10. Predicted mean vs. actual mean


11. Predicted variance vs. actual variance → % variance explained

If I show the entire dataset to the model and then test on the same data:

→ Memorization (the model can simply memorize the training data)
→ To prevent this, hold out test data / use cross-validation (sketch after the dataset link below)

Web source for datasets: Auto MPG - UCI Machine Learning Repository
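A minimal sketch of holding out test data and using cross-validation with scikit-learn (X and y are assumed to be already loaded and cleaned, e.g. from the Auto MPG data above):

```python
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split, cross_val_score

# X, y assumed already loaded and cleaned
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)
print("Train R2:", model.score(X_train, y_train))   # optimistic: data the model has seen
print("Test  R2:", model.score(X_test, y_test))     # honest: unseen data

# 5-fold cross-validation: each fold is held out once as the test set
scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")
print("CV R2 scores:", scores)
```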

12. SGD:
- Evaluation Metrics:
o 1. Score (R2) value: R2 = 1 − RSS/TSS
o RSS = (y1 − ŷ1)^2 + (y2 − ŷ2)^2 + … + (yn − ŷn)^2, where ŷi is the predicted value
o TSS = (y1 − ȳ)^2 + (y2 − ȳ)^2 + … + (yn − ȳ)^2, where ȳ is the mean of y
▪ 0 ≤ R2 ≤ 1
- Approximate approach
- Faster
- No guarantee of best solution
- Doesn’t give many evaluation metrics
- A good idea to standardize the data
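A minimal sketch of the SGD approach with standardization, using scikit-learn's SGDRegressor (hyperparameters are illustrative; X_train, X_test, y_train, y_test are assumed from the split sketched earlier):

```python
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Standardize first: SGD is sensitive to feature scale
sgd = make_pipeline(StandardScaler(),
                    SGDRegressor(max_iter=1000, tol=1e-3, random_state=0))
sgd.fit(X_train, y_train)
print("Test R2:", sgd.score(X_test, y_test))   # score() is R^2 = 1 - RSS/TSS
```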
13. OLS:
- Closed form
- Exact solution
- Cubic time complexity (due to inverting X'X)
- Statsmodels
- Does not require standardizing the data
- Diagnostics:
o Point Estimate:
 MedHouseValue: 0.0163 + 0.4416 * MedInc + …
o T-statistic = point estimate / standard error → should be as large as possible (in absolute value)
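A minimal statsmodels sketch of these OLS diagnostics; it assumes the MedInc / MedHouseValue example comes from the California housing data bundled with scikit-learn (the lecture's exact coefficients may reflect a scaled version of that data):

```python
import statsmodels.api as sm
from sklearn.datasets import fetch_california_housing

data = fetch_california_housing(as_frame=True)
X = sm.add_constant(data.data)    # add an intercept column
y = data.target                   # MedHouseValue

ols_model = sm.OLS(y, X).fit()    # exact closed-form OLS fit
# summary() reports point estimates, standard errors, t-statistics,
# p-values, R^2 and adjusted R^2 for each coefficient
print(ols_model.summary())
```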
