
Machine Learning

Linear Regression

Quan Minh Phan & Ngoc Hoang Luong

University of Information Technology - Vietnam National University Ho Chi Minh City

October 7, 2022

Q.M. Phan & N.H. Luong (VNU-HCM UIT) Machine Learning October 7, 2022 1 / 40
New Packages

numpy → very frequently used in ML (Python)
Link: https://numpy.org/doc/stable/user/index.html#user

>> import numpy as np

matplotlib → for visualization
Link: https://matplotlib.org/stable/tutorials/index.html

>> import matplotlib.pyplot as plt

Generate A Regression Problem

>> from sklearn.datasets import make_regression

>> X, y = make_regression(n_samples=500, n_features=1,
                          n_informative=1, noise=25, random_state=42)

Data Visualization

>> plt.scatter(X, y, facecolor='tab:blue', edgecolor='white', s=70)
plt.xlabel('X')
plt.ylabel('y')
plt.show()

Recall (Linear Regression)

Figure: The general concept of Linear Regression

Minimizing cost function with gradient descent

Cost function (Squared Error):

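$$J(w) = \frac{1}{2} \sum_i \left(y^{(i)} - \hat{y}^{(i)}\right)^2 \tag{1}$$

Update the weights:

$$w_{t+1} := w_t + \Delta w \tag{2}$$

$$\Delta w = -\eta \nabla J(w) \tag{3}$$

$$\frac{\partial J}{\partial w_j} = -\sum_i \left(y^{(i)} - \hat{y}^{(i)}\right) x_j^{(i)} \tag{4}$$

$$\Delta w_j = -\eta \frac{\partial J}{\partial w_j} = \eta \sum_i \left(y^{(i)} - \hat{y}^{(i)}\right) x_j^{(i)} \tag{5}$$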
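
As a sanity check on equation (4), the analytic gradient can be compared with a central finite-difference approximation of J. The following is a minimal sketch on made-up data; the helper names `cost`, `analytic_grad`, and `numeric_grad` are illustrative and do not come from the slides.

```python
import numpy as np

def cost(w, X, y):
    # J(w) = 1/2 * sum_i (y_i - y_hat_i)^2, with w[0] as the intercept
    y_hat = X @ w[1:] + w[0]
    return 0.5 * np.sum((y - y_hat) ** 2)

def analytic_grad(w, X, y):
    # Equation (4): dJ/dw_j = -sum_i (y_i - y_hat_i) * x_ij, taking x_i0 = 1
    diff = y - (X @ w[1:] + w[0])
    return -np.concatenate(([np.sum(diff)], X.T @ diff))

def numeric_grad(w, X, y, eps=1e-6):
    # Central differences, perturbing one weight at a time
    grad = np.zeros_like(w)
    for j in range(w.size):
        step = np.zeros_like(w)
        step[j] = eps
        grad[j] = (cost(w + step, X, y) - cost(w - step, X, y)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)  # toy data, assumed for illustration only
X_toy = rng.normal(size=(50, 2))
y_toy = 1.0 + X_toy @ np.array([2.0, -3.0]) + rng.normal(scale=0.1, size=50)
w_toy = rng.normal(size=3)

grads_match = np.allclose(analytic_grad(w_toy, X_toy, y_toy),
                          numeric_grad(w_toy, X_toy, y_toy),
                          rtol=1e-4, atol=1e-4)
print(grads_match)
```

If the two gradients agree, the sign and the summation in equation (4) are consistent with the cost in equation (1).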

Minimizing cost function with gradient descent (cont.)

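$$w_j := \begin{cases} w_j + \eta \sum_i \left(y^{(i)} - \hat{y}^{(i)}\right) & j = 0 \\ w_j + \eta \sum_i \left(y^{(i)} - \hat{y}^{(i)}\right) x_j^{(i)} & j \in [1, \dots, n] \end{cases}$$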

Pseudocode of the Training Process

Algorithm 1 Gradient Descent


1: Initialize the weights, w
2: while the stopping criterion is not satisfied do
3:     Compute the output value, ŷ
4:     Update the weights:
5:         Compute the difference between y and ŷ
6:         Update the intercept
7:         Update the coefficients
8: end while

Components

Hyperparameters
eta (float): the learning rate
max_iter (int): the maximum number of iterations
random_state (int): the seed used to initialize the weights

Parameters
w (list/array): the weight values
costs (list/array): the list containing the cost values over iterations

Methods
fit(X , y )
predict(X )

Implement (code from scratch)

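class LinearRegression_GD:

    def __init__(self, eta=0.001, max_iter=20, random_state=42):
        self.eta = eta                    # learning rate
        self.max_iter = max_iter          # number of passes over the data
        self.random_state = random_state  # seed for weight initialization
        self.w = None                     # w[0]: intercept, w[1:]: coefficients
        self.costs = []                   # cost value recorded at each iteration

    def predict(self, X):
        # Linear model: y_hat = X . w[1:] + w[0]
        return np.dot(X, self.w[1:]) + self.w[0]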

'fit' method

    def fit(self, X, y):
        rgen = np.random.RandomState(self.random_state)
        self.w = rgen.normal(loc=0.0, scale=0.01, size=1 + X.shape[1])
        self.costs = []
        for n_iters in range(self.max_iter):
            y_pred = self.predict(X)
            diff = y - y_pred
            self.w[0] += self.eta * np.sum(diff)
            for j in range(X.shape[1]):      # j = 0, 1, ..., X.shape[1] - 1
                delta = 0.0
                for i in range(X.shape[0]):  # i = 0, 1, ..., X.shape[0] - 1
                    delta += self.eta * diff[i] * X[i][j]
                self.w[j + 1] += delta
            cost = np.sum(diff ** 2) / 2
            self.costs.append(cost)

'fit' method (2)

    def fit(self, X, y):
        rgen = np.random.RandomState(self.random_state)
        self.w = rgen.normal(loc=0.0, scale=0.01, size=1 + X.shape[1])
        self.costs = []
        for n_iters in range(self.max_iter):
            y_pred = self.predict(X)
            diff = y - y_pred
            self.w[0] += self.eta * np.sum(diff)
            self.w[1:] += self.eta * np.dot(X.T, diff)
            cost = np.sum(diff ** 2) / 2
            self.costs.append(cost)
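
The two 'fit' variants differ only in how the coefficient update is computed. The following sketch, on made-up arrays (`X_toy` and `diff_toy` are illustrative names, with `diff_toy` standing in for `y - y_pred`), suggests that the nested loops and the single `np.dot(X.T, diff)` call produce the same update.

```python
import numpy as np

rng = np.random.default_rng(0)   # toy data, assumed for illustration only
X_toy = rng.normal(size=(30, 3))
diff_toy = rng.normal(size=30)   # stands in for y - y_pred
eta = 0.001

# Loop form: accumulate the update one coefficient and one sample at a time
delta_loop = np.zeros(X_toy.shape[1])
for j in range(X_toy.shape[1]):
    for i in range(X_toy.shape[0]):
        delta_loop[j] += eta * diff_toy[i] * X_toy[i][j]

# Vectorized form: a single matrix-vector product
delta_vec = eta * np.dot(X_toy.T, diff_toy)

print(np.allclose(delta_loop, delta_vec))
```

The vectorized form avoids Python-level loops, which usually makes it much faster on larger datasets.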

Train Model

Gradient Descent
>> reg_GD = LinearRegression_GD(eta=0.001, max_iter=20,
                                random_state=42)
reg_GD.fit(X, y)

Visualize the trend in the cost values (Gradient Descent)

>> plt.plot(range(1, len(reg_GD.costs) + 1), reg_GD.costs)
plt.xlabel('Epochs')
plt.ylabel('Cost')
plt.title('Gradient Descent')
plt.show()

Visualize on Data

>> plt.scatter(X, y, facecolor='tab:blue', edgecolor='white', s=70)
plt.plot(X, reg_GD.predict(X), color='green', lw=6,
         label='Gradient Descent')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()

Weight values

>> w_GD = reg_GD.w
w_GD
>> [-0.9794002, 63.18592509]

Implement (package)

Stochastic Gradient Descent


from sklearn.linear_model import SGDRegressor

Hyperparameters    Parameters    Methods
eta0               intercept_    fit(X, y)
max_iter           coef_         predict(X)
random_state

Implement (package) (cont.)

Normal Equation
from sklearn.linear_model import LinearRegression

Parameters    Methods
intercept_    fit(X, y)
coef_         predict(X)

Differences

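Gradient Descent
$$w := w + \Delta w, \qquad \Delta w = \eta \sum_i \left(y^{(i)} - \hat{y}^{(i)}\right) x^{(i)}$$

Stochastic Gradient Descent
$$w := w + \Delta w, \qquad \Delta w = \eta \left(y^{(i)} - \hat{y}^{(i)}\right) x^{(i)}$$

Normal Equation
$$w = (X^T X)^{-1} X^T y$$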
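
To make the contrast concrete, the following sketch applies the normal equation and a plain stochastic gradient descent loop to the same made-up dataset. It assumes a column of ones is prepended to X so that `w[0]` plays the role of the intercept; the names `X_aug`, `w_ne`, and `w_sgd`, the step size, and the number of epochs are illustrative choices, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)   # toy data, assumed for illustration only
x = rng.normal(size=(200, 1))
y_toy = 4.0 + 3.0 * x[:, 0] + rng.normal(scale=0.5, size=200)

# Prepend a column of ones so that w[0] acts as the intercept
X_aug = np.hstack([np.ones((x.shape[0], 1)), x])

# Normal equation: solve (X^T X) w = X^T y rather than forming the inverse
w_ne = np.linalg.solve(X_aug.T @ X_aug, X_aug.T @ y_toy)

# Stochastic gradient descent: one sample per update, constant step size
eta = 0.01
w_sgd = np.zeros(2)
for epoch in range(50):
    for i in rng.permutation(x.shape[0]):
        error = y_toy[i] - X_aug[i] @ w_sgd
        w_sgd += eta * error * X_aug[i]

print(w_ne, w_sgd)
```

The normal equation gives the least-squares solution in one step, while the stochastic updates only approach it and keep fluctuating slightly around it when the step size is constant.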

Practice (cont.)

Stochastic Gradient Descent


>> from sklearn.linear_model import SGDRegressor
>> reg_SGD = SGDRegressor(eta0=0.001, max_iter=20,
                          random_state=42, learning_rate='constant')
reg_SGD.fit(X, y)

Normal Equation
>> from sklearn.linear_model import LinearRegression
>> reg_NE = LinearRegression()
reg_NE.fit(X, y)

Weight Values Comparisons

Gradient Descent (ours)

>> w_GD = reg_GD.w
w_GD
>> [-0.9794002, 63.18592509]

Stochastic Gradient Descent

>> w_SGD = np.append(reg_SGD.intercept_, reg_SGD.coef_)
w_SGD
>> [-1.02681553, 63.08630288]

Normal Equation

>> w_NE = np.append(reg_NE.intercept_, reg_NE.coef_)
w_NE
>> [-0.97941333, 63.18605572]
Visualize on Data (all)

>> plt.scatter(X, y, facecolor='tab:blue', edgecolor='white', s=70)
plt.plot(X, reg_GD.predict(X), color='green', lw=6,
         label='Gradient Descent')
plt.plot(X, reg_SGD.predict(X), color='black', lw=4,
         label='Stochastic Gradient Descent')
plt.plot(X, reg_NE.predict(X), color='orange', lw=2,
         label='Normal Equation')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()

Performance Evaluation

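Mean Absolute Error (MAE)

$$\mathrm{MAE}(y, \hat{y}) = \frac{1}{n} \sum_i \left|y^{(i)} - \hat{y}^{(i)}\right| \tag{6}$$

Mean Squared Error (MSE)

$$\mathrm{MSE}(y, \hat{y}) = \frac{1}{n} \sum_i \left(y^{(i)} - \hat{y}^{(i)}\right)^2 \tag{7}$$

R-Squared (R2)

$$R^2(y, \hat{y}) = 1 - \frac{\sum_i \left(y^{(i)} - \hat{y}^{(i)}\right)^2}{\sum_i \left(y^{(i)} - \bar{y}\right)^2} \tag{8}$$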
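
The three metrics translate almost directly into NumPy. The following is a minimal sketch with made-up values; the function names `mae`, `mse`, and `r2` are illustrative, and the mean of y is used as the reference in the denominator of R2.

```python
import numpy as np

def mae(y, y_hat):
    # Mean of the absolute errors
    return np.mean(np.abs(y - y_hat))

def mse(y, y_hat):
    # Mean of the squared errors
    return np.mean((y - y_hat) ** 2)

def r2(y, y_hat):
    # 1 - (residual sum of squares) / (total sum of squares around the mean)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

y_true = np.array([3.0, 5.0, 7.0, 9.0])   # made-up values
y_hat = np.array([2.0, 5.0, 8.0, 9.0])

print(mae(y_true, y_hat))   # 0.5
print(mse(y_true, y_hat))   # 0.5
print(r2(y_true, y_hat))    # 0.9
```

Here the errors are 1, 0, -1, 0, so both MAE and MSE equal 0.5, and R2 is 1 - 2/20 = 0.9.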

Performance Evaluation

>> from sklearn.metrics import mean_absolute_error as MAE
from sklearn.metrics import mean_squared_error as MSE
from sklearn.metrics import r2_score as R2

>> y_pred_GD = reg_GD.predict(X)

>> y_pred_SGD = reg_SGD.predict(X)

>> y_pred_NE = reg_NE.predict(X)

Performance Evaluation (cont.)

Mean Absolute Error

>> print('MAE of GD:', round(MAE(y, y_pred_GD), 6))
print('MAE of SGD:', round(MAE(y, y_pred_SGD), 6))
print('MAE of NE:', round(MAE(y, y_pred_NE), 6))

Mean Squared Error

>> print('MSE of GD:', round(MSE(y, y_pred_GD), 6))
print('MSE of SGD:', round(MSE(y, y_pred_SGD), 6))
print('MSE of NE:', round(MSE(y, y_pred_NE), 6))

R2 score

>> print('R2 of GD:', round(R2(y, y_pred_GD), 6))
print('R2 of SGD:', round(R2(y, y_pred_SGD), 6))
print('R2 of NE:', round(R2(y, y_pred_NE), 6))

Run Gradient Descent with lr = 0.005
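
Because the update in equation (5) sums over all samples instead of averaging, the effective step size grows with the number of samples. The following sketch reuses the vectorized update inside a hypothetical helper, `gd_costs`, on synthetic data of the same size (500 samples, generated here with NumPy rather than `make_regression`, so the numbers will not match the slides). It suggests why the cost can grow instead of shrink when eta is raised from 0.001 to 0.005.

```python
import numpy as np

def gd_costs(X, y, eta, max_iter=20, seed=42):
    # Same update as the vectorized 'fit' method; returns the cost history
    rgen = np.random.RandomState(seed)
    w = rgen.normal(loc=0.0, scale=0.01, size=1 + X.shape[1])
    costs = []
    for _ in range(max_iter):
        diff = y - (np.dot(X, w[1:]) + w[0])
        w[0] += eta * np.sum(diff)
        w[1:] += eta * np.dot(X.T, diff)
        costs.append(np.sum(diff ** 2) / 2)
    return costs

rng = np.random.default_rng(0)   # synthetic stand-in data, not the tutorial's
X_toy = rng.normal(size=(500, 1))
y_toy = 5.0 + 60.0 * X_toy[:, 0] + rng.normal(scale=25.0, size=500)

costs_small = gd_costs(X_toy, y_toy, eta=0.001)
costs_large = gd_costs(X_toy, y_toy, eta=0.005)

print(costs_small[0] > costs_small[-1])   # True: the cost decreases
print(costs_large[0] < costs_large[-1])   # True: the cost increases
```

With 500 samples, the intercept update alone scales the error by roughly 1 - 500 * eta per iteration, which is 0.5 for eta = 0.001 but -1.5 for eta = 0.005, so the larger rate overshoots further on every step.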

Polynomial Regression

Example
X = [258.0, 270.0, 294.0, 320.0, 342.0, 368.0, 396.0, 446.0, 480.0, 586.0]
y = [236.4, 234.4, 252.8, 298.6, 314.2, 342.2, 360.8, 368.0, 391.2, 390.8]

>> X = np.array([258.0, 270.0, 294.0, 320.0, 342.0, 368.0, 396.0, 446.0,
                 480.0, 586.0])[:, np.newaxis]
y = np.array([236.4, 234.4, 252.8, 298.6, 314.2, 342.2, 360.8, 368.0,
              391.2, 390.8])

>> plt.scatter(X, y, label='Training points')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()

Visualize data

Experiment with Linear Regression

>> from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(X, y)

Experiment with Linear Regression (cont.)

Experiment with Polynomial Regression

Syntax
from sklearn.preprocessing import PolynomialFeatures

>> from sklearn.preprocessing import PolynomialFeatures
quadratic = PolynomialFeatures(degree=2)
X_quad = quadratic.fit_transform(X)
pr = LinearRegression()
pr.fit(X_quad, y)
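
For a single input feature, `PolynomialFeatures(degree=2)` with its default settings produces the columns [1, x, x^2]. The following sketch builds the same design matrix by hand with NumPy and fits it by ordinary least squares on the ten training points of the example. The names `X_quad_manual` and `w_quad` are illustrative; the sketch is meant to show what the transform does, not to replace it.

```python
import numpy as np

X = np.array([258.0, 270.0, 294.0, 320.0, 342.0, 368.0, 396.0, 446.0,
              480.0, 586.0])[:, np.newaxis]
y = np.array([236.4, 234.4, 252.8, 298.6, 314.2, 342.2, 360.8, 368.0,
              391.2, 390.8])

# Columns: bias term, x, x^2
X_quad_manual = np.hstack([np.ones_like(X), X, X ** 2])

# Ordinary least squares on the expanded features
w_quad, *_ = np.linalg.lstsq(X_quad_manual, y, rcond=None)

# Predictions of the quadratic model on the training points
y_fit = X_quad_manual @ w_quad

print(X_quad_manual.shape)   # (10, 3)
print(w_quad)
```

The model stays linear in the weights; only the inputs are expanded, which is why a linear regression can fit a curved relationship.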

Experiment with Polynomial Regression (cont.)

>> X_test = np.arange(250, 600, 10)[:, np.newaxis]

>> y_pred_linear = lr.predict(X_test)
# Reuse the already fitted transformer on the test inputs
y_pred_quad = pr.predict(quadratic.transform(X_test))

>> plt.scatter(X, y, label='Training points')
plt.xlabel('X')
plt.ylabel('y')
plt.plot(X_test, y_pred_linear, label='Linear fit', c='black')
plt.plot(X_test, y_pred_quad, label='Quadratic fit', c='orange')
plt.legend()
plt.show()

Practice

Dataset: 'Boston Housing' (housing.csv) (14 attributes: 13 independent variables + 1 target variable)

File: boston_housing.ipynb
