Universiti Teknologi PETRONAS
Mechanical Engineering
MDB3053 Numerical Methods
Curve Fitting Techniques
Chapter 17 in Textbook
Lesson Outcomes
By the end of the lesson, students should be able to
• perform curve fitting numerically on given data
• distinguish two curve-fitting techniques:
i. Regression: fits the general trend of the data
ii. Interpolation: constructs a curve that connects all data points
[Figure: regression line through scattered data vs. interpolating curve through every point]
CURVE FITTING TECHNIQUES
1) Least-Squares Regression
• Data is not accurate/precise and exhibits a significant degree of scatter, e.g. experimental data
• Strategy: use a single curve that best fits the general trend of the data. Application: trend analysis
• The best-fit line can be straight or polynomial and does not necessarily pass through the individual points.
y = a0 + a1x
CURVE FITTING TECHNIQUES
2) Interpolation
• Data is very precise/accurate
• Strategy: construct a curve that passes through each of the discrete points
• Application: to estimate any intermediate values between precise data points
1) LEAST-SQUARES REGRESSION
• This method obtains the best-fit curve by minimizing the errors between the data points (true measured values) and the fitting curve (approximation model).
• Types of Least-Squares Regression method:
1. Linear Regression: y = f(x) = a0 + a1x
2. Polynomial Regression: y = f(x) = a0 + a1x + a2x^2 + … + amx^m
*Nonlinear Regression: used when the model is nonlinear in its coefficients, e.g. f(x) = a0(1 − e^(−a1x))
LINEAR REGRESSION
• Strategy: fit a best-fit straight (linear) line to a set of paired data points (x1, y1), (x2, y2), …, (xn, yn), where x is the independent variable and y is the dependent variable.
• The linear equation model for the straight line is
y = a0 + a1x + e
where e = residual/error and a0, a1 = coefficients
• The best-fit line is determined by minimizing the sum of the squares of the errors, Sr
BEST-FIT CRITERIA
• How do we know which line is the best-fit line?
• To get the “best-fit” line, use the method of least squares, i.e. minimize the sum of squares of the errors e between the data points yi and the approximate model. Find the least error:
Sr = Σ(i=1,n) ei^2 = Σ(i=1,n) (yi − yi,model)^2 = Σ(i=1,n) (yi − a0 − a1xi)^2
where Sr is the sum of squares of the errors (or residuals).
LINEAR REGRESSION
• To get minimum Sr, differentiate Sr with respect to each coefficient, a0 and a1, and set the result to zero:
coefficient a0: ∂Sr/∂a0 = −2 Σ(yi − a0 − a1xi) = 0
coefficient a1: ∂Sr/∂a1 = −2 Σ(yi − a0 − a1xi)·xi = 0
Rearranging into linear algebraic equations (LAE), noting that Σa0 = n·a0:
n·a0 + (Σxi)·a1 = Σyi
(Σxi)·a0 + (Σxi^2)·a1 = Σxiyi
These are solved simultaneously for the unknowns a0 and a1.
LINEAR REGRESSION
• LAE in matrix form, [X][a] = [Y]:
[ n     Σxi   ] [a0]   [ Σyi   ]
[ Σxi   Σxi^2 ] [a1] = [ Σxiyi ]
• Solving for a0 and a1 (need to find the inverse):
[a0]   [ n     Σxi   ]^(−1) [ Σyi   ]
[a1] = [ Σxi   Σxi^2 ]      [ Σxiyi ]
• Or use Gauss elimination. Note that the first equation, n·a0 + (Σxi)·a1 = Σyi, rearranges to a0 = ȳ − a1x̄.
LINEAR REGRESSION
• Follow the formulas:
coefficient a1:  a1 = (n Σxiyi − Σxi Σyi) / (n Σxi^2 − (Σxi)^2)
coefficient a0:  a0 = ȳ − a1x̄  (substitute a1)
where the mean values are ȳ = (Σyi)/n and x̄ = (Σxi)/n, i = 1, 2, …, n
Therefore, the linear regression line is: y = a0 + a1x
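The working formulas above translate directly into code. A minimal sketch (the function name and the sample data are illustrative, not from the slides):

```python
def linear_regression(x, y):
    """Least-squares straight line y = a0 + a1*x via the normal-equation formulas."""
    n = len(x)
    sx = sum(x)                                   # sum of xi
    sy = sum(y)                                   # sum of yi
    sxy = sum(xi * yi for xi, yi in zip(x, y))    # sum of xi*yi
    sxx = sum(xi * xi for xi in x)                # sum of xi^2
    a1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)  # slope
    a0 = sy / n - a1 * sx / n                       # intercept: ybar - a1*xbar
    return a0, a1

# illustrative data lying exactly on y = 2x
a0, a1 = linear_regression([1, 2, 3], [2, 4, 6])
print(a0, a1)  # 0.0 2.0
```

Because the three points fall exactly on a line, the fit recovers the slope and intercept exactly.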
ERROR QUANTIFICATION
• To find the accuracy of the best-fit least-squares line, we
compute the following error quantification parameters:
1) Sum of the squares of the deviations of the data around the mean, St (sum of squared total error):
St = Σ(i=1,n) (yi − ȳ)^2
From St, calculate the standard deviation: s.d. = sqrt( St / (n − 1) )
2) Sum of the squares of the errors/residuals, Sr (sum of squared regression error):
Sr = Σ(i=1,n) ei^2 = Σ(i=1,n) (yi − a0 − a1xi)^2
3) Standard error of the regression, Sy/x: an absolute measure of the spread of the data around the regression line:
Sy/x = sqrt( Sr / (n − 2) )
ERROR QUANTIFICATION
• The difference between St and Sr provides a measure of the
accuracy of regression or “goodness of fit”.
• To quantify the “goodness of fit”, we calculate the R-squared value or coefficient of determination, the relative measure of how much of the variability in the data is explained by the regression model:
R-squared value:  r^2 = (St − Sr) / St
where Sr = Σ(i=1,n) (yi − a0 − a1xi)^2 and St = Σ(i=1,n) (yi − ȳ)^2
• Example: r^2 = 0.8 means that 80% of the variability of the data fits the regression model. r^2 = 1 means a perfect fit; r^2 = 0 means zero correlation. A good fit should have r^2 > 0.8.
• The two most important parameters are the R-squared value r^2 and the standard error of the regression Sy/x.
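These error measures can be sketched in a few lines of code (the dataset is illustrative; the coefficients a0 = −0.5, a1 = 2.3 are the least-squares fit for that data, precomputed with the formulas from the previous slide):

```python
import math

def regression_errors(x, y, a0, a1):
    """Compute St, Sr, the standard error Sy/x, and r^2 for a fitted line."""
    n = len(x)
    ybar = sum(y) / n
    St = sum((yi - ybar) ** 2 for yi in y)                       # spread around the mean
    Sr = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))   # spread around the line
    Syx = math.sqrt(Sr / (n - 2))                                # standard error of the regression
    r2 = (St - Sr) / St                                          # coefficient of determination
    return St, Sr, Syx, r2

# illustrative data, close to (but not exactly on) a straight line
St, Sr, Syx, r2 = regression_errors([1, 2, 3, 4], [2, 4, 6, 9], a0=-0.5, a1=2.3)
print(round(St, 2), round(Sr, 2), round(Syx, 4), round(r2, 4))
```

A large St with a small Sr, as here, gives an r^2 close to 1, matching the slide's interpretation of a good fit.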
CLASS ACTIVITY #1
Use linear least-squares regression to fit the data:
x: 1, 2, 3, 4, 5
y: 0.7, 2.2, 2.8, 4.4, 4.9
Find:
(a) Least-squares equation y = a0 + a1x
(b) St and Sr
(c) Standard error of the estimates, Sy/x
(d) R-squared value, r^2
Formulas: a1 = (n Σxiyi − Σxi Σyi) / (n Σxi^2 − (Σxi)^2), a0 = ȳ − a1x̄,
St = Σ(yi − ȳ)^2, Sr = Σ(yi − a0 − a1xi)^2, Sy/x = sqrt(Sr/(n − 2)), r^2 = (St − Sr)/St
Answer: a0 = −0.18, a1 = 1.06
SOLUTION: build a table with columns x, y, xy, and x^2, then calculate x̄ and ȳ.
SOLUTION
xi     yi     xiyi    xi^2    (yi − ȳ)^2    (yi − a0 − a1xi)^2
1      0.7    0.7     1       5.29          0.0324
2      2.2    4.4     4       0.64          0.0676
3      2.8    8.4     9       0.04          0.04
4      4.4    17.6    16      1.96          0.1156
5      4.9    24.5    25      3.61          0.0484
SUM    15     15      55.6    55            11.54 (= St)  0.304 (= Sr)
x̄ = 3, ȳ = 3
a1 = 1.06, a0 = −0.18, so y = −0.18 + 1.06x
St = 11.54, Sr = 0.304
Sy/x = 0.3183
r = 0.9867, r^2 = 0.9737
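As a quick check, the tabulated results above can be reproduced in a few lines, using only the formulas from this chapter:

```python
import math

x = [1, 2, 3, 4, 5]
y = [0.7, 2.2, 2.8, 4.4, 4.9]
n = len(x)

# normal-equation formulas for the straight-line fit
a1 = (n * sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y)) \
     / (n * sum(xi ** 2 for xi in x) - sum(x) ** 2)
a0 = sum(y) / n - a1 * sum(x) / n

# error quantification
ybar = sum(y) / n
St = sum((yi - ybar) ** 2 for yi in y)
Sr = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))
Syx = math.sqrt(Sr / (n - 2))
r2 = (St - Sr) / St

print(round(a0, 2), round(a1, 2))    # -0.18 1.06
print(round(St, 2), round(Sr, 3))    # 11.54 0.304
print(round(Syx, 4), round(r2, 4))   # 0.3183 0.9737
```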
CLASS ACTIVITY #2
Use linear least-squares regression to fit a straight line to the
x and y values below:
x: 1, 3, 5, 7, 10
y: 4, 5, 6, 5, 8
Find:
(a) Least-squares equation y = a0 + a1x
(b) St and Sr
(c) Standard error of the estimates, Sy/x
(d) R-squared value, r^2
Answer: intercept a0 = 3.639, slope a1 = 0.377, r^2 = 0.754, Sy/x = 0.8684
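Answers like these can also be checked against a library routine. A sketch using NumPy's built-in least-squares fit (`np.polyfit` with degree 1 returns the slope first, then the intercept):

```python
import numpy as np

x = np.array([1, 3, 5, 7, 10], dtype=float)
y = np.array([4, 5, 6, 5, 8], dtype=float)

# degree-1 least-squares fit; coefficients come highest power first
a1, a0 = np.polyfit(x, y, deg=1)
print(round(a0, 3), round(a1, 3))  # 3.639 0.377
```

This agrees with the hand formulas because `polyfit` minimizes the same Sr.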
CLASS ACTIVITY #3
The data below was obtained from a creep test performed on a metal
specimen. Table shows the increase in strain (%) over time (min) while a
constant load was applied to a test specimen. Using linear regression
method, find the equation of the line which best fits the data.
Time, min:  0.083  0.584  1.084  1.585  2.085  2.586  3.086
Strain, %:  0.099  0.130  0.160  0.184  0.204  0.229  0.252
Answer: a0 = 0.1, a1 = 0.05, r^2 = 0.9951
Common errors in data analysis
[Figure: comics. Source: explainxkcd.com]
POLYNOMIAL REGRESSION
• Sometimes, data is poorly represented by a straight line.
Therefore, a curve is better suited to fit the data.
• The least-squares procedure can be easily extended to fit higher-order polynomials.
• Ex: for a second-order polynomial, the model keeps the linear terms and adds a 2nd-order term:
y = a0 + a1x + a2x^2 + e
Sr = Σ(i=1,n) (yi − a0 − a1xi − a2xi^2)^2
(Chap. 17/18)
POLYNOMIAL REGRESSION
• The ‘best-fit’ curve is obtained by minimizing the sum of squares of the residuals,
Sr = Σ(i=1,n) (yi − a0 − a1xi − a2xi^2)^2
Taking the derivatives and setting them to zero:
coefficient a0: ∂Sr/∂a0 = −2 Σ(yi − a0 − a1xi − a2xi^2) = 0
coefficient a1: ∂Sr/∂a1 = −2 Σ xi·(yi − a0 − a1xi − a2xi^2) = 0
coefficient a2: ∂Sr/∂a2 = −2 Σ xi^2·(yi − a0 − a1xi − a2xi^2) = 0
POLYNOMIAL REGRESSION
• By rearranging in terms of the unknown coefficients ai:
n·a0 + (Σxi)·a1 + (Σxi^2)·a2 = Σyi
(Σxi)·a0 + (Σxi^2)·a1 + (Σxi^3)·a2 = Σxiyi
(Σxi^2)·a0 + (Σxi^3)·a1 + (Σxi^4)·a2 = Σxi^2·yi
• So we have 3 LAE with 3 unknowns a0, a1 and a2, to be solved simultaneously using methods like Gauss elimination.
POLYNOMIAL REGRESSION
• OR in matrix form, [A][x] = [B] (all sums over i = 1 to n):
[ n      Σxi    Σxi^2 ] [a0]   [ Σyi      ]
[ Σxi    Σxi^2  Σxi^3 ] [a1] = [ Σxiyi    ]
[ Σxi^2  Σxi^3  Σxi^4 ] [a2]   [ Σxi^2·yi ]
• Then, solve for a0, a1, a2 by using the Gauss elimination method.
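Assembling and solving the quadratic normal equations can be sketched as follows (the data values are illustrative; `np.linalg.solve` stands in for the Gauss elimination step):

```python
import numpy as np

def quadratic_fit(x, y):
    """Fit y = a0 + a1*x + a2*x^2 by solving the 3x3 normal equations."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    # matrix of sums of powers of x up to x^4
    A = np.array([[n,            x.sum(),      (x**2).sum()],
                  [x.sum(),      (x**2).sum(), (x**3).sum()],
                  [(x**2).sum(), (x**3).sum(), (x**4).sum()]])
    # right-hand-side sums
    B = np.array([y.sum(), (x * y).sum(), (x**2 * y).sum()])
    return np.linalg.solve(A, B)  # a0, a1, a2

# data lying exactly on y = 1 + 2x + 3x^2 should be recovered
xs = [0, 1, 2, 3, 4]
ys = [1 + 2 * xi + 3 * xi**2 for xi in xs]
a0, a1, a2 = quadratic_fit(xs, ys)
print(a0, a1, a2)
```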
EXAMPLE
• Let’s say from the given data, we derive the L.A.E as
[  6   15   55 ] [a0]   [  152.6 ]
[ 15   55  225 ] [a1] = [  585.6 ]
[ 55  225  979 ] [a2]   [ 2488.8 ]
By using Gauss elimination, we obtain:
a0 = 2.47857, a1 = 2.35929, a2 = 1.86071
Hence, the least-squares quadratic equation is:
y = 2.47857 + 2.35929x + 1.86071x^2
(see Example 17.5 in the textbook)
What about the standard error and the correlation coefficient?
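The Gauss elimination step can be sketched directly. A naive elimination with back substitution (no pivoting, which is adequate for this well-behaved system) reproduces the coefficients above:

```python
def gauss_solve(A, b):
    """Naive Gauss elimination with back substitution (no pivoting)."""
    n = len(b)
    A = [row[:] for row in A]  # work on copies so the inputs are untouched
    b = b[:]
    # forward elimination: zero out the entries below each pivot
    for k in range(n - 1):
        for i in range(k + 1, n):
            f = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= f * A[k][j]
            b[i] -= f * b[k]
    # back substitution: solve from the last equation upward
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

A = [[6, 15, 55], [15, 55, 225], [55, 225, 979]]
b = [152.6, 585.6, 2488.8]
a0, a1, a2 = gauss_solve(A, b)
print(round(a0, 5), round(a1, 5), round(a2, 5))  # 2.47857 2.35929 1.86071
```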
SUMMARY
• Two techniques for curve fitting: Regression and Interpolation.
• Types of Least-Squares Regression:
• Linear Regression
• Polynomial Regression
We have covered Chapter 17 in the textbook.