Curve Fitting
Professor Henry Arguello
Universidad Industrial de Santander
Colombia
March 12, 2018
High Dimensional Signal Processing Group
www.hdspgroup.com
henarfu@uis.edu.co
LP 304
Professor Henry Arguello Numerical methods March 12, 2018 1 / 33
Least-squares Line
Least-squares Line
Given a set of data points (x1 , y1 ), ..., (xN , yN ), where the abscissas {xk }
are distinct, one goal of numerical methods is to determine a formula
y = f (x) that relates these variables. This section emphasizes fitting the
data to linear functions of the form,
(1) y = f (x) = Ax + B
Professor Henry Arguello Numerical methods March 12, 2018 2 / 33
Least-squares Line
If all the numerical values xk , yk are known to several significant dig-
its of accuracy, then polynomial interpolation can be used successfully;
otherwise it can not. Often there is an experimental error in the mea-
surements, and although three digits are recorded for the values xk and
yk , it is realized that the true value f (xk ) satisfies
(2) f (xk ) = yk + 2ek ,
Where ek is the measurement error.
How do we find the best linear approximation of the form (1) that goes
near (not always through) the points? To answer this question, we need
to discuss the errors (also called derivations or residuals):
(3) ek = f (xk − yk ) for 1 ≤ k ≤ N.
Professor Henry Arguello Numerical methods March 12, 2018 3 / 33
Least-squares Line
There are several norms that can be used with the residuals in (3) to
measure how far the curve y = f (x) lies from the data
Professor Henry Arguello Numerical methods March 12, 2018 4 / 33
Least-squares Line
There are several norms that can be used with the residuals in (3) to
measure how far the curve y = f (x) lies from the data
(4) Maximun error: E∞(f ) = max|f (xk − yk )| 1 <= k <= N,
Professor Henry Arguello Numerical methods March 12, 2018 4 / 33
Least-squares Line
There are several norms that can be used with the residuals in (3) to
measure how far the curve y = f (x) lies from the data
(4) Maximun error: E∞(f ) = max|f (xk − yk )| 1 <= k <= N,
1 PN
(5) Average error: E1 (f ) = |f (xk − yk )|,
N k=1
Professor Henry Arguello Numerical methods March 12, 2018 4 / 33
Least-squares Line
There are several norms that can be used with the residuals in (3) to
measure how far the curve y = f (x) lies from the data
(4) Maximun error: E∞(f ) = max|f (xk − yk )| 1 <= k <= N,
1 PN
(5) Average error: E1 (f ) = |f (xk − yk )|,
N k=1
1 PN
(6) Root-Mean-Square error: E2 (f ) = ( |f (xk − yk )|2 )1/2 .
N k=1
Professor Henry Arguello Numerical methods March 12, 2018 4 / 33
Least-squares Line
Example Compare the maximum error, average error and RMS error
for the linear approximation y = f (x) = 8.61.6x to the data points
(-1,10), (0,9), (1,7), (2,5), (3,4), (4,3), (5,0) and (6,-1).
The errors are found using the values for f (xk ) and ek given in table
xk yk f (xk ) = 8.6 − 1.6xk ek e2k
-1 10.0 10.2 0.2 0.04
0 9.0 8.6 0.4 0.16
1 7.0 7.0 0.0 0.00
2 5.0 5.4 0.4 0.16
3 4.0 3.8 0.2 0.04
4 3.0 2.2 0.8 0.64
6 -1.0 -1.0 0.0 0.00
2.6 1.40
Professor Henry Arguello Numerical methods March 12, 2018 5 / 33
Least-squares Line
(7) E∞ (f ) = max{0.2, 0.4, 0.0, 0.4, 0.2, 0.8, 0.6, 0.0} = 0.8,
1
(8) E1 (f ) = (2.6) = 0.325,
8
1
(9) E2 (f ) = ( (1.4)1/2 ) ≈ 0.41833.
N
We can see that the maximum error is largest, and if one point is badly
in error, its value determines E1 (f ). The average error E1 (f ) simply
averages the absolute value of the error at the various points. It is often
used because it is easy to compute. The error E2 (f ) is often used when
the statistical nature of the error is considered.
A best-fitting line is found by minimizing one of the quantities in equa-
tions (4) through (6). Hence there are three best-fitting lines that we
could find.
Professor Henry Arguello Numerical methods March 12, 2018 6 / 33
Least-squares Line
Finding the Least-Squares line
Let {(xk , yk )}Nk=1 be a set of N points, where the abscissas {xk } are
distinct. The least-squares line y = f (x) = Ax + B is the line that
minimizes the Root-Mean-Square error E2 (f ).
The quantity E2 (f ) will be a minimum if and only if the quantity
PN
N(E2 (f ))2 = k=1 (Axk + Byk )2 is a minimum.
The latter is visualized geometrically by minimizing the sum of the squares
of the vertical distances from the points to the line.
Professor Henry Arguello Numerical methods March 12, 2018 7 / 33
Least-squares Line
Theorem: Least-Squares Line
Suppose that {(xk , yk )}Nk=1 are N points, where the abscissas {xk } are
distinct. The coefficients of the least-squares line
y = Ax + B
are the solution of the following linear system, known as the normal
equations:
P P
N 2 N PN
(10) k=1 xk A + k=1 xk B = k=1 xk , yk ,
P
N PN
k=1 xk A + NB = k=1 yk .
Professor Henry Arguello Numerical methods March 12, 2018 8 / 33
Least-squares Line
Proof. Geometrically, we start with line y = Ax + B. The vertical
distance dk from the point (xk , yk ) to the point (xk , Axk + B) on the line is
dk = |Axk + Byk |
Professor Henry Arguello Numerical methods March 12, 2018 9 / 33
Least-squares Line
PN PN
(11) E(A, B) = k=1 (Axk + B − yk )2 = 2
k=1 dk
The minimum value of E(A, B) is determined by setting the partial deriva-
tives ∂E/∂Aand ∂E/∂B equal to zero and solving these equations for A
and B. Notice that {xk } and {yk } are constants in equation and that A
and B are the variables! Hold B fixed, differentiate E(A, B) with respect
to A and get
∂E(A,B)
= Nk=1 2(Axk + B − yk )(xk ) = 2 Nk=1 (Axk2 + Bxk − yk xk )
P P
(12)
∂A
Now hold A fixed and differentiate E(A, B) with respect to B and get
∂E(A,B) PN
= k=1 2(Axk + B − yk ) = 2 Nk=1 (Axk + B − yk )
P
(13)
∂A
Professor Henry Arguello Numerical methods March 12, 2018 10 / 33
Least-squares Line
Setting the partial derivatives equal to zero in (12) and (13), use the
distributive properties of summation toPobtain
0 = k=1 (Axk + Bxk yk xk ) = A Nk=1 xk2 + B Nk=1 xk − Nk=1 yk xk
PN 2
P P
(14)
0 = Nk=1 (Axk + B − yk ) = A Nk=1 xk + NB − Nk=1 yk
P P P
(15)
Equations (14) and (15) can be rearranged in the standard form for a
system and result in the normal equations (10).
Example
Find the least-squares line for the data points given in the above ex-
ample. The sums required for the normal equations (10) are easily
obtained using the values in the table.
Professor Henry Arguello Numerical methods March 12, 2018 11 / 33
Least-squares Line
k xk yk xk2 xk yk
0 -1 10 1 -10
1 0 9 0 0
2 1 7 1 7
3 2 5 4 10
4 3 4 9 12
5 4 3 16 12
6 5 0 25 0
7
P 6 -1 36 -6
20 37 92 25
The linear system involving A and B is
92A + 20B = 25
20A + 8B = 37
The solution of the linear system is A ≈ −1.6071429 and
B ≈ 8.6428571. Therefore, the least-squares line is (see figure)
y = −1.6071429x + 8.6428571
Professor Henry Arguello Numerical methods March 12, 2018 12 / 33
Least-squares Line
Power Fit y = AxM
Some situations involve f (x) = AxM , where M is a known constant. In
these cases there is only one parameter A to be determined.
Teorem: Power Fit
Suppose that {xk , yk }M
k=1 are N points, where the abscissas are distinct.
The coefficient A of the least-squares power curve y = AxM is given by
A = ( Nk=1 xkM yk )/ Nk=1 xk2M )
P P
(16)
Using the least-squares technique, we seek a minimum of the function
E(A):
E(A) = Nk=1 (AxkM yk )2
P
(17)
Professor Henry Arguello Numerical methods March 12, 2018 13 / 33
Least-squares Line
In this case it will satisfy to solve E0 (A) = 0. The derivative is
PN PN
(18) E0 (A) = 2 M
k=1 (Axk − yk )(xkM ) = 2 2M
k=1 (Axk − xkM yk )
Hence the coefficient A is the solution of the equation
0 = A Nk=1 xk2M − Nk=1 xkM yk ,
P P
(19)
which reduces to the formula in equation (16).
Example
Students collected the experimental data in table. The relation is d =
1 2
2 gt , where d is distance in meters and t is time in seconds. Find the
gravitational constant g
Professor Henry Arguello Numerical methods March 12, 2018 14 / 33
Least-squares Line
Time, tk Distance, dk dk tk2 tk4
0.200 0.1960 0.00784 0.0016
0.400 0.7850 0.12560 0.0256
0.600 1.7665 0.63594 0.1296
0.800 3.1405 2.00992 0.4096
1.000 4.9075 4.90750 1.0000
7.68680 1.5664
The values in table are use to find the summations required in formula
(16), where the power used is M = 2.
The coefficient A = 7.68680/1.5664 = 4.9073, and we get d = 4.9073t2
and g = 2A = 9.7146m/sec2 .
Professor Henry Arguello Numerical methods March 12, 2018 15 / 33
Methods of Curve Fitting
Data Linearization Method for y = CeAx
Suppose that we are given the points (x1 , y1 ), (x2 , y2 ), ..., (xN , yN ) and
want to fit an exponential curve of the form
(1) y = CeAx
The first step is to take the logarithm of both sides:
(2) In(y) = Ax + In(C).
Then introduce the change of variables:
(3) Y = In(y), X = x, and B=ln(C).
This results in a linear relation between the new variables X and Y:
(4) Y = AX + B
Professor Henry Arguello Numerical methods March 12, 2018 16 / 33
Methods of Curve Fitting
The original points (xk , yk ) in the xy-plane are transformed into the
points (Xk , Yk ) = (xk , ln(yk )) in the XY-plane. This process is called data
linearization. Then the least-squares line (4) is fit to the points
{(Xk , Yk )}. The normal equations for finding A and B are
P P P
N 2 A+ N N
(5) k=1 kX k=1 k =
X k=1 Xk Yk ,
P
N
A + NB = Nk=1 Yk .
P
k=1 X k
After A and B have been found, the parameter C in equation (1) is com-
puted:
(6) C = eB .
Professor Henry Arguello Numerical methods March 12, 2018 17 / 33
Methods of Curve Fitting
Example
Use the data linearization method and find the exponential fit y = CeAx
for the five data points (0, 1.5), (1, 2.5), (2, 3.5), (3, 5.0), and (4, 7.5).
Apply the transformation (3) to the original points and obtain
xk yk Xk Yk = In(yk ) Xk2 X k Yk
0.0 1.5 0.0 0.405465 0.0 0.000000
1.0 2.5 1.0 0.916291 1.0 0.916291
2.0 3.5 2.0 1.252763 4.0 2.505526
3.0 5.0 3.0 1.609438 9.0 4.828314
4.0 7.5 4.0 2.014903 16.0 8.059612
10.0 6.198860 30 16.309743
These transformed points are shown in figure and exhibit a linearized
form. The equation of the least-squares line Y = AX + B for the points
in the table is in the next figure
(8) Y = 0.391202X + 0.457367
Professor Henry Arguello Numerical methods March 12, 2018 18 / 33
Methods of Curve Fitting
Figure: The transformed data points (Xk , Yk )
Calculation of the coefficients for the normal equations in (5) is shown
in table. The resulting linear system (5) for determining A and B is
(9) 30A + 10B = 16.309742
30A + 5B = 6.198860
Professor Henry Arguello Numerical methods March 12, 2018 19 / 33
Methods of Curve Fitting
The solution is A = 0.3912023 and B = 0.47367. Then C is obtained
wtih the calculation C = e0.457367 = 1.579910, and these values for A
and C are substituted into equation (1) to obtain the exponential fit
(10) y = 1.579910e0.457367 (fit by data linearization).
Professor Henry Arguello Numerical methods March 12, 2018 20 / 33
Methods of Curve Fitting
Nonlinear Least-Squares Method for y = CeAx Suppose that we are
given the points (x1, y1), (x2, y2), ..., (xN, yN) and we want to fit an
exponential curve:
(11) y = CeAx .
The nonlinear least-squares procedure requires that we find a
minimum of
E(A, B) = Nk=1 (CeAxk − yk ).
P
(12)
The partial derivatives of E(A, B) with respect to A and C are
∂E
= 2 Nk=1 (CeAxk − yk )(Cxk eAxk )
P
(13)
∂A
and
∂E
= 2 Nk=1 (CeAxk − yk )(xk eAxk ).
P
(14)
∂C
When the partial derivatives in (13) and (14) are set equal to zero and
then simplified, the resulting normal equations are
Professor Henry Arguello
PN Numerical
2Ax methods N
P Ax March 12, 2018 21 / 33
k k
Methods of Curve Fitting
Transformations for Data Linearization
The technique of data linearization has been used by scientists to fit
curves such as y = CeAx , y = Aln(x) + B, and y = A/x + B. Once the
curve has been chosen, a suitable transformation of the variables must
be found so that a linear relation is obtained. For example, we can
verify that y = D/(x + C) is transformed into a linear problem
Y = AX + B by using the change of variables (and constants)
X = xy, Y = y, C = 1/A, and D = B/A.
Professor Henry Arguello Numerical methods March 12, 2018 22 / 33
Methods of Curve Fitting
Linear Least Squares
The linear least-squares problem is stated as follows. Suppose that
N data points {(Xk , Yk )} and a set of M linear independent functions
{fj (X)} are given. We want to find M coefficients {cj } so that the function
f (x) given by the linear combination
PM
(16) f (x) = j=1 cj fj (x)
will minimize the sum of the squares of the errors:
2
N
X N
X M
X
(17) E(c1 , c2 , ..., cM ) = (f (xk )−yk )2 = cj fj (xk ) − yk .
k=1 k=1 j=1
Professor Henry Arguello Numerical methods March 12, 2018 23 / 33
Methods of Curve Fitting
For E to be minimized it is necessary that each partial derivative be
zero (i.e, ∂E/ci = 0 for i = 1, 2, ..., M), and this results in the system of
equations
N
X XM
(18) cj fj (xk ) − yk (fi (xk )) = 0 for i = 1, 2, ..., M.
k=1 j=1
Interchanging the order of the summations in (18) will produce an M ×
M system of linear equations where the unknowns are the coefficients
{cj }. They are called the normal equations:
M N
! N
X X X
(19) fi (xk )fi (xk ) cj = fi (xk )yk for i = 1, 2, ..., M.
j=1 k=1 k=1
Professor Henry Arguello Numerical methods March 12, 2018 24 / 33
Methods of Curve Fitting
Matrix Formulation
Al though (19) is easily recognized as a system of M linear equations
in M unknowns, one must be clever so that wasted computations are
not performed when writing the system in matrix notation. The key is to
write dawn the matrices F and F’ as follows:
f1 (x1 ) f2 (x1 ) ··· fM (x1 ) f1 (x1 ) f1 (x2 ) · · · f1 (xN )
f1 (x2 ) f2 (x2 ) ··· fM (x2 ) f2 (x1 ) f2 (x2 ) · · · f2 (xN )
F = . .. , F’ = .. .. .
.. ..
.. . . . . .
f1 (xN ) f2 (xN ) · · · fM (xN ) fM (x1 ) fM (x2 ) · · · fM (xN )
Professor Henry Arguello Numerical methods March 12, 2018 25 / 33
Methods of Curve Fitting
Consider the product of F and the column matrix Y :
f1 (x1 ) f1 (x2 ) · · · f1 (xN ) y1
f2 (x1 ) f2 (x2 ) · · · f2 (xN ) y2
(20) F’Y = .
. .. .. ..
. . . .
fM (x1 ) fM (x2 ) · · · fM (xN ) yN
The element in the ith row of the product F’Y in (20) is the same as the
ith element in the column matrix in equation (19); that is,
PN 0
(21) k=1 fi (xk)yk = rowi − F’ . y1 y2 · · · yN
Professor Henry Arguello Numerical methods March 12, 2018 26 / 33
Methods of Curve Fitting
Now consider the product F’F , which is an M × M matrix. The element
in the ith row and jth column of F?F is the coefficient of cj in the ith row
in equation (19); that is,
PN
(22) k=1 fi (xk )fj (xk ) = fi (x1 )fj (x1 ) + fi (x2 )fj (x2 ) + · · · + fi (xN )fj (xN ).
When M is small, a computationally efficient way to calculate the linear
least-squares coefficients for (16) is to store the matrix F,compute F 0 F,
and F 0 Y and then solve the linear system
(23) F’FC = F’Y for the coefficient matrix C
Professor Henry Arguello Numerical methods March 12, 2018 27 / 33
Methods of Curve Fitting
Polynomial Fitting
When the foregoing method is adapted to using the functions {fi (x) =
xj−1 } and the index of summation ranges from j = 1 to j = M + 1, the
function f (x) will be a polynomial of degree M:
(24) f (x) = c1 + c2 x + c3 x2 + ... + cM + 1xM
Least-Squares Parabola
Suppose that {(Xk, Yk)}Nk=1 are N points, where the abscissas are
distinct. The coefficients of the least-squares parabola
(25) y = f (x) = Ax2 + Bx + C
Professor Henry Arguello Numerical methods March 12, 2018 28 / 33
Methods of Curve Fitting
are the solution values A, B and C of the linear system
P P P
N 4 A+ N 3 B+ N 2 C =
PN 2
x
k=1 k x
k=1 k x
k=1 k k=1 yk xk ,
P P P
N 3 N 2 N PN
(26) k=1 xk A+ k=1 xk B+ k=1 xk C= k=1 yk xk ,
P P
N 2 N PN
k=1 xk A+ k=1 xk B + NC = k=1 yk
Proof. The coefficients A, B, and C will minimize the quantity:
E(A, B, C) Nk=1 (Axk2 + Bxk + C − yk )2 .
P
(27)
Professor Henry Arguello Numerical methods March 12, 2018 29 / 33
Methods of Curve Fitting
The partial derivatives ∂E/∂A, ∂E/∂B, and ∂E/∂C must be zero. This
results in (28)
0 = ∂E(A, B, C)/∂A = 2 Nk=1 (Axk2 + Bxk ∗ C − yk )(xk2 ),
P
0 = ∂E(A, B, C)/∂B = 2 Nk=1 (Axk2 + Bxk ∗ C − yk )(xk ),
P
0 = ∂E(A, B, C)/∂B = 2 Nk=1 (Axk2 + Bxk ∗ C − yk )(1).
P
Using the distributive property of addition, we can move the values A,
B, and C outside the summations in (28) to obtain the normal
equations that are given in (28).
Professor Henry Arguello Numerical methods March 12, 2018 30 / 33
Methods of Curve Fitting
Polynomial Wiggle
It is tempting to used a least-squares polynomial to fit data that are non-
lineal. But if the data do not exhibit a polynomial nature, the resulting
curve may exhibit large oscillations. This phenomenum, called polyno-
mial wiggle, becomes more pronounced with higher-degree polynomi-
als. For this reason we seldom use a polynomial of degree 6 or above
unless it is known that the true function we are working with is a poly-
nomial.
Professor Henry Arguello Numerical methods March 12, 2018 31 / 33
Methods of Curve Fitting
For example let f (x) = 1.44/x2 + 0.24x be used to generate the six
data points (0.25,23.1), (1.0,1.68), (1.5,1.0), (2.0,0.84), (2.4,0.826), and
(5.0,1.2576). The result of curve fitting with the least-square polynomi-
als
P2 (x) = 22.93 − 16.96x + 2.553x2 ,
P3 (x) = 33.04 − 46.51x + 19.51x2 − 2.2963 ,
P4 (x) = 39.92 − 80.93x + 58.39x2 − 17.15x3 + 1.680x4 ,
P5 (x) = 46.02 − 118.1x + 119.4x2 − 57.51x3 + 13.03x4 − 1.085x5
is shown in Figure through (d). Notice that P3 (x), P4 (x), and P5 (x), ex-
hibit a large wiggle in the interval [2,5]. Even though P5 (x) goes through
the 6 points, it produces the worst fit. If we must fit a polynomial to this
data, P2 (x) should be the choice.
Professor Henry Arguello Numerical methods March 12, 2018 32 / 33
Methods of Curve Fitting
Figure: (a) using P2 (x) to fit data. (b) using P3 (x) to fit data. (c) using P4 (x) to
fit data. (d) using P5 (x) to fit data
Professor Henry Arguello Numerical methods March 12, 2018 33 / 33