Jim Lambers
MAT 772
Fall Semester 2010-11
Lecture 4 Notes
These notes correspond to Sections 1.5 and 1.6 in the text.
The Secant Method
One drawback of Newton’s method is that it is necessary to evaluate 𝑓 ′ (𝑥) at various points, which
may not be practical for some choices of 𝑓 . The secant method avoids this issue by using a finite
difference to approximate the derivative. As a result, 𝑓 (𝑥) is approximated by a secant line through
two points on the graph of 𝑓 , rather than a tangent line through one point on the graph.
Since a secant line is defined using two points on the graph of 𝑓 (𝑥), as opposed to a tangent
line that requires information at only one point on the graph, it is necessary to choose two initial
iterates 𝑥0 and 𝑥1. As in Newton’s method, the next iterate 𝑥2 is then obtained by computing
the 𝑥-value at which the secant line passing through the points (𝑥0 , 𝑓 (𝑥0 )) and (𝑥1 , 𝑓 (𝑥1 )) has a
𝑦-coordinate of zero. This yields the equation
\[
\frac{f(x_1) - f(x_0)}{x_1 - x_0}\,(x_2 - x_1) + f(x_1) = 0,
\]
which has the solution
\[
x_2 = x_1 - \frac{f(x_1)(x_1 - x_0)}{f(x_1) - f(x_0)},
\]
which can be rewritten as follows:
\begin{align*}
x_2 &= x_1 - \frac{f(x_1)(x_1 - x_0)}{f(x_1) - f(x_0)} \\
    &= x_1\,\frac{f(x_1) - f(x_0)}{f(x_1) - f(x_0)} - \frac{f(x_1)(x_1 - x_0)}{f(x_1) - f(x_0)} \\
    &= \frac{x_1 (f(x_1) - f(x_0)) - f(x_1)(x_1 - x_0)}{f(x_1) - f(x_0)} \\
    &= \frac{x_1 f(x_1) - x_1 f(x_0) - x_1 f(x_1) + x_0 f(x_1)}{f(x_1) - f(x_0)} \\
    &= \frac{x_0 f(x_1) - x_1 f(x_0)}{f(x_1) - f(x_0)}.
\end{align*}
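As a quick check on the algebra above, the following minimal sketch evaluates both forms of the secant update in exact rational arithmetic and confirms that they agree. The function 𝑓(𝑥) = 𝑥² − 2, the starting values, and the helper names `secant_step_correction` and `secant_step_combined` are assumptions chosen only for illustration.

```python
from fractions import Fraction


def f(x):
    # Illustrative function (an assumption): f(x) = x^2 - 2
    return x * x - 2


def secant_step_correction(x0, x1):
    # First form: x2 = x1 - f(x1)(x1 - x0) / (f(x1) - f(x0))
    return x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))


def secant_step_combined(x0, x1):
    # Second form: x2 = (x0 f(x1) - x1 f(x0)) / (f(x1) - f(x0))
    return (x0 * f(x1) - x1 * f(x0)) / (f(x1) - f(x0))


# Exact rational inputs, so the comparison is free of rounding error.
x0, x1 = Fraction(1), Fraction(3, 2)
first = secant_step_correction(x0, x1)
second = secant_step_combined(x0, x1)
print(first, second)  # both forms give 7/5
```

The two forms are algebraically identical. In floating-point arithmetic they can round differently, which is one reason an implementation might prefer the first form, where the update is written as a correction to 𝑥1.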
This leads to the following algorithm.
Algorithm (Secant Method) Let 𝑓 : ℝ → ℝ be a continuous function. The following algorithm
computes an approximate solution 𝑥∗ to the equation 𝑓 (𝑥) = 0.
Choose two initial guesses $x_0$ and $x_1$
for $k = 1, 2, 3, \ldots$ do
    if $|f(x_k)|$ is sufficiently small then
        $x^* = x_k$
        return $x^*$
    end
    $x_{k+1} = \dfrac{x_{k-1} f(x_k) - x_k f(x_{k-1})}{f(x_k) - f(x_{k-1})}$
    if $|x_{k+1} - x_k|$ is sufficiently small then
        $x^* = x_{k+1}$
        return $x^*$
    end
end
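The algorithm above can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the name `secant`, the tolerance `tol`, the iteration cap `max_iter`, and the guard against a horizontal secant line are all assumptions that the notes do not specify.

```python
def secant(f, x0, x1, tol=1e-12, max_iter=50):
    """Approximate a solution of f(x) = 0 by the secant iteration.

    tol and max_iter are illustrative choices, not taken from the notes.
    """
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if abs(f1) <= tol:            # residual test: |f(x_k)| is small
            return x1
        denom = f1 - f0
        if denom == 0:                # horizontal secant line: no next iterate
            raise ZeroDivisionError("f(x_k) equals f(x_{k-1})")
        x2 = (x0 * f1 - x1 * f0) / denom
        if abs(x2 - x1) <= tol:       # step test: |x_{k+1} - x_k| is small
            return x2
        # Shift the pair of iterates; only one new evaluation of f per step.
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
    raise RuntimeError("secant iteration did not converge")


root = secant(lambda x: x * x - 2, 1.0, 1.5)
print(root)
```

Storing the previous function value means each iteration costs one new evaluation of 𝑓 and no derivative evaluations.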
As with Newton’s method, the starting iterates 𝑥0 and 𝑥1 must be chosen reasonably close
to the solution 𝑥∗ . Convergence is not as rapid as that of Newton’s Method, since the secant-line
approximation of 𝑓 is not as accurate as the tangent-line approximation employed by Newton’s
method.
Example We will use the Secant Method to solve the equation 𝑓 (𝑥) = 0, where 𝑓 (𝑥) = 𝑥² − 2.
This method requires that we choose two initial iterates 𝑥0 and 𝑥1 , and then compute subsequent
iterates using the formula
\[
x_{n+1} = x_n - \frac{f(x_n)(x_n - x_{n-1})}{f(x_n) - f(x_{n-1})}, \quad n = 1, 2, 3, \ldots.
\]
We choose 𝑥0 = 1 and 𝑥1 = 1.5. Applying the above formula, we obtain
𝑥2 = 1.4
𝑥3 = 1.41379310344828
𝑥4 = 1.41421568627451.
As we can see, the iterates produced by the Secant Method are converging to the exact solution
𝑥∗ = √2, but not as rapidly as those produced by Newton’s Method. □
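The iterates in this example can be reproduced with a short script, sketched below. For comparison it also runs three Newton steps. The Newton starting value 𝑥0 = 1.5 and the variable names are assumptions made here for illustration; the notes do not specify them.

```python
import math


def f(x):
    return x * x - 2


def fprime(x):
    # Derivative of f, needed only for the Newton comparison.
    return 2 * x


# Secant iterates from x0 = 1 and x1 = 1.5, as in the example.
secant_iterates = [1.0, 1.5]
for _ in range(3):
    xa, xb = secant_iterates[-2], secant_iterates[-1]
    secant_iterates.append(xb - f(xb) * (xb - xa) / (f(xb) - f(xa)))

# Newton iterates from x0 = 1.5 (assumed starting value for the comparison).
newton_iterates = [1.5]
for _ in range(3):
    x = newton_iterates[-1]
    newton_iterates.append(x - f(x) / fprime(x))

exact = math.sqrt(2)
for k, x in enumerate(secant_iterates):
    print(f"secant x{k} = {x:.14f}  error = {abs(x - exact):.2e}")
for k, x in enumerate(newton_iterates):
    print(f"newton x{k} = {x:.14f}  error = {abs(x - exact):.2e}")
```

Starting from 1.5, three Newton steps leave a smaller error than three secant steps, which is consistent with the remark above.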
We now prove that the Secant Method converges if 𝑥0 and 𝑥1 are chosen sufficiently close to a solution
𝑥∗ of 𝑓 (𝑥) = 0, provided that 𝑓 is continuously differentiable near 𝑥∗ and 𝑓 ′ (𝑥∗ ) = 𝛼 ≠ 0. Without loss of
generality, we assume 𝛼 > 0. Then, by the continuity of 𝑓 ′ , there exists an interval 𝐼𝛿 = [𝑥∗ − 𝛿, 𝑥∗ + 𝛿]
such that
\[
\frac{3\alpha}{4} \le f'(x) \le \frac{5\alpha}{4}, \quad x \in I_\delta.
\]
It follows from the Mean Value Theorem that
\begin{align*}
x_{k+1} - x^* &= x_k - x^* - f(x_k)\,\frac{x_k - x_{k-1}}{f(x_k) - f(x_{k-1})} \\
              &= x_k - x^* - \frac{f'(\theta_k)(x_k - x^*)}{f'(\varphi_k)} \\
              &= \left[ 1 - \frac{f'(\theta_k)}{f'(\varphi_k)} \right] (x_k - x^*),
\end{align*}
where 𝜃𝑘 lies between 𝑥𝑘 and 𝑥∗ , and 𝜑𝑘 lies between 𝑥𝑘 and 𝑥𝑘−1 . Therefore, if 𝑥𝑘−1 and 𝑥𝑘 are
in 𝐼𝛿 , then so are 𝜑𝑘 and 𝜃𝑘 , and 𝑥𝑘+1 satisfies
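\[
|x_{k+1} - x^*| \le \max\left\{ \left| 1 - \frac{5\alpha/4}{3\alpha/4} \right|, \left| 1 - \frac{3\alpha/4}{5\alpha/4} \right| \right\} |x_k - x^*| \le \frac{2}{3}\,|x_k - x^*|.
\]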
We conclude that if 𝑥0 , 𝑥1 ∈ 𝐼𝛿 , then all subsequent iterates lie in 𝐼𝛿 , and the Secant Method
converges at least linearly, with asymptotic rate constant 2/3.
The order of convergence of the Secant Method can be determined using a result, which we will
not prove here, stating that if $\{x_k\}_{k=0}^{\infty}$ is the sequence of iterates produced by the Secant Method
for solving 𝑓 (𝑥) = 0, and if this sequence converges to a solution 𝑥∗ , then for 𝑘 sufficiently large,
\[
|x_{k+1} - x^*| \approx S\,|x_k - x^*|\,|x_{k-1} - x^*|
\]
for some constant 𝑆.
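We assume that {𝑥𝑘 } converges to 𝑥∗ with order 𝛼, where 𝛼 now denotes the order of convergence
rather than 𝑓 ′ (𝑥∗ ). Then, dividing both sides of the above relation by $|x_k - x^*|^{\alpha}$, we obtain
\[
\frac{|x_{k+1} - x^*|}{|x_k - x^*|^{\alpha}} \approx S\,|x_k - x^*|^{1-\alpha}\,|x_{k-1} - x^*|.
\]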
Because 𝛼 is the order of convergence, the left side must converge to a positive constant 𝐶 as 𝑘 → ∞.
It follows that the right side must converge to a positive constant as well, as must its reciprocal.
In other words, there must exist positive constants 𝐶1 and 𝐶2 such that
\[
\frac{|x_k - x^*|}{|x_{k-1} - x^*|^{\alpha}} \to C_1, \qquad \frac{|x_k - x^*|^{\alpha-1}}{|x_{k-1} - x^*|} \to C_2.
\]
This can only be the case if there exists a nonzero constant 𝛽 such that
\[
\frac{|x_k - x^*|}{|x_{k-1} - x^*|^{\alpha}} = \left( \frac{|x_k - x^*|^{\alpha-1}}{|x_{k-1} - x^*|} \right)^{\beta},
\]
which implies that
1 = (𝛼 − 1)𝛽 and 𝛼 = 𝛽.
Eliminating 𝛽, we obtain the equation
\[
\alpha^2 - \alpha - 1 = 0,
\]
which has the solutions
\[
\alpha_1 = \frac{1 + \sqrt{5}}{2} \approx 1.618, \qquad \alpha_2 = \frac{1 - \sqrt{5}}{2} \approx -0.618.
\]
Since we must have 𝛼 > 1, the order of convergence is (1 + √5)/2 ≈ 1.618.
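The order derived above can be checked numerically. If the errors 𝑒𝑘 = ∣𝑥𝑘 − 𝑥∗∣ behave like 𝑒𝑘+1 ≈ 𝐶𝑒𝑘^𝛼, then the ratio ln(𝑒𝑘+1/𝑒𝑘) / ln(𝑒𝑘/𝑒𝑘−1) should approach 𝛼. The sketch below applies this estimate to 𝑓(𝑥) = 𝑥² − 2 with 𝑥0 = 1 and 𝑥1 = 1.5. The 400-digit working precision and the number of steps are assumptions, chosen so that the errors stay well above the rounding level.

```python
from decimal import Decimal, getcontext

getcontext().prec = 400  # working precision in digits (illustrative choice)


def f(x):
    return x * x - 2


root = Decimal(2).sqrt()

# Secant iterates x_0, ..., x_10 in high-precision arithmetic.
xs = [Decimal(1), Decimal("1.5")]
for _ in range(9):
    xa, xb = xs[-2], xs[-1]
    xs.append(xb - f(xb) * (xb - xa) / (f(xb) - f(xa)))

errors = [abs(x - root) for x in xs]

# Order estimate from three consecutive errors:
# p_k = ln(e_{k+1} / e_k) / ln(e_k / e_{k-1})
orders = [
    float((errors[k + 1] / errors[k]).ln() / (errors[k] / errors[k - 1]).ln())
    for k in range(1, len(errors) - 1)
]
for k, p in enumerate(orders, start=1):
    print(f"k = {k}: estimated order {p:.4f}")
```

The early estimates oscillate, and the later ones settle near 1.618. High-precision arithmetic is used because in double precision the iteration reaches rounding level after only a few steps, leaving too few errors to estimate the order.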
The Bisection Method
Suppose that 𝑓 (𝑥) is a continuous function that changes sign on the interval [𝑎, 𝑏]. Then, by the
Intermediate Value Theorem, 𝑓 (𝑥) = 0 for some 𝑥 ∈ [𝑎, 𝑏]. How can we find the solution, knowing
that it lies in this interval?
The method of bisection attempts to reduce the size of the interval in which a solution is known
to exist. Suppose that we evaluate 𝑓 (𝑚), where 𝑚 = (𝑎 + 𝑏)/2. If 𝑓 (𝑚) = 0, then we are done.
Otherwise, 𝑓 must change sign on the interval [𝑎, 𝑚] or [𝑚, 𝑏], since 𝑓 (𝑎) and 𝑓 (𝑏) have different
signs. Therefore, we can cut the size of our search space in half, and continue this process until the
interval of interest is sufficiently small, in which case we must be close to a solution. The following
algorithm implements this approach.
Algorithm (Bisection) Let 𝑓 be a continuous function on the interval [𝑎, 𝑏] that changes sign on
(𝑎, 𝑏). The following algorithm computes an approximation 𝑝∗ to a number 𝑝 in (𝑎, 𝑏) such that
𝑓 (𝑝) = 0.
for $j = 1, 2, \ldots$ do
    $p_j = (a + b)/2$
    if $f(p_j) = 0$ or $b - a$ is sufficiently small then
        $p^* = p_j$
        return $p^*$
    end
    if $f(a) f(p_j) < 0$ then
        $b = p_j$
    else
        $a = p_j$
    end
end
At the beginning, it is known that (𝑎, 𝑏) contains a solution. During each iteration, this
algorithm updates the interval (𝑎, 𝑏) by checking whether 𝑓 changes sign in the first half (𝑎, 𝑝𝑗 ), or in
the second half (𝑝𝑗 , 𝑏). Once the correct half is found, the interval (𝑎, 𝑏) is set equal to that half.
Therefore, at the beginning of each iteration, it is known that the current interval (𝑎, 𝑏) contains a
solution.
The test 𝑓 (𝑎)𝑓 (𝑝𝑗 ) < 0 is used to determine whether 𝑓 changes sign in the interval (𝑎, 𝑝𝑗 ) or
(𝑝𝑗 , 𝑏). This test is more efficient than checking whether 𝑓 (𝑎) is positive and 𝑓 (𝑝𝑗 ) is negative, or
vice versa, since we do not care which value is positive and which is negative. We only care whether
they have different signs, and if they do, then their product must be negative.
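The bisection loop and the sign test 𝑓(𝑎)𝑓(𝑝𝑗) < 0 described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the name `bisect`, the tolerance `tol`, and the cap `max_iter` are not from the notes. The sketch also stores 𝑓(𝑎) between iterations instead of re-evaluating it.

```python
def bisect(f, a, b, tol=1e-10, max_iter=200):
    """Approximate a zero of f in (a, b), assuming f(a) and f(b) differ in sign."""
    fa = f(a)
    if fa * f(b) >= 0:
        raise ValueError("f must change sign on (a, b)")
    for _ in range(max_iter):
        p = (a + b) / 2
        fp = f(p)
        if fp == 0 or b - a <= tol:
            return p
        if fa * fp < 0:
            b = p            # sign change in (a, p): keep the first half
        else:
            a, fa = p, fp    # otherwise the sign change is in (p, b)
    return p


approx = bisect(lambda x: x * x - 2, 1.0, 2.0)
print(approx)
```

Only the sign of the product matters. For function values of very small magnitude the product could underflow to zero in floating-point arithmetic, so a more defensive implementation might compare signs directly.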
In comparison to other methods, including some that we will discuss, bisection tends to converge
rather slowly, but it is also guaranteed to converge. These qualities can be seen in the following
result concerning the accuracy of bisection.
Theorem Let 𝑓 be continuous on [𝑎, 𝑏], and assume that 𝑓 (𝑎)𝑓 (𝑏) < 0. For each positive integer
𝑛, let 𝑝𝑛 be the 𝑛th iterate that is produced by the bisection algorithm. Then the sequence $\{p_n\}_{n=1}^{\infty}$
converges to a number 𝑝 in (𝑎, 𝑏) such that 𝑓 (𝑝) = 0, and each iterate 𝑝𝑛 satisfies
\[
|p_n - p| \le \frac{b - a}{2^n}.
\]
It should be noted that because the 𝑛th iterate can lie anywhere within the interval (𝑎, 𝑏) that is
used during the 𝑛th iteration, it is possible that the error bound given by this theorem may be
quite conservative.
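As a worked instance of this bound, one can solve for the number of iterations that guarantees a prescribed accuracy. The tolerance 𝜀 below is a symbol introduced here for illustration; it does not appear in the theorem.

```latex
\[
\frac{b-a}{2^n} \le \varepsilon
\quad\Longleftrightarrow\quad
n \ge \log_2\!\left(\frac{b-a}{\varepsilon}\right).
\]
% Example: with b - a = 1 and \varepsilon = 10^{-6},
% n \ge \log_2(10^6) \approx 19.93, so n = 20 iterations suffice.
```

Because the bound may be conservative, the actual error can fall below 𝜀 in fewer iterations, but this count is always sufficient.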
Example We seek a solution of the equation 𝑓 (𝑥) = 0, where
𝑓 (𝑥) = 𝑥² − 𝑥 − 1.
Because 𝑓 (1) = −1 and 𝑓 (2) = 1, and 𝑓 is continuous, we can use the Intermediate Value Theorem
to conclude that 𝑓 (𝑥) = 0 has a solution in the interval (1, 2), since 𝑓 (𝑥) must assume every value
between −1 and 1 in this interval.
We use the method of bisection to find a solution. First, we compute the midpoint of the
interval, which is (1 + 2)/2 = 1.5. Since 𝑓 (1.5) = −0.25, we see that 𝑓 (𝑥) changes sign between
𝑥 = 1.5 and 𝑥 = 2, so we can apply the Intermediate Value Theorem again to conclude that
𝑓 (𝑥) = 0 has a solution in the interval (1.5, 2).
Continuing this process, we compute the midpoint of the interval (1.5, 2), which is (1.5 + 2)/2 =
1.75. Since 𝑓 (1.75) = 0.3125, we see that 𝑓 (𝑥) changes sign between 𝑥 = 1.5 and 𝑥 = 1.75, so we
conclude that there is a solution in the interval (1.5, 1.75). The following table shows the outcome
of several more iterations of this procedure. Each row shows the current interval (𝑎, 𝑏) in which
we know that a solution exists, as well as the midpoint of the interval, given by (𝑎 + 𝑏)/2, and the
value of 𝑓 at the midpoint. Note that from iteration to iteration, only one of 𝑎 or 𝑏 changes, and
the endpoint that changes is always set equal to the midpoint.
𝑎            𝑏              𝑚 = (𝑎 + 𝑏)/2    𝑓 (𝑚)
1            2              1.5              −0.25
1.5          2              1.75             0.3125
1.5          1.75           1.625            0.015625
1.5          1.625          1.5625           −0.12109
1.5625       1.625          1.59375          −0.053711
1.59375      1.625          1.609375         −0.019287
1.609375     1.625          1.6171875        −0.0018921
1.6171875    1.625          1.62109375       0.0068512
1.6171875    1.62109375     1.619140625      0.0024757
1.6171875    1.619140625    1.6181640625     0.00029087
The correct solution, to ten decimal places, is 1.6180339887, which is the number known as the
golden ratio. □
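The table can be regenerated with the short sketch below, which also compares the actual error of each midpoint with the bound (𝑏 − 𝑎)/2ⁿ from the theorem. The variable names and the choice of ten iterations are assumptions made for illustration.

```python
import math


def f(x):
    return x * x - x - 1


a, b = 1.0, 2.0
width0 = b - a                      # length of the original interval
golden = (1 + math.sqrt(5)) / 2     # the zero of f in (1, 2)

rows = []
for n in range(1, 11):
    m = (a + b) / 2
    bound = width0 / 2 ** n         # error bound from the theorem
    rows.append((n, a, b, m, f(m), abs(m - golden), bound))
    if f(a) * f(m) < 0:
        b = m                       # sign change in (a, m)
    else:
        a = m                       # sign change in (m, b)

for n, lo, hi, m, fm, err, bound in rows:
    print(f"{n:2d}  [{lo:.10f}, {hi:.10f}]  m = {m:.10f}  "
          f"f(m) = {fm: .3e}  error = {err:.3e}  bound = {bound:.3e}")
```

In several rows the actual error is far smaller than the bound, which illustrates the earlier remark that the bound may be conservative.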
For this method, it is easier to determine the order of convergence if we use a different measure
of the error in each iterate 𝑥𝑘 . Since each iterate is contained within an interval [𝑎𝑘 , 𝑏𝑘 ] where
$b_k - a_k = 2^{-k}(b - a)$, with [𝑎, 𝑏] being the original interval, it follows that we can bound the error
$|x_k - x^*|$ by $e_k = b_k - a_k$. Since $e_{k+1} = e_k/2$, we can easily conclude that bisection converges
linearly, with asymptotic error constant 1/2.
Safeguarded Methods
It is natural to ask whether it is possible to combine the rapid convergence of methods such as
Newton’s method with “safe” methods such as bisection that are guaranteed to converge. This
leads to the concept of safeguarded methods, which maintain an interval within which a solution
is known to exist, as in bisection, but use a method such as Newton’s method to find a solution
within that interval. If an iterate falls outside this interval, the safe procedure is used to refine the
interval before trying the rapid method.
An example of a safeguarded method is the method of Regula Falsi, which is also known as the
method of false position. It is a modification of the secant method in which the two initial iterates
𝑥0 and 𝑥1 are chosen so that 𝑓 (𝑥0 ) ⋅ 𝑓 (𝑥1 ) < 0, thus guaranteeing that a solution lies between 𝑥0
and 𝑥1 . This condition also guarantees that the next iterate 𝑥2 will lie between 𝑥0 and 𝑥1 , as can
be seen by applying the Intermediate Value Theorem to the secant line passing through (𝑥0 , 𝑓 (𝑥0 ))
and (𝑥1 , 𝑓 (𝑥1 )).
It follows that if 𝑓 (𝑥2 ) ≠ 0, then a solution must lie between 𝑥0 and 𝑥2 , or between 𝑥1 and
𝑥2 . In the first scenario, we use the secant line passing through (𝑥0 , 𝑓 (𝑥0 )) and (𝑥2 , 𝑓 (𝑥2 )) to
compute the next iterate 𝑥3 . Otherwise, we use the secant line passing through (𝑥1 , 𝑓 (𝑥1 )) and
(𝑥2 , 𝑓 (𝑥2 )). Continuing in this fashion, we obtain a sequence of smaller and smaller intervals that
are guaranteed to contain a solution, as in bisection, but the interval is updated using a superlinearly
convergent method, the secant method, rather than simply being bisected.
Algorithm (Method of Regula Falsi) Let 𝑓 : ℝ → ℝ be a continuous function that changes sign on
the interval (𝑎, 𝑏). The following algorithm computes an approximate solution 𝑥∗ to the equation
𝑓 (𝑥) = 0.
repeat
    $c = \dfrac{a f(b) - b f(a)}{f(b) - f(a)}$
    if $f(c) = 0$ or $b - a$ is sufficiently small then
        $x^* = c$
        return $x^*$
    end
    if $f(a) \cdot f(c) < 0$ then
        $b = c$
    else
        $a = c$
    end
end
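A minimal Python sketch of this algorithm follows. The name `regula_falsi` and the parameters `tol` and `max_iter` are assumptions. The sketch also departs from the pseudocode in one respect, flagged in the code: it stops when ∣𝑓(𝑐)∣ is small, in addition to testing 𝑏 − 𝑎, because one endpoint of the interval can remain fixed for many iterations, in which case 𝑏 − 𝑎 does not become small.

```python
def regula_falsi(f, a, b, tol=1e-12, max_iter=200):
    """Approximate a zero of f in (a, b), assuming f(a) and f(b) differ in sign."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f must change sign on (a, b)")
    for _ in range(max_iter):
        c = (a * fb - b * fa) / (fb - fa)   # zero of the secant line
        fc = f(c)
        # Assumption: a residual test |f(c)| <= tol replaces the exact test
        # f(c) = 0, since b - a alone may never become small.
        if abs(fc) <= tol or b - a <= tol:
            return c
        if fa * fc < 0:
            b, fb = c, fc                   # solution lies in (a, c)
        else:
            a, fa = c, fc                   # solution lies in (c, b)
    return c


approx = regula_falsi(lambda x: x * x - 2, 1.0, 1.5)
print(approx)
```

Because 𝑓(𝑎) and 𝑓(𝑏) always have opposite signs, the denominator 𝑓(𝑏) − 𝑓(𝑎) cannot vanish, and 𝑐 always lies between 𝑎 and 𝑏.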
Example We use the Method of Regula Falsi (False Position) to solve 𝑓 (𝑥) = 0 where 𝑓 (𝑥) = 𝑥² − 2.
First, we must choose two initial guesses 𝑥0 and 𝑥1 such that 𝑓 (𝑥) changes sign between 𝑥0 and
𝑥1 . Choosing 𝑥0 = 1 and 𝑥1 = 1.5, we see that 𝑓 (𝑥0 ) = 𝑓 (1) = −1 and 𝑓 (𝑥1 ) = 𝑓 (1.5) = 0.25, so
these choices are suitable.
Next, we use the Secant Method to compute the next iterate 𝑥2 by determining the point at
which the secant line passing through the points (𝑥0 , 𝑓 (𝑥0 )) and (𝑥1 , 𝑓 (𝑥1 )) intersects the line 𝑦 = 0.
We have
\begin{align*}
x_2 &= x_0 - \frac{f(x_0)(x_1 - x_0)}{f(x_1) - f(x_0)} \\
    &= 1 - \frac{(-1)(1.5 - 1)}{0.25 - (-1)} \\
    &= 1 + \frac{1.5 - 1}{0.25 + 1} \\
    &= 1 + \frac{0.5}{1.25} \\
    &= 1.4.
\end{align*}
Computing 𝑓 (𝑥2 ), we obtain 𝑓 (1.4) = −0.04 < 0. Since 𝑓 (𝑥2 ) < 0 and 𝑓 (𝑥1 ) > 0, we can use the
Intermediate Value Theorem to conclude that a solution exists in the interval (𝑥2 , 𝑥1 ). Therefore,
we compute 𝑥3 by determining where the secant line through the points (𝑥1 , 𝑓 (𝑥1 )) and (𝑥2 , 𝑓 (𝑥2 ))
intersects the line 𝑦 = 0. Using the formula for the Secant Method, we obtain
\begin{align*}
x_3 &= x_1 - \frac{f(x_1)(x_2 - x_1)}{f(x_2) - f(x_1)} \\
    &= 1.5 - \frac{(0.25)(1.4 - 1.5)}{-0.04 - 0.25} \\
    &= 1.41379.
\end{align*}
Since 𝑓 (𝑥3 ) < 0 and 𝑓 (𝑥2 ) < 0, we do not know that a solution exists in the interval (𝑥2 , 𝑥3 ).
However, we do know that a solution exists in the interval (𝑥3 , 𝑥1 ), because 𝑓 (𝑥1 ) > 0. Therefore,
instead of proceeding as in the Secant Method and using the secant line determined by 𝑥2 and 𝑥3
to compute 𝑥4 , we use the secant line determined by 𝑥1 and 𝑥3 to compute 𝑥4 . □
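The steps of this example can be traced in exact rational arithmetic with the sketch below, which carries the iteration one step further to produce 𝑥4. The variable names are assumptions. Here 𝑎 and 𝑏 denote the current bracketing interval, starting from 𝑥0 = 1 and 𝑥1 = 1.5.

```python
from fractions import Fraction


def f(x):
    return x * x - 2


a, b = Fraction(1), Fraction(3, 2)   # x0 and x1 from the example
iterates = []
for _ in range(3):
    # Zero of the secant line through (a, f(a)) and (b, f(b)).
    c = (a * f(b) - b * f(a)) / (f(b) - f(a))
    iterates.append(c)
    if f(a) * f(c) < 0:
        b = c                        # solution lies in (a, c)
    else:
        a = c                        # solution lies in (c, b)

for k, c in enumerate(iterates, start=2):
    print(f"x{k} = {c} = {float(c):.8f}")
print(f"final interval: ({float(a):.8f}, {float(b):.8f})")
```

In this run the right endpoint stays at 𝑥1 = 1.5 throughout, so every new iterate is computed from 𝑥1 and the most recent iterate, in agreement with the choice made for 𝑥4 above.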