Recursive Least-Squares (RLS) Adaptive Filters
ELE 774 - Adaptive Signal Processing
Definition
With the arrival of new data samples, estimates are updated recursively.
Introduce a weighting factor into the sum-of-error-squares definition:
    ℰ(n) = Σ_{i=1}^{n} β(n, i) |e(i)|²,    e(i) = d(i) − w^H(n) u(i)
Two time indices: n (outer), i (inner).
w(n) is kept fixed during the observation interval 1 ≤ i ≤ n for which the cost function ℰ(n) is defined.
Weighting factor: β(n, i) = λ^(n−i), i = 1, 2, ..., n.
Forgetting factor λ: real, positive, < 1, and typically close to 1 (λ → 1).
λ = 1 → ordinary LS.
1/(1−λ): memory of the algorithm (ordinary LS has infinite memory).
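As a numerical illustration (not from the original slides): with λ < 1 the weights λ^(n−i) decay geometrically into the past, and their sum approaches the effective memory 1/(1−λ) quoted above. A minimal Python sketch, assuming nothing beyond numpy:

    import numpy as np

    # Illustrative sketch (assumed example): exponential weighting and the
    # effective memory 1/(1 - lambda) of the weighted sum of error squares.
    lam = 0.98                        # forgetting factor: real, positive, < 1
    n = 500                           # current (outer) time index

    i = np.arange(1, n + 1)           # inner time index i = 1..n
    beta = lam ** (n - i)             # beta(n, i) = lambda^(n - i)
    print(beta.sum(), 1.0 / (1.0 - lam))   # both ~ 50 for large n

    # Exponentially weighted sum of squared errors for a fixed w(n):
    e = np.random.default_rng(0).standard_normal(n)   # placeholder errors e(i)
    E_n = np.sum(beta * np.abs(e) ** 2)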
Regularisation
The LS cost function can be ill-posed:
    There is insufficient information in the input data to reconstruct the input-output mapping uniquely.
    There is uncertainty in the mapping due to measurement noise.
To overcome the problem, take 'prior information' into account through a regularisation term, which smooths and stabilises the solution:
    ℰ(n) = Σ_{i=1}^{n} λ^(n−i) |e(i)|² + δ λ^n ||w(n)||²,    δ: regularisation parameter
Prewindowing is assumed! (This is not the covariance method.)
Normal Equations
From the method of least squares we know that the optimum tap weights satisfy the normal equations Φ(n) ŵ(n) = z(n).
With exponential weighting and regularisation, the time-average autocorrelation matrix of the input u(n) becomes
    Φ(n) = Σ_{i=1}^{n} λ^(n−i) u(i) u^H(i) + δ λ^n I
The autocorrelation matrix is always non-singular due to the regularisation term δ λ^n I.
Similarly, the time-average cross-correlation vector between the tap inputs and the desired response (unaffected by regularisation) is
    z(n) = Σ_{i=1}^{n} λ^(n−i) u(i) d*(i)
Hence, the optimum (in the LS sense) filter coefficients satisfy
    ŵ(n) = Φ^(−1)(n) z(n)    (Φ^(−1)(n) always exists!)
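A minimal numpy sketch, assuming real-valued signals, of solving these regularised normal equations directly at one time instant n (the helper name is illustrative, not from the slides); the recursions developed next avoid repeating this O(M³) solve at every step:

    import numpy as np

    # Assumed illustrative helper: direct solution of the regularised,
    # exponentially weighted normal equations at a single time n.
    def ls_weights_direct(U, d, lam=0.99, delta=1e-2):
        """U: (n, M) array, row i is the tap-input vector u(i) (real case).
        d: (n,) desired responses.  Returns w_hat(n) = Phi(n)^{-1} z(n)."""
        n, M = U.shape
        beta = lam ** (n - np.arange(1, n + 1))              # lambda^(n - i), i = 1..n
        Phi = (U.T * beta) @ U + delta * lam**n * np.eye(M)  # autocorrelation matrix
        z = (U.T * beta) @ d                                 # cross-correlation vector
        return np.linalg.solve(Phi, z)                       # non-singular thanks to delta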
Recursive Computation
Isolate the last (i = n) term of the sum defining Φ(n):
    Φ(n) = λ Φ(n−1) + u(n) u^H(n)
Similarly,
    z(n) = λ z(n−1) + u(n) d*(n)
We need to calculate Φ^(−1)(n) to find ŵ(n) → direct inversion at every step can be costly!
Use the Matrix Inversion Lemma (MIL).
Recursive Least-Squares Algorithm
Let, in the Matrix Inversion Lemma (if A = B^(−1) + C D^(−1) C^H, then A^(−1) = B − B C (D + C^H B C)^(−1) C^H B),
    A = Φ(n),    B^(−1) = λ Φ(n−1),    C = u(n),    D = 1
Then, using the MIL,
    Φ^(−1)(n) = λ^(−1) Φ^(−1)(n−1) − [λ^(−2) Φ^(−1)(n−1) u(n) u^H(n) Φ^(−1)(n−1)] / [1 + λ^(−1) u^H(n) Φ^(−1)(n−1) u(n)]
Now, letting
    P(n) = Φ^(−1)(n)    (inverse correlation matrix)
    k(n) = [λ^(−1) P(n−1) u(n)] / [1 + λ^(−1) u^H(n) P(n−1) u(n)]    (gain vector)
we obtain the Riccati equation
    P(n) = λ^(−1) P(n−1) − λ^(−1) k(n) u^H(n) P(n−1)
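A quick numerical sanity check (assumed example, real-valued data): one Riccati-equation update of P(n) should agree with the brute-force inverse of Φ(n) = λ Φ(n−1) + u(n) u^H(n).

    import numpy as np

    # Assumed illustration: Riccati update of P(n) vs. a direct matrix inverse.
    rng = np.random.default_rng(0)
    M, lam = 4, 0.98

    A = rng.standard_normal((M, M))
    Phi_prev = A @ A.T + 0.1 * np.eye(M)          # some positive-definite Phi(n-1)
    P_prev = np.linalg.inv(Phi_prev)              # P(n-1)
    u = rng.standard_normal((M, 1))               # tap-input vector u(n)

    k = (P_prev @ u) / (lam + u.T @ P_prev @ u)   # gain vector k(n)
    P = (P_prev - k @ (u.T @ P_prev)) / lam       # Riccati equation for P(n)

    P_direct = np.linalg.inv(lam * Phi_prev + u @ u.T)
    print(np.allclose(P, P_direct))               # True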
Recursive Least-Squares Algorithm
Rearranging the definition of the gain vector, k(n) [1 + λ^(−1) u^H(n) P(n−1) u(n)] = λ^(−1) P(n−1) u(n), and using the Riccati equation,
    k(n) = P(n) u(n) = Φ^(−1)(n) u(n)
How can ŵ be calculated recursively? Let
    ŵ(n) = Φ^(−1)(n) z(n) = P(n) z(n) = λ P(n) z(n−1) + P(n) u(n) d*(n)
After substituting the recursion for P(n) into the first term we obtain
    ŵ(n) = ŵ(n−1) − k(n) u^H(n) ŵ(n−1) + P(n) u(n) d*(n)
But P(n) u(n) = k(n), hence
    ŵ(n) = ŵ(n−1) + k(n) [d*(n) − u^H(n) ŵ(n−1)]
Recursive Least-Squares Algorithm
The term
    ξ(n) = d(n) − ŵ^H(n−1) u(n)
is called the a priori estimation error, whereas the term
    e(n) = d(n) − ŵ^H(n) u(n)
is called the a posteriori estimation error. (Why? ξ(n) is formed with the old weights, before the current update; e(n) is formed with the weights already updated using the current sample.)
Summary — one iteration of the RLS algorithm:
    k(n) = [λ^(−1) P(n−1) u(n)] / [1 + λ^(−1) u^H(n) P(n−1) u(n)]    (gain vector)
    ξ(n) = d(n) − ŵ^H(n−1) u(n)                                      (a priori error)
    ŵ(n) = ŵ(n−1) + k(n) ξ*(n)                                       (update equation)
    P(n) = λ^(−1) P(n−1) − λ^(−1) k(n) u^H(n) P(n−1)
Φ^(−1)(n) is calculated recursively, with only a scalar division (no matrix inversion).
Initialisation (n = 0), if no a priori information exists:
    P(0) = δ^(−1) I,    ŵ(0) = 0    (δ: regularisation parameter)
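A minimal Python/numpy sketch of the algorithm summarised above, assuming real-valued signals and a length-M transversal filter (the function name and interface are illustrative, not from the slides):

    import numpy as np

    def rls(u, d, M, lam=0.99, delta=1e-2):
        """Exponentially weighted RLS (real-valued sketch).

        u : (N,) input samples,  d : (N,) desired response samples.
        Returns the final weight vector and the a priori error sequence xi(n)."""
        N = len(u)
        w = np.zeros(M)                     # w_hat(0) = 0
        P = np.eye(M) / delta               # P(0) = delta^{-1} I
        xi = np.zeros(N)
        for n in range(M - 1, N):
            u_n = u[n - M + 1:n + 1][::-1]          # tap-input vector u(n)
            pi = P @ u_n
            k = pi / (lam + u_n @ pi)               # gain vector k(n)
            xi[n] = d[n] - w @ u_n                  # a priori error xi(n)
            w = w + k * xi[n]                       # weight update
            P = (P - np.outer(k, u_n @ P)) / lam    # Riccati equation

        return w, xi

The per-iteration cost is O(M²), dominated by the Riccati update, instead of the O(M³) of re-solving the normal equations from scratch at every step.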
Recursion for the Sum-of-Weighted-Error-Squares
From LS, we know that
    ℰ_min(n) = E_d(n) − z^H(n) ŵ(n)
where
    E_d(n) = Σ_{i=1}^{n} λ^(n−i) |d(i)|² = λ E_d(n−1) + |d(n)|²
is the exponentially weighted energy of the desired response.
Then, substituting the recursions for E_d(n), z(n) and ŵ(n) and simplifying,
    ℰ_min(n) = λ ℰ_min(n−1) + ξ(n) e*(n)
Hence the minimum sum of weighted error squares is updated with the product of the a priori error ξ(n) and the (conjugated) a posteriori error e(n); no extra sums are needed.
Convergence Analysis
Assume a stationary environment and λ = 1.
To avoid the initial transient, consider times n > M (M: filter length).
Assumption I: The desired response d(n) and the tap-input vector u(n) are related by the linear regression model
    d(n) = w_o^H u(n) + e_o(n)
where w_o is the regression parameter vector and e_o(n) is the measurement noise. The noise e_o(n) is white with zero mean and variance σ_o², which makes it independent of the regressor u(n).
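A small generator for this model (assumed example, real-valued; the helper name is illustrative), handy for the simulations referred to later in these notes:

    import numpy as np

    def make_regression_data(N, w_o, noise_var=1e-3, rng=None):
        """Draw (u, d) from the linear regression model d(n) = w_o^T u(n) + e_o(n).

        White Gaussian input; e_o(n) is zero-mean white noise with variance
        noise_var, independent of the input (Assumption I, real-valued case)."""
        rng = np.random.default_rng(rng)
        M = len(w_o)
        u = rng.standard_normal(N)                       # input signal
        e_o = np.sqrt(noise_var) * rng.standard_normal(N)
        d = np.zeros(N)
        for n in range(M - 1, N):
            u_n = u[n - M + 1:n + 1][::-1]               # tap-input vector u(n)
            d[n] = w_o @ u_n + e_o[n]                    # regression model
        return u, d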
Convergence Analysis
Assumption II: The input vector u(n) is drawn from a stochastic process which is ergodic in the autocorrelation function, so time averages may replace ensemble averages: for λ = 1 and large n, Φ(n) ≈ n R (R: ensemble-average, Φ(n): time-average autocorrelation matrix). A short numerical check follows below.
Assumption III: The fluctuations in the weight-error vector ε(n) = ŵ(n) − w_o are slow compared with those of the input signal vector u(n).
    Justification: ε(n) is an accumulation of a priori errors driven by the input → the recursion has a smoothing (low-pass filtering) effect.
    Consequence: ε(n) and u(n) may be treated as (approximately) statistically independent when taking expectations.
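A short check of Φ(n) ≈ n R (assumed illustration with a white Gaussian input, for which R = I):

    import numpy as np

    # Assumed illustration of Assumption II: for unit-variance white input, R = I,
    # and the time-average autocorrelation matrix Phi(n) grows like n * R.
    rng = np.random.default_rng(1)
    M, N = 4, 20000
    u = rng.standard_normal(N)

    Phi = np.zeros((M, M))
    for n in range(M - 1, N):
        u_n = u[n - M + 1:n + 1][::-1]
        Phi += np.outer(u_n, u_n)       # lambda = 1: plain running sum

    print(np.round(Phi / N, 2))         # approaches R = I as N grows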
Convergence in Mean Value
With λ = 1,
    Φ(n) = Σ_{i=1}^{n} u(i) u^H(i) + δ I
Then, using the regression model of Assumption I,
    z(n) = Σ_{i=1}^{n} u(i) d*(i) = [Φ(n) − δ I] w_o + Σ_{i=1}^{n} u(i) e_o*(i)
Substituting into ŵ(n) = Φ^(−1)(n) z(n) and taking the expectation (the noise term vanishes in the mean, since e_o(n) is independent of u(n)), we get
    E[ŵ(n)] = w_o − δ E[Φ^(−1)(n)] w_o
Applying Assumptions I and II, the above expression simplifies to
    E[ŵ(n)] ≈ w_o − (δ / n) R^(−1) w_o    for large n
The estimate is biased due to the initialisation (δ), but the bias → 0 as n → ∞.
Mean-Square Deviation
Define the weight-error vector ε(n) = ŵ(n) − w_o and the weight-error correlation matrix
    K(n) = E[ε(n) ε^H(n)]
Expressing ε(n) through the regression model, invoking Assumption I and simplifying, we obtain
    K(n) ≈ σ_o² E[Φ^(−1)(n)]
Then, with Φ(n) ≈ n R (Assumption II),
    K(n) ≈ (σ_o² / n) R^(−1)
But the mean-square deviation is
    D(n) = E[ ||ε(n)||² ] = tr[K(n)] ≈ (σ_o² / n) Σ_{i=1}^{M} 1/λ_i,    λ_i: eigenvalues of R
Mean-Square Deviation
Observations:
The mean-square deviation D(n)
    is proportional to the sum of the reciprocals of the eigenvalues of R;
    so the sensitivity of the RLS algorithm to eigenvalue spread is determined by the reciprocal of the smallest eigenvalue, and ill-conditioned LS problems may lead to poor convergence behaviour;
    decays almost linearly with the number of iterations (roughly as 1/n).
Hence ŵ(n) converges to the Wiener solution w_o as n grows.
Ensemble-Average Learning Curve
There are two error terms:
    A priori error,    ξ(n) = d(n) − ŵ^H(n−1) u(n)
    A posteriori error,    e(n) = d(n) − ŵ^H(n) u(n)
The learning curve based on ξ(n) has the same general shape as that of the LMS algorithm, whose error is likewise computed before the weight update; both RLS and LMS learning curves can be compared with this choice.
The learning curve for RLS (a priori error) is therefore
    J'(n) = E[ |ξ(n)|² ]
We know that, under Assumption I,
    ξ(n) = d(n) − ŵ^H(n−1) u(n) = e_o(n) − ε^H(n−1) u(n)
Ensemble-Average Learning Curve
Substitution yields
    J'(n) = E[|e_o(n)|²] + E[ε^H(n−1) u(n) u^H(n) ε(n−1)] − E[e_o(n) u^H(n) ε(n−1)] − E[ε^H(n−1) u(n) e_o*(n)]
1st term (Assumption I):    E[|e_o(n)|²] = σ_o²
2nd term (Assumption III):    E[ε^H(n−1) u(n) u^H(n) ε(n−1)] ≈ tr[R K(n−1)] ≈ M σ_o² / n
3rd & 4th terms (Assumption I):    the cross terms vanish, since e_o(n) is independent of u(n) and of ε(n−1).
Ensemble-Average Learning Curve
Combining all terms,
    J'(n) ≈ σ_o² + M σ_o² / n,    n > M
Observations:
The ensemble-average learning curve of the RLS algorithm converges in about 2M iterations,
    typically an order of magnitude faster than LMS.
As the number of iterations n → ∞, the MSE J'(n) approaches the final value σ_o², which is the variance of the measurement error e_o(n);
    in theory, RLS produces zero excess MSE!
Convergence of the RLS algorithm in the mean square is independent of the eigenvalues of the ensemble-average correlation matrix R of the input vector u(n).
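A sketch of an ensemble-average learning-curve experiment (assumed setup, reusing the illustrative rls() and make_regression_data() helpers defined earlier in these notes): average |ξ(n)|² over independent trials and compare the error floor with σ_o².

    import numpy as np

    # Assumed experiment: ensemble-average learning curve J'(n) = E[|xi(n)|^2],
    # built on the illustrative rls() and make_regression_data() sketches above.
    M, N, trials = 10, 500, 200
    noise_var = 1e-3
    w_o = np.random.default_rng(2).standard_normal(M)

    J = np.zeros(N)
    for t in range(trials):
        u, d = make_regression_data(N, w_o, noise_var=noise_var, rng=t)
        _, xi = rls(u, d, M, lam=1.0, delta=1e-2)
        J += xi ** 2
    J /= trials

    # J[n] should fall to roughly noise_var (zero excess MSE) within ~2M iterations.
    print(J[2 * M], J[-1], noise_var)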