Robust Statistics∗
Sravan Danda
November 29, 2024
1. Show that if a value $x_0$, with $-\infty < x_0 < \infty$, is added to a dataset $\{x_1, \dots, x_n\}$, then the standard
deviation of the modified dataset ranges from a value smaller than the standard deviation of the
original dataset to $\infty$.
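A quick numerical check of Problem 1, offered as a sketch rather than a proof. The helper name `sd_with_added_point` and the data values are made up for illustration, and the sample standard deviation with the $n - 1$ denominator is assumed.

```python
import statistics

def sd_with_added_point(data, x0):
    """Sample standard deviation (n - 1 denominator) after appending x0."""
    return statistics.stdev(list(data) + [x0])

data = [2.0, 4.0, 4.0, 5.0, 7.0]            # illustrative values only
base_sd = statistics.stdev(data)

# Adding the sample mean leaves the sum of squared deviations unchanged
# while the denominator grows, so the standard deviation shrinks.
at_mean = sd_with_added_point(data, statistics.mean(data))

# Adding a very distant point inflates the standard deviation without bound.
far_away = sd_with_added_point(data, 1e6)

print(base_sd, at_mean, far_away)
```

Placing $x_0$ at the sample mean is one convenient choice that gives a smaller value; pushing $x_0$ farther out makes the result as large as desired.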
2. Consider the setting of the previous problem.
(a) Show that if $n$ is even, the maximum change in the sample median as $x_0$ ranges from
$-\infty$ to $\infty$ is the distance from the median of the original dataset to the next order statistic, the
one farthest from the median.
(b) What is the maximum change when $n$ is odd?
3. Show that the median of the exponential distribution is $\frac{\log 2}{\lambda}$ and hence $\log 2$ divided by sample
median is a consistent estimator of $\lambda$.
Solution:
To show that the median of the exponential distribution with rate parameter $\lambda$ is $\frac{\ln 2}{\lambda}$, we can
follow these steps:
The cumulative distribution function (CDF) of an exponential distribution with rate parameter $\lambda$
is:
\[ F(x) = 1 - e^{-\lambda x} \]
By definition, the median $m$ of the distribution satisfies $F(m) = 0.5$.
So, we set $F(m) = 0.5$ and solve for $m$:
\[ 1 - e^{-\lambda m} = 0.5 \]
Rearranging, we get:
\[ e^{-\lambda m} = 0.5 \]
Taking the natural logarithm of both sides:
\[ -\lambda m = \ln(0.5) \]
Recognizing that $\ln(0.5) = -\ln(2)$, we have:
\[ m = \frac{\ln(2)}{\lambda} \]
Thus, the population median of the exponential distribution is $\frac{\ln(2)}{\lambda}$.
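Since the sample median $m_n$ converges in probability to the population median $m$ as the number of samples
increases to infinity, and the map $t \mapsto \ln(2)/t$ is continuous at $m > 0$, the estimator $\ln(2)/m_n$
converges in probability to $\ln(2)/m = \lambda$. Hence the result.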
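The consistency claim can be illustrated with a small simulation. This is a sketch under assumptions: the rate value 2.0, the seed, the sample sizes and the helper name `rate_from_median` are all chosen here for illustration.

```python
import math
import random
import statistics

def rate_from_median(sample):
    """Estimate the exponential rate as ln(2) divided by the sample median."""
    return math.log(2) / statistics.median(sample)

rng = random.Random(0)        # fixed seed so the run is repeatable
true_rate = 2.0               # assumed value for the illustration
for n in (100, 10_000, 200_000):
    sample = [rng.expovariate(true_rate) for _ in range(n)]
    print(n, rate_from_median(sample))
```

The printed estimates should settle near the assumed rate as $n$ grows, which is what consistency predicts.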
∗ These are selected problems from Robust Statistics: Theory and Methods, Ricardo A. Maronna, R. Douglas Martin
and Victor J. Yohai, 2006, John Wiley and Sons.
4. Let $F = (1 - \epsilon) N(\mu, 1) + \epsilon N(\mu, \tau^2)$. Show that
(a) Variance of the mean estimator is given by
\[ Var(\bar{X}) = \frac{(1 - \epsilon) + \epsilon \tau^2}{n} \tag{1} \]
(b) Variance of the median estimator is given by
\[ Var(Med(X)) \approx \frac{\pi}{2n (1 - \epsilon + \epsilon/\tau)^2} \tag{2} \]
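To get a feel for the two variance formulas, the sketch below evaluates them for a few contamination settings. It reads the mixture as $F = (1 - \epsilon) N(\mu, 1) + \epsilon N(\mu, \tau^2)$ with $\epsilon$ the contamination fraction; the function names and the particular $(\epsilon, \tau)$ pairs are assumptions made for illustration.

```python
import math

def var_mean(eps, tau, n):
    """Variance of the sample mean under (1 - eps) N(mu, 1) + eps N(mu, tau^2)."""
    return ((1 - eps) + eps * tau**2) / n

def var_median_approx(eps, tau, n):
    """Large-sample approximation to the variance of the sample median."""
    return math.pi / (2 * n * (1 - eps + eps / tau) ** 2)

n = 100
for eps, tau in [(0.0, 1.0), (0.05, 3.0), (0.10, 5.0)]:
    print(eps, tau, var_mean(eps, tau, n), var_median_approx(eps, tau, n))
```

With no contamination the ratio of the two variances is $\pi / 2$; as $\tau$ grows, the variance of the mean increases without bound while the approximation for the median stays bounded.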
5. Consider the family of Student's t distributions with $v$ degrees of freedom. The density is given by
\[ f_v(x) = \frac{\Gamma\left(\frac{v+1}{2}\right)}{\sqrt{v\pi}\,\Gamma\left(\frac{v}{2}\right)} \left(1 + \frac{x^2}{v}\right)^{-\frac{v+1}{2}} \tag{3} \]
This family contains all degrees of heavy-tailedness. When $v \to \infty$, the distribution tends to the
standard Gaussian, and for $v = 1$ we have the Cauchy distribution. Find the values of $v$ for which
the t distribution has finite moments of order $k$.
6. Show that if $\hat{\mu}$ is a solution of
\[ \sum_{i=1}^{n} \psi(x_i - \hat{\mu}) = 0 \tag{4} \]
then $\hat{\mu} + c$ is a solution of the same equation with $x_i + c$ instead of $x_i$. Here $\psi = \rho'$, where $\rho = -\log f_0$
with $f_0$ being the density of the probability distribution from which the samples are generated.
7. Show that if $X = \mu_0 + U$, where the distribution of $U$ is symmetric about 0, then $\mu_0$ is a solution of
\[ E_F[\psi(X - \mu_0)] = 0 \tag{5} \]
8. Verify
\[ E_\Phi[\psi_k(x)^2] = 2\left[k^2 (1 - \Phi(k)) + \Phi(k) - 0.5 - k\phi(k)\right] \tag{6} \]
where $\Phi$ and $\phi$ denote the cumulative distribution function and the density function of the standard
Gaussian respectively. $\psi_k$ is Huber's function defined by
\[ \psi_k(x) = \begin{cases} x & \text{if } |x| \le k \\ \mathrm{sgn}(x)\,k & \text{if } |x| > k \end{cases} \tag{7} \]
Solution:
To prove
\[ E_\Phi[\psi_k(x)^2] = 2\left[k^2 (1 - \Phi(k)) + \Phi(k) - 0.5 - k\phi(k)\right], \tag{8} \]
where $\Phi$ and $\phi$ denote the CDF and PDF of the standard Gaussian distribution, respectively, and
$\psi_k(x)$ is the Huber function defined by
\[ \psi_k(x) = \begin{cases} x, & \text{if } |x| \le k, \\ \mathrm{sgn}(x)\,k, & \text{if } |x| > k, \end{cases} \tag{9} \]
we proceed as follows:
\[ E_\Phi[\psi_k(x)^2] = \int_{-\infty}^{\infty} \psi_k(x)^2 \phi(x)\,dx. \]
Since $\psi_k(x)$ behaves differently over $|x| \le k$ and $|x| > k$, split the integral:
\[ E_\Phi[\psi_k(x)^2] = \int_{-k}^{k} x^2 \phi(x)\,dx + \int_{|x|>k} k^2 \phi(x)\,dx. \]
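First Part (for $|x| \le k$), using the symmetry of the integrand:
\[ \int_{-k}^{k} x^2 \phi(x)\,dx = 2 \int_{0}^{k} x^2 \phi(x)\,dx. \]
Second Part (for $|x| > k$):
\[ \int_{|x|>k} k^2 \phi(x)\,dx = 2k^2 \int_{k}^{\infty} \phi(x)\,dx = 2k^2 (1 - \Phi(k)). \]
Using integration by parts for $\int_{0}^{k} x^2 \phi(x)\,dx$, with $\phi'(x) = -x\phi(x)$:
\[ \int_{0}^{k} x^2 \phi(x)\,dx = \left[-x\phi(x)\right]_{0}^{k} + \int_{0}^{k} \phi(x)\,dx = -k\phi(k) + \Phi(k) - 0.5, \]
where $\phi(k) = \frac{e^{-k^2/2}}{\sqrt{2\pi}}$ and $\int_{0}^{k} \phi(x)\,dx = \Phi(k) - 0.5$. Doubling gives:
\[ \int_{-k}^{k} x^2 \phi(x)\,dx = 2(\Phi(k) - k\phi(k) - 0.5). \]
Substitute back to get:
\[ E_\Phi[\psi_k(x)^2] = 2\left[k^2 (1 - \Phi(k)) + \Phi(k) - 0.5 - k\phi(k)\right]. \]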
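The identity can also be checked numerically. The sketch below compares the closed form with a midpoint-rule approximation of the integral; the function names, the truncation of the integral to $[-10, 10]$, the step count and the value $k = 1.345$ are all choices made here for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian: mean 0, standard deviation 1

def huber_psi(x, k):
    """Huber's psi: identity on [-k, k], clipped to +/-k outside."""
    return max(-k, min(k, x))

def second_moment_closed_form(k):
    """Closed-form expression 2[k^2 (1 - Phi(k)) + Phi(k) - 0.5 - k phi(k)]."""
    Phi, phi = STD_NORMAL.cdf(k), STD_NORMAL.pdf(k)
    return 2 * (k**2 * (1 - Phi) + Phi - 0.5 - k * phi)

def second_moment_numeric(k, lim=10.0, steps=40_000):
    """Midpoint-rule approximation of the integral of psi_k(x)^2 phi(x) over [-lim, lim]."""
    h = 2 * lim / steps
    total = 0.0
    for i in range(steps):
        x = -lim + (i + 0.5) * h
        total += huber_psi(x, k) ** 2 * STD_NORMAL.pdf(x)
    return total * h

k = 1.345  # tuning constant picked only for illustration
print(second_moment_closed_form(k), second_moment_numeric(k))
```

The two printed numbers should agree to several decimal places, up to the quadrature error of the midpoint rule.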
One can analyze the Huber estimator using this result.
The influence function describes the effect of a small contamination at a point $x$ on the estimator.
For a location M-estimator it is $\psi(x)$ divided by $E[\psi'(x)]$. For the Huber estimator at the standard
Gaussian, $\psi_k'(x) = 1$ for $|x| < k$ and $0$ for $|x| > k$, so $E_\Phi[\psi_k'(x)] = 2\Phi(k) - 1$ and
\[ IF(x) = \frac{\psi_k(x)}{E_\Phi[\psi_k'(x)]} = \frac{\psi_k(x)}{2\Phi(k) - 1}. \]
Evaluating $IF(x)$ in Different Regions:
- For $|x| \le k$:
\[ IF(x) = \frac{x}{2\Phi(k) - 1}. \]
- For $|x| > k$:
\[ IF(x) = \frac{k\,\mathrm{sgn}(x)}{2\Phi(k) - 1}. \]
This bounded influence function indicates that the Huber estimator is robust to outliers.
The second moment computed above enters through the asymptotic variance, which is the expected squared
influence function:
\[ E_\Phi[IF(x)^2] = \frac{E_\Phi[\psi_k(x)^2]}{(2\Phi(k) - 1)^2} = \frac{2\left[k^2 (1 - \Phi(k)) + \Phi(k) - 0.5 - k\phi(k)\right]}{(2\Phi(k) - 1)^2}. \]
9. Show that if ψ is odd then the M-estimate µ̂ with fixed σ satisfies the following conditions:
• If xi ≥ 0 for all i then µ̂ ≥ 0.
• If xi = c for all i then µ̂ = c.
• µ̂(−x) = −µ̂(x)
10. Show that L-estimates are shift and scale equivariant and also satisfy
• If xi ≥ 0 for all i then µ̂ ≥ 0.
• If xi = c for all i then µ̂ = c.
• µ̂(−x) = −µ̂(x)
11. Let $[a, b]$, where $a$ and $b$ depend on the data, be the shortest interval containing at least half of the data.
(a) The Shorth (shortest half) location estimate is defined as the midpoint
\[ \hat{\mu} = \frac{a + b}{2} \tag{10} \]
Show that
\[ \hat{\mu} = \arg\min_{\mu} \, Med_{1 \le i \le n} |x_i - \mu| \tag{11} \]
(b) Show that the difference $b - a$ is a dispersion estimate.
(c) For a distribution $F$, let $[a, b]$ be the shortest interval with probability 0.5. Find this interval
for $N(\mu, \sigma^2)$.
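A minimal sketch of how the shortest half and the Shorth might be computed for a finite sample. It assumes that "at least half" means $\lceil n/2 \rceil$ points and that ties between equally short windows are broken by taking the leftmost one; the function names and data are made up for illustration.

```python
import math

def shortest_half(data):
    """Return (a, b), the shortest interval holding at least half of the points."""
    xs = sorted(data)
    n = len(xs)
    h = math.ceil(n / 2)                       # number of points the interval must cover
    # Slide a window of h consecutive order statistics and keep the narrowest one.
    start = min(range(n - h + 1), key=lambda i: xs[i + h - 1] - xs[i])
    return xs[start], xs[start + h - 1]

def shorth(data):
    """Shorth location estimate: midpoint of the shortest half."""
    a, b = shortest_half(data)
    return (a + b) / 2

data = [0.0, 1.0, 1.5, 2.0, 10.0]              # illustrative values with one outlier
a, b = shortest_half(data)
print((a, b), shorth(data), b - a)             # interval, location, dispersion
```

Only windows that start and end at order statistics need to be examined, because any interval covering $h$ points can be shrunk to the smallest and largest of those points without losing coverage.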
12. Let $\hat{\mu}$ be a location M-estimator. Show that if the distribution of the $x_i$'s is symmetric about $\mu$ then so
is the distribution of $\hat{\mu}$, and that the same happens with trimmed means.
13. Recall that the Newton-Raphson procedure is a widely used iterative method for numerically solving
non-linear equations. To solve $h(t) = 0$, at each iteration $h$ is linearized, i.e. replaced by its
Taylor expansion of order 1 about the current approximation. Thus, if at iteration $m$ we have the
approximation $t_m$, then the next value $t_{m+1}$ is the solution of
\[ h(t_m) + h'(t_m)(t_{m+1} - t_m) = 0 \tag{12} \]
Geometrically, at every current estimate we draw a tangent to the curve $(t, h(t))$, and the updated
estimate is the t-coordinate where this tangent cuts the t-axis. In the context of the location
M-estimator, $h(\mu) = \sum_{i=1}^{n} \psi(x_i - \mu)$ and $h'(\mu) = -\sum_{i=1}^{n} \psi'(x_i - \mu)$, so the update is given by
\[ \mu_{m+1} = \mu_m + \frac{\sum_{i=1}^{n} \psi(x_i - \mu_m)}{\sum_{i=1}^{n} \psi'(x_i - \mu_m)} \tag{13} \]
(a) Argue that if the sequence $\{\mu_m\}$ converges then the limit is the solution to
\[ \sum_{i=1}^{n} \psi(x_i - \mu) = 0 \tag{14} \]
(b) Can you find an example of $\psi$ where the sequence does not converge?
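The Newton-Raphson iteration for a location M-estimate can be sketched as follows. Because $h(\mu) = \sum_i \psi(x_i - \mu)$ has derivative $-\sum_i \psi'(x_i - \mu)$, the Newton step adds $\sum_i \psi / \sum_i \psi'$ to the current iterate. Huber's $\psi$ with $k = 1.5$, the median as starting value, the stopping rule and the data are all assumptions made for this illustration.

```python
import statistics

def huber_psi(r, k):
    """Huber's psi: identity on [-k, k], clipped to +/-k outside."""
    return max(-k, min(k, r))

def huber_psi_prime(r, k):
    """Derivative of Huber's psi, taken as 1 on [-k, k] and 0 outside."""
    return 1.0 if abs(r) <= k else 0.0

def newton_location(xs, k=1.5, tol=1e-10, max_iter=100):
    """Iterate mu <- mu + sum(psi) / sum(psi'), starting from the sample median."""
    mu = statistics.median(xs)
    for _ in range(max_iter):
        num = sum(huber_psi(x - mu, k) for x in xs)
        den = sum(huber_psi_prime(x - mu, k) for x in xs)
        if den == 0:                   # flat tangent: the Newton step is undefined
            raise ZeroDivisionError("sum of psi' is zero at the current iterate")
        step = num / den
        mu += step
        if abs(step) < tol:
            return mu
    raise RuntimeError("no convergence within max_iter iterations")

xs = [-1.2, -0.4, 0.1, 0.3, 0.9, 8.0]   # made-up data with one outlier
mu_hat = newton_location(xs)
print(mu_hat, sum(huber_psi(x - mu_hat, 1.5) for x in xs))
```

The explicit guards for a zero denominator and for hitting `max_iter` are there because the iteration is not guaranteed to be well defined or to converge for every $\psi$ and every starting point.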
14. Verify that the breakdown points of the standard deviation and the median absolute deviation about the
median are 0 and $\frac{1}{2}$ respectively.
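The contrast between the two dispersion estimates shows up already when a single observation is corrupted. The sketch below is only an illustration on made-up data; the helper `mad` computes the median absolute deviation about the median without any consistency factor.

```python
import statistics

def mad(xs):
    """Median absolute deviation about the median (no consistency factor applied)."""
    med = statistics.median(xs)
    return statistics.median([abs(x - med) for x in xs])

clean = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]      # made-up sample
print("clean", statistics.stdev(clean), mad(clean))
for bad in (1e3, 1e6, 1e9):
    spoiled = clean[:-1] + [bad]                  # replace a single observation
    print(bad, statistics.stdev(spoiled), mad(spoiled))
```

One bad point is enough to drive the standard deviation arbitrarily high, while the MAD of this sample does not move.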
15. Show that the asymptotic breakdown point of α-trimmed mean is α.
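A finite-sample illustration of the trimmed mean's behaviour, as a sketch only. It assumes the $\alpha$-trimmed mean discards $\lfloor n\alpha \rfloor$ observations from each end; the function name and the data are made up.

```python
import math

def trimmed_mean(xs, alpha):
    """Mean after discarding floor(n * alpha) points from each end (0 <= alpha < 0.5)."""
    ys = sorted(xs)
    m = math.floor(len(ys) * alpha)
    kept = ys[m:len(ys) - m]
    return sum(kept) / len(kept)

base = [float(i) for i in range(1, 21)]           # 1, 2, ..., 20 (illustrative)
for n_bad in (1, 2, 3):
    spoiled = base[:-n_bad] + [1e9] * n_bad       # replace the largest n_bad values
    print(n_bad, trimmed_mean(spoiled, 0.10))
```

With $n = 20$ and $\alpha = 0.10$, two corrupted points are trimmed away, but a third one survives the trimming and carries the estimate off.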
16. Show that the breakdown point of equivariant dispersion estimates is $\le 0.5$.
17. Let the density f (x) be a decreasing function of |x|. Show that the shortest interval covering a
given probability is symmetric about zero. Use this result to calculate the influence function of the
Shorth estimate for data with distribution f .
18. For the exponential family given by
\[ f_\theta(x) = \frac{1}{\theta} e^{-x/\theta} I_{\{x \ge 0\}} \tag{15} \]
show that the estimate with smallest gross error sensitivity is $\frac{Med\{x_i\}}{\log 2}$. Find its efficiency w.r.t.
MLE.