Robust Statistics∗
                                               Sravan Danda
                                            November 29, 2024
   1. Show that if a value $x_0$ is added to a dataset $\{x_1, \dots, x_n\}$, where $-\infty < x_0 < \infty$, then the standard
      deviation of the modified dataset ranges between a value smaller than the standard deviation of the
      original dataset and $\infty$.
   2. Consider the situation of the previous problem.
       (a) Show that if n is even, the maximum change in the sample median when $x_0$ ranges from
           $-\infty$ to $\infty$ is the distance from the median of the original dataset to the next order statistic, the
           farthest from the median.
       (b) What is the maximum change when n is odd?
   3. Show that the median of the exponential distribution is $\frac{\log 2}{\lambda}$ and hence that $\log 2$ divided by the sample
      median is a consistent estimator of $\lambda$.
      Solution:
      To show that the median of the exponential distribution with rate parameter $\lambda$ is $\frac{\ln 2}{\lambda}$, we can
      follow these steps:
      The cumulative distribution function (CDF) of an exponential distribution with rate parameter $\lambda$
      is:
                                             $$F(x) = 1 - e^{-\lambda x}, \quad x \ge 0.$$
      By definition, the median m of the distribution satisfies $F(m) = 0.5$.
      So, we set $F(m) = 0.5$ and solve for m:
                                             $$1 - e^{-\lambda m} = 0.5$$
      Rearranging, we get:
                                             $$e^{-\lambda m} = 0.5$$
      Taking the natural logarithm of both sides:
                                             $$-\lambda m = \ln(0.5)$$
      Recognizing that $\ln(0.5) = -\ln(2)$, we have:
                                             $$m = \frac{\ln(2)}{\lambda}$$
      Thus, the population median of the exponential distribution is $\frac{\ln(2)}{\lambda}$.
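      The sample median $m_n$ converges in probability to the population median $m = \frac{\ln(2)}{\lambda} > 0$ as the
      number of samples increases to infinity. Since $t \mapsto \frac{\ln(2)}{t}$ is continuous at $m$, the continuous
      mapping theorem gives $\frac{\ln(2)}{m_n} \to \frac{\ln(2)}{m} = \lambda$ in probability, so $\frac{\ln(2)}{m_n}$ is a consistent
      estimator of $\lambda$.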
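      The consistency claim can be illustrated by simulation. The following is a minimal sketch using only the Python standard library; the function names `rate_from_median` and `simulate`, the true rate 2.0, and the sample sizes are illustrative assumptions, not part of the problem.

```python
import math
import random
import statistics


def rate_from_median(sample):
    """Estimate the exponential rate as ln(2) divided by the sample median."""
    return math.log(2) / statistics.median(sample)


def simulate(lam, n, seed=0):
    """Draw n exponential(lam) variates and return the median-based estimate."""
    rng = random.Random(seed)
    sample = [rng.expovariate(lam) for _ in range(n)]
    return rate_from_median(sample)


if __name__ == "__main__":
    lam = 2.0  # hypothetical true rate, chosen only for illustration
    for n in (100, 10_000, 200_000):
        # The estimate should settle near lam as n grows.
        print(n, simulate(lam, n))
```

      With increasing n the printed estimates should concentrate around the chosen rate, which is what consistency predicts.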
  ∗ These are selected problems from Robust Statistics: Theory and Methods, Ricardo A. Maronna, R. Douglas Martin
and Victor J. Yohai, 2006, John Wiley and Sons.
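4. Let $F = (1 - \epsilon) N(\mu, 1) + \epsilon N(\mu, \tau^2)$, where $\epsilon$ is the contamination fraction. Show that
    (a) the variance of the mean estimator is given by
                       $$\mathrm{Var}(\bar{X}) = \frac{(1 - \epsilon) + \epsilon \tau^2}{n}$$                                           (1)
    (b) the variance of the median estimator is given by
                       $$\mathrm{Var}(\mathrm{Med}(X)) \approx \frac{\pi}{2n(1 - \epsilon + \epsilon/\tau)^2}$$                                               (2)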
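   To get a feel for the two formulas, they can be evaluated side by side. The sketch below assumes the reading $n\,\mathrm{Var}(\bar{X}) = (1 - \epsilon) + \epsilon\tau^2$ and $n\,\mathrm{Var}(\mathrm{Med}(X)) \approx \pi / (2(1 - \epsilon + \epsilon/\tau)^2)$, with $\epsilon$ the contamination fraction; the function names and the example values of `eps` and `tau` are hypothetical.

```python
import math


def n_var_mean(eps, tau):
    """n times the variance of the sample mean under the contaminated normal."""
    return (1.0 - eps) + eps * tau ** 2


def n_var_median(eps, tau):
    """n times the approximate (asymptotic) variance of the sample median."""
    return math.pi / (2.0 * (1.0 - eps + eps / tau) ** 2)


if __name__ == "__main__":
    # Illustrative contamination levels and scale; not taken from the problem.
    for eps in (0.0, 0.05, 0.10):
        tau = 10.0
        print(eps, n_var_mean(eps, tau), n_var_median(eps, tau))
```

   Without contamination the mean has the smaller variance (1 against π/2), while a modest fraction of high-variance observations inflates the variance of the mean far more than that of the median.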
5. Consider the family of Student's t distributions with v degrees of freedom. The density is given by
                       $$f_v(x) = \frac{\Gamma\left(\frac{v+1}{2}\right)}{\sqrt{v\pi}\,\Gamma\left(\frac{v}{2}\right)} \left(1 + \frac{x^2}{v}\right)^{-\frac{v+1}{2}}$$                             (3)
   This family contains all degrees of heavy-tailedness. When $v \to \infty$, the distribution tends to the
   standard Gaussian, and for $v = 1$ we have the Cauchy distribution. Find the values of v for which
   the t distribution has finite moments of order k.
6. Show that if $\hat{\mu}$ is a solution of
                       $$\sum_{i=1}^{n} \psi(x_i - \hat{\mu}) = 0$$                                (4)
   then $\hat{\mu} + c$ is a solution of the same equation with $x_i + c$ instead of $x_i$. Here $\psi = \rho'$ where $\rho = -\log f_0$,
   with $f_0$ being the density of the probability distribution from which the samples are generated.
7. Show that if $X = \mu_0 + U$, where the distribution of U is symmetric about 0, then $\mu_0$ is a solution of
                       $$E_F[\psi(X - \mu_0)] = 0$$                                  (5)
8. Verify
                       $$E_\Phi[\psi_k(x)^2] = 2\left[k^2(1 - \Phi(k)) + \Phi(k) - 0.5 - k\phi(k)\right]$$                (6)
   where $\Phi$ and $\phi$ denote the cumulative distribution function and the density function of the standard
   Gaussian respectively, and $\psi_k$ is Huber's function defined by
                       $$\psi_k(x) = \begin{cases} x & \text{if } |x| \le k \\ \mathrm{sgn}(x)\,k & \text{if } |x| > k \end{cases}$$                                                 (7)
   Solution:
   To prove
                       $$E_\Phi[\psi_k(x)^2] = 2\left[k^2(1 - \Phi(k)) + \Phi(k) - 0.5 - k\phi(k)\right],$$                    (8)
   where $\Phi$ and $\phi$ denote the CDF and PDF of the standard Gaussian distribution, respectively, and
   $\psi_k(x)$ is the Huber function defined by
                       $$\psi_k(x) = \begin{cases} x, & \text{if } |x| \le k, \\ \mathrm{sgn}(x)\,k, & \text{if } |x| > k, \end{cases}$$                                                 (9)
   we proceed as follows:
                       $$E_\Phi[\psi_k(x)^2] = \int_{-\infty}^{\infty} \psi_k(x)^2 \phi(x)\,dx.$$
   Since $\psi_k(x)$ behaves differently over $|x| \le k$ and $|x| > k$, split the integral:
                       $$E_\Phi[\psi_k(x)^2] = \int_{-k}^{k} x^2 \phi(x)\,dx + \int_{|x|>k} k^2 \phi(x)\,dx.$$
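  First part (for $|x| \le k$):
                       $$\int_{-k}^{k} x^2 \phi(x)\,dx = 2\int_{0}^{k} x^2 \phi(x)\,dx.$$
  Second part (for $|x| > k$):
                       $$\int_{|x|>k} k^2 \phi(x)\,dx = 2k^2 \int_{k}^{\infty} \phi(x)\,dx = 2k^2(1 - \Phi(k)).$$
  Using integration by parts for $\int_{0}^{k} x^2 \phi(x)\,dx$, with $\phi'(x) = -x\phi(x)$:
                       $$\int_{0}^{k} x^2 \phi(x)\,dx = \left[-x\phi(x)\right]_{0}^{k} + \int_{0}^{k} \phi(x)\,dx.$$
  Here $\left[-x\phi(x)\right]_{0}^{k} = -k\phi(k)$ with $\phi(k) = \frac{e^{-k^2/2}}{\sqrt{2\pi}}$, and $\int_{0}^{k} \phi(x)\,dx = \Phi(k) - 0.5$, so:
                       $$\int_{-k}^{k} x^2 \phi(x)\,dx = 2(\Phi(k) - k\phi(k) - 0.5).$$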
  Substitute back to get:
                       $$E_\Phi[\psi_k(x)^2] = 2\left[k^2(1 - \Phi(k)) + \Phi(k) - 0.5 - k\phi(k)\right].$$
  One can analyze the robustness of the Huber estimator using this result.
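  The influence function describes the effect of a small contamination at a point x on the estimator.
  For the Huber location estimator (with fixed unit scale) at the standard Gaussian, it is given by:
                       $$IF(x) = \frac{\psi_k(x)}{E_\Phi[\psi_k'(x)]}.$$
  Since $\psi_k'(x) = 1$ for $|x| < k$ and $\psi_k'(x) = 0$ for $|x| > k$, we have:
                       $$E_\Phi[\psi_k'(x)] = \Phi(k) - \Phi(-k) = 2\Phi(k) - 1.$$
  Thus,
                       $$IF(x) = \frac{\psi_k(x)}{2\Phi(k) - 1}.$$
  Evaluating $IF(x)$ in different regions:
  - For $|x| \le k$:
                       $$IF(x) = \frac{x}{2\Phi(k) - 1}.$$
  - For $|x| > k$:
                       $$IF(x) = \frac{k\,\mathrm{sgn}(x)}{2\Phi(k) - 1}.$$
  The previous result enters through the asymptotic variance of the estimator:
                       $$v = \frac{E_\Phi[\psi_k(x)^2]}{\left(E_\Phi[\psi_k'(x)]\right)^2} = \frac{2\left[k^2(1 - \Phi(k)) + \Phi(k) - 0.5 - k\phi(k)\right]}{(2\Phi(k) - 1)^2}.$$
  Since $|IF(x)| \le \frac{k}{2\Phi(k) - 1}$ for all x, the influence function is bounded, which indicates that the Huber
  estimator is robust to outliers.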
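  Identity (6) can also be sanity-checked numerically. The sketch below compares the closed form with a midpoint-rule approximation of $\int \psi_k(x)^2 \phi(x)\,dx$; the helper names, the integration limit `lim`, and the step count are assumptions made for this illustration only.

```python
import math


def phi(x):
    """Standard Gaussian density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard Gaussian CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def psi_huber(x, k):
    """Huber's psi: identity on [-k, k], clipped to +/- k outside."""
    return max(-k, min(k, x))


def closed_form(k):
    """Right-hand side of identity (6)."""
    return 2.0 * (k * k * (1.0 - Phi(k)) + Phi(k) - 0.5 - k * phi(k))


def numeric(k, lim=10.0, steps=200_000):
    """Midpoint-rule approximation of the integral over [-lim, lim]."""
    h = 2.0 * lim / steps
    total = 0.0
    for i in range(steps):
        x = -lim + (i + 0.5) * h
        total += psi_huber(x, k) ** 2 * phi(x)
    return total * h


if __name__ == "__main__":
    for k in (0.5, 1.0, 1.5, 2.0):  # illustrative tuning constants
        print(k, closed_form(k), numeric(k))
```

  The two columns should agree to several decimal places; as k grows, both tend to 1, the variance of the standard Gaussian.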
9. Show that if ψ is odd then the M-estimate µ̂ with fixed σ satisfies the following conditions:
     • If xi ≥ 0 for all i then µ̂ ≥ 0.
     • If xi = c for all i then µ̂ = c.
     • µ̂(−x) = −µ̂(x)
10. Show that L-estimates are shift and scale equivariant and also satisfy
       • If xi ≥ 0 for all i then µ̂ ≥ 0.
       • If xi = c for all i then µ̂ = c.
       • µ̂(−x) = −µ̂(x)
11. Let [a, b], where a and b depend on the data, be the shortest interval containing at least half of the data.
     (a) The Shorth (shortest half) location estimate is defined as the midpoint
                       $$\hat{\mu} = \frac{a + b}{2}$$                                              (10)
         Show that
                       $$\hat{\mu} = \arg\min_{\mu} \, \mathrm{Med}_{1 \le i \le n} |x_i - \mu|$$                                (11)
     (b) Show that the difference $b - a$ is a dispersion estimate.
     (c) For a distribution F, let [a, b] be the shortest interval with probability 0.5. Find this interval
         for $N(\mu, \sigma^2)$.
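    The definition of the Shorth can be made concrete with a small sketch. It assumes that "at least half of the data" means an interval containing ceil(n/2) observations (other conventions, such as floor(n/2) + 1, exist), and the function names `shortest_half` and `shorth` are hypothetical.

```python
import math


def shortest_half(xs):
    """Return (a, b), the shortest interval containing at least half of the data."""
    if not xs:
        raise ValueError("need at least one observation")
    s = sorted(xs)
    n = len(s)
    h = math.ceil(n / 2)  # assumed number of points the interval must cover
    # Candidate intervals are [s[i], s[i + h - 1]]; keep the shortest one.
    i = min(range(n - h + 1), key=lambda j: s[j + h - 1] - s[j])
    return s[i], s[i + h - 1]


def shorth(xs):
    """Midpoint of the shortest half, i.e. (a + b) / 2."""
    a, b = shortest_half(xs)
    return (a + b) / 2.0


if __name__ == "__main__":
    data = [1, 2, 3, 10, 20, 21]  # illustrative data
    print(shortest_half(data), shorth(data))
```

    Only windows of consecutive order statistics need to be examined, because any interval covering h points can be shrunk to one whose endpoints are data points.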
12. Let $\hat{\mu}$ be a location M-estimator. Show that if the distribution of the $x_i$'s is symmetric about $\mu$ then so
    is the distribution of $\hat{\mu}$, and that the same happens with trimmed means.
13. Recall that the Newton-Raphson procedure is a widely used iterative method for numerically solving
    non-linear equations. To solve $h(t) = 0$, at each iteration h is linearized, i.e. replaced by its
    Taylor expansion of order 1 about the current approximation. Thus, if at iteration m we have the
    approximation $t_m$, then the next value $t_{m+1}$ is the solution of
                       $$h(t_m) + h'(t_m)(t_{m+1} - t_m) = 0$$                                   (12)
    Geometrically, at every current estimate we draw the tangent to the curve $(t, h(t))$, and the updated
    estimate is the t-coordinate where this tangent cuts the t-axis. In the context of the location
    M-estimator, $h(\mu) = \sum_{i=1}^{n} \psi(x_i - \mu)$ and $h'(\mu) = -\sum_{i=1}^{n} \psi'(x_i - \mu)$, so the update is given by
                       $$\mu_{m+1} = \mu_m + \frac{\sum_{i=1}^{n} \psi(x_i - \mu_m)}{\sum_{i=1}^{n} \psi'(x_i - \mu_m)}$$                                        (13)
     (a) Argue that if the sequence $\{\mu_m\}$ converges then the limit is the solution to
                       $$\sum_{i=1}^{n} \psi(x_i - \mu) = 0$$                                     (14)
     (b) Can you find an example of ψ where the sequence does not converge?
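    The iteration can be sketched in a few lines. Writing $h(\mu) = \sum_i \psi(x_i - \mu)$, the derivative is $h'(\mu) = -\sum_i \psi'(x_i - \mu)$, so the Newton-Raphson step $\mu - h(\mu)/h'(\mu)$ adds the ratio $\sum_i \psi / \sum_i \psi'$ to the current value. The choice of Huber's $\psi$ with k = 1.5, the median as starting point, and all function names are assumptions made for illustration.

```python
import statistics


def psi_huber(r, k):
    """Huber's psi: identity on [-k, k], clipped to +/- k outside."""
    return max(-k, min(k, r))


def dpsi_huber(r, k):
    """Derivative of Huber's psi (taken as 1 on [-k, k] and 0 outside)."""
    return 1.0 if abs(r) <= k else 0.0


def newton_location(xs, k=1.5, tol=1e-10, max_iter=100):
    """Newton-Raphson iteration for sum(psi(x_i - mu)) = 0, started at the median."""
    mu = statistics.median(xs)
    for _ in range(max_iter):
        num = sum(psi_huber(x - mu, k) for x in xs)
        den = sum(dpsi_huber(x - mu, k) for x in xs)
        if den == 0.0:
            # Every residual lies in the flat part of psi: the step is undefined.
            raise ZeroDivisionError("sum of psi' is zero at the current iterate")
        step = num / den
        mu += step
        if abs(step) < tol:
            return mu
    raise RuntimeError("Newton-Raphson did not converge")


if __name__ == "__main__":
    data = [0.0, 1.0, 2.0, 4.0, 50.0]  # illustrative data with one outlier
    print(newton_location(data))
```

    The explicit failure branches are deliberate: the iteration is not guaranteed to converge for every $\psi$ and every starting point, which is the subject of part (b).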
14. Verify that the breakdown points of the standard deviation and the median absolute deviation about
    the median are 0 and $\frac{1}{2}$ respectively.
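    A small experiment illustrates the contrast, although it is not a proof. The sketch below replaces some observations by a very large value and compares the two dispersion estimates; the helper names `mad` and `contaminate`, the data, and the contaminating value are hypothetical.

```python
import statistics


def mad(xs):
    """Median absolute deviation about the median (no normalizing constant)."""
    med = statistics.median(xs)
    return statistics.median(abs(x - med) for x in xs)


def contaminate(xs, m, value):
    """Return a copy of xs with its last m observations replaced by value."""
    return list(xs[: len(xs) - m]) + [value] * m


if __name__ == "__main__":
    clean = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # illustrative data
    for m in (0, 1, 4):
        bad = contaminate(clean, m, 1e9)
        # A single outlier is enough to inflate the standard deviation,
        # while the MAD stays bounded as long as m is below n / 2.
        print(m, statistics.stdev(bad), mad(bad))
```

    In this example one replaced observation already drives the standard deviation to the order of the outlier, whereas the MAD remains of the order of the clean data for up to four replacements out of nine.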
15. Show that the asymptotic breakdown point of α-trimmed mean is α.
16. Show that the breakdown point of equivariant dispersion estimates is ≤ 0.5.
17. Let the density f (x) be a decreasing function of |x|. Show that the shortest interval covering a
    given probability is symmetric about zero. Use this result to calculate the influence function of the
    Shorth estimate for data with distribution f .
18. For the exponential family given by
                       $$f_\theta(x) = \frac{1}{\theta} e^{-x/\theta} I\{x \ge 0\}$$                                    (15)
    show that the estimate with smallest gross error sensitivity is $\frac{\mathrm{Med}\{x_i\}}{\log 2}$. Find its efficiency w.r.t.
    the MLE.