RobustStats Practice Problems

The document presents a series of problems and solutions related to robust statistics, including the effects of adding a value to a dataset on standard deviation and median, properties of the exponential distribution, and variance estimations for mean and median. It also discusses the influence function of the Huber estimator, conditions for M-estimates, and properties of L-estimates. Additionally, it covers topics such as the breakdown points of statistical measures and the asymptotic behavior of trimmed means.

Robust Statistics∗
Sravan Danda
November 29, 2024

1. Show that if a value $x_0$ with $-\infty < x_0 < \infty$ is added to a dataset $\{x_1, \dots, x_n\}$, then the standard deviation of the modified dataset ranges between a value smaller than the standard deviation of the original dataset and $\infty$.
2. Consider the situation of the previous problem.
(a) Show that if $n$ is even, the maximum change in the sample median when $x_0$ ranges from $-\infty$ to $\infty$ is the distance from the median of the original dataset to the next order statistic, the farthest from the median.
(b) What is the maximum change when $n$ is odd?
3. Show that the median of the exponential distribution is $\frac{\log 2}{\lambda}$ and hence that $\log 2$ divided by the sample median is a consistent estimator of $\lambda$. Here $\log$ denotes the natural logarithm.
Solution:
The cumulative distribution function (CDF) of the exponential distribution with rate parameter $\lambda$ is
$$F(x) = 1 - e^{-\lambda x}, \quad x \ge 0.$$

By definition, the median $m$ of the distribution satisfies $F(m) = 0.5$, so
$$1 - e^{-\lambda m} = 0.5.$$

Rearranging, we get
$$e^{-\lambda m} = 0.5.$$

Taking the natural logarithm of both sides:
$$-\lambda m = \ln(0.5).$$

Recognizing that $\ln(0.5) = -\ln 2$, we have
$$m = \frac{\ln 2}{\lambda}.$$

Thus, the population median of the exponential distribution is $\frac{\ln 2}{\lambda}$.
The sample median $m_n$ converges in probability to the population median $m$ as the sample size $n$ tends to infinity (the density is positive at $m$, so the median is unique). Since $t \mapsto \ln 2 / t$ is continuous at $m > 0$, the continuous mapping theorem gives $\ln 2 / m_n \to \ln 2 / m = \lambda$ in probability, which is the claimed consistency.
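The consistency claim can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: the rate value 2.0, the seed, the sample sizes and the helper name `rate_from_median` are hypothetical choices, not part of the original problem. It draws exponential samples of increasing size and prints $\log 2$ divided by the sample median.

```python
import math
import random
import statistics

def rate_from_median(sample):
    """Estimate the exponential rate as log(2) / sample median."""
    return math.log(2) / statistics.median(sample)

rng = random.Random(0)   # fixed seed so the run is reproducible
true_rate = 2.0          # hypothetical value of lambda for the illustration

for n in (100, 10_000, 100_000):
    # random.expovariate takes the rate parameter lambda directly
    sample = [rng.expovariate(true_rate) for _ in range(n)]
    print(n, round(rate_from_median(sample), 4))
```

The printed estimates should settle near the chosen rate as the sample size grows, although a single simulation run is not a proof of consistency.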
∗ These are selected problems from Robust Statistics: Theory and Methods, Ricardo A. Maronna, R. Douglas Martin and Victor J. Yohai, 2006, John Wiley and Sons.

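4. Let $F = (1 - \varepsilon) N(\mu, 1) + \varepsilon N(\mu, \tau^2)$. Show that
(a) The variance of the sample mean is given by
$$\mathrm{Var}(\bar{X}) = \frac{(1 - \varepsilon) + \varepsilon \tau^2}{n} \qquad (1)$$
(b) The variance of the sample median is approximately
$$\mathrm{Var}(\mathrm{Med}(X)) \approx \frac{\pi}{2n\,(1 - \varepsilon + \varepsilon/\tau)^2} \qquad (2)$$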
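The two variance formulas can be compared against a small Monte Carlo experiment. The following is a sketch under assumptions: $F$ is taken to be the contaminated normal $(1 - \varepsilon) N(\mu, 1) + \varepsilon N(\mu, \tau^2)$ with contamination fraction $\varepsilon$, and the values $\varepsilon = 0.1$, $\tau = 5$, $n = 101$, the number of replications and all function names are hypothetical illustration choices.

```python
import math
import random
import statistics

def var_mean_formula(eps, tau, n):
    """Equation (1): variance of the sample mean under F."""
    return ((1 - eps) + eps * tau**2) / n

def var_median_formula(eps, tau, n):
    """Equation (2): large-sample approximation to the variance of the sample median."""
    return math.pi / (2 * n * (1 - eps + eps / tau) ** 2)

def draw(rng, eps, tau, mu=0.0):
    """One observation from (1 - eps) N(mu, 1) + eps N(mu, tau^2)."""
    sd = tau if rng.random() < eps else 1.0
    return rng.gauss(mu, sd)

def simulate(eps, tau, n, reps, seed=0):
    """Monte Carlo variances of the sample mean and the sample median."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        x = [draw(rng, eps, tau) for _ in range(n)]
        means.append(statistics.fmean(x))
        medians.append(statistics.median(x))
    return statistics.variance(means), statistics.variance(medians)

eps, tau, n = 0.1, 5.0, 101   # hypothetical illustration values
sim_mean, sim_median = simulate(eps, tau, n, reps=2000)
print("mean:   formula", var_mean_formula(eps, tau, n), "simulated", sim_mean)
print("median: formula", var_median_formula(eps, tau, n), "simulated", sim_median)
```

Because (2) is an asymptotic approximation and the simulation has sampling error, only rough agreement should be expected for the median.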

5. Consider the family of Student's t distributions with $v$ degrees of freedom. The density is given by
$$f_v(x) = \frac{\Gamma\left(\frac{v+1}{2}\right)}{\sqrt{v\pi}\,\Gamma\left(\frac{v}{2}\right)} \left(1 + \frac{x^2}{v}\right)^{-\frac{v+1}{2}} \qquad (3)$$

This family contains all degrees of heavy-tailedness. When $v \to \infty$, the distribution tends to the standard Gaussian, and for $v = 1$ we have the Cauchy distribution. Find the values of $v$ for which the t distribution has finite moments of order $k$.
6. Show that if $\hat\mu$ is a solution of
$$\sum_{i=1}^{n} \psi(x_i - \hat\mu) = 0 \qquad (4)$$
then $\hat\mu + c$ is a solution of the same equation with $x_i + c$ instead of $x_i$. Here $\psi = \rho'$, where $\rho = -\log f_0$ and $f_0$ is the density of the probability distribution from which the samples are generated.
7. Show that if $X = \mu_0 + U$, where the distribution of $U$ is symmetric about 0, then $\mu_0$ is a solution of
$$E_F[\psi(X - \mu_0)] = 0 \qquad (5)$$

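8. Verify
$$E_\Phi[\psi_k(x)^2] = 2\left[k^2 (1 - \Phi(k)) + \Phi(k) - 0.5 - k\varphi(k)\right] \qquad (6)$$
where $\Phi$ and $\varphi$ denote the cumulative distribution function and the density function of the standard Gaussian, respectively, and $\psi_k$ is Huber's function defined by
$$\psi_k(x) = \begin{cases} x & \text{if } |x| \le k \\ \mathrm{sgn}(x)\,k & \text{if } |x| > k \end{cases} \qquad (7)$$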

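Solution:
To prove
$$E_\Phi[\psi_k(x)^2] = 2\left[k^2 (1 - \Phi(k)) + \Phi(k) - 0.5 - k\varphi(k)\right], \qquad (8)$$
where $\Phi$ and $\varphi$ denote the CDF and PDF of the standard Gaussian distribution, respectively, and $\psi_k(x)$ is the Huber function defined by
$$\psi_k(x) = \begin{cases} x, & \text{if } |x| \le k, \\ \mathrm{sgn}(x)\,k, & \text{if } |x| > k, \end{cases} \qquad (9)$$
we proceed as follows:
$$E_\Phi[\psi_k(x)^2] = \int_{-\infty}^{\infty} \psi_k(x)^2 \varphi(x)\,dx.$$

Since $\psi_k(x)$ behaves differently over $|x| \le k$ and $|x| > k$, split the integral:
$$E_\Phi[\psi_k(x)^2] = \int_{-k}^{k} x^2 \varphi(x)\,dx + \int_{|x|>k} k^2 \varphi(x)\,dx.$$

First part (for $|x| \le k$): the integrand is even, so
$$\int_{-k}^{k} x^2 \varphi(x)\,dx = 2 \int_{0}^{k} x^2 \varphi(x)\,dx.$$

Second part (for $|x| > k$): by the symmetry of $\varphi$,
$$\int_{|x|>k} k^2 \varphi(x)\,dx = 2k^2 \int_{k}^{\infty} \varphi(x)\,dx = 2k^2 (1 - \Phi(k)).$$

Using integration by parts for $\int_0^k x^2 \varphi(x)\,dx$, together with $\varphi'(x) = -x\varphi(x)$:
$$\int_{0}^{k} x^2 \varphi(x)\,dx = \left[-x\varphi(x)\right]_0^k + \int_{0}^{k} \varphi(x)\,dx = -k\varphi(k) + \Phi(k) - 0.5,$$
where $\varphi(k) = \frac{e^{-k^2/2}}{\sqrt{2\pi}}$ and $\Phi(0) = 0.5$. Doubling gives
$$\int_{-k}^{k} x^2 \varphi(x)\,dx = 2\left(\Phi(k) - k\varphi(k) - 0.5\right).$$

Substitute back to get:
$$E_\Phi[\psi_k(x)^2] = 2\left[k^2 (1 - \Phi(k)) + \Phi(k) - 0.5 - k\varphi(k)\right].$$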
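The closed form for $E_\Phi[\psi_k(x)^2]$ can be sanity-checked numerically. The following is a minimal sketch using only the Python standard library; the function names, the integration range $[-10, 10]$, the step count and the list of $k$ values are illustrative assumptions, not part of the original problem. It compares the right-hand side of (6) with a midpoint-rule approximation of the defining integral.

```python
import math

def phi(x):
    """Standard Gaussian density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def Phi(x):
    """Standard Gaussian CDF, written via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def huber_psi(x, k):
    """Huber's psi: identity on [-k, k], clipped to +/-k outside."""
    return max(-k, min(k, x))

def closed_form(k):
    """Right-hand side of equation (6)."""
    return 2 * (k * k * (1 - Phi(k)) + Phi(k) - 0.5 - k * phi(k))

def numeric(k, lim=10.0, steps=100_000):
    """Midpoint-rule approximation of the integral of psi_k(x)^2 phi(x) over [-lim, lim]."""
    h = 2 * lim / steps
    total = 0.0
    for i in range(steps):
        x = -lim + (i + 0.5) * h
        total += huber_psi(x, k) ** 2 * phi(x)
    return total * h

for k in (0.5, 1.0, 1.345, 2.0):   # illustrative cutoffs only
    print(k, closed_form(k), numeric(k))
```

For large $k$ the Huber function is the identity on essentially all of the mass, so both columns should approach 1, the variance of the standard Gaussian.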

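The influence function of the Huber estimator can be analyzed in the same way.
The influence function describes the effect of a small contamination at a point $x$ on the estimator. For a location M-estimator with fixed scale it is $\psi(x)$ divided by $E[\psi'(x)]$; the quantity $E_\Phi[\psi_k(x)^2]$ computed above is instead the numerator of the asymptotic variance $E_\Phi[\psi_k(x)^2] / \left(E_\Phi[\psi_k'(x)]\right)^2$. For the Huber estimator at the standard Gaussian, the influence function is given by:
$$IF(x) = \frac{\psi_k(x)}{E_\Phi[\psi_k'(x)]}.$$

Since $\psi_k'(x) = 1$ for $|x| < k$ and $\psi_k'(x) = 0$ for $|x| > k$, we have:
$$E_\Phi[\psi_k'(x)] = \Phi(k) - \Phi(-k) = 2\Phi(k) - 1.$$

Thus,
$$IF(x) = \frac{\psi_k(x)}{2\Phi(k) - 1}.$$

Evaluating $IF(x)$ in different regions:

- For $|x| \le k$:
$$IF(x) = \frac{x}{2\Phi(k) - 1}.$$

- For $|x| > k$:
$$IF(x) = \frac{k\,\mathrm{sgn}(x)}{2\Phi(k) - 1}.$$

This influence function is bounded in absolute value by $\frac{k}{2\Phi(k) - 1}$, which indicates that the Huber estimator is robust to outliers.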

9. Show that if $\psi$ is odd then the M-estimate $\hat\mu$ with fixed $\sigma$ satisfies the following conditions:
• If $x_i \ge 0$ for all $i$ then $\hat\mu \ge 0$.
• If $x_i = c$ for all $i$ then $\hat\mu = c$.
• $\hat\mu(-x) = -\hat\mu(x)$
10. Show that L-estimates are shift and scale equivariant and also satisfy
• If $x_i \ge 0$ for all $i$ then $\hat\mu \ge 0$.
• If $x_i = c$ for all $i$ then $\hat\mu = c$.
• $\hat\mu(-x) = -\hat\mu(x)$
11. Let $[a, b]$, where $a$ and $b$ depend on the data, be the shortest interval containing at least half of the data.
(a) The Shorth (shortest half) location estimate is defined as the midpoint
$$\hat\mu = \frac{a + b}{2} \qquad (10)$$
Show that
$$\hat\mu = \arg\min_{\mu} \, \mathrm{Med}_{1 \le i \le n} |x_i - \mu| \qquad (11)$$

(b) Show that the difference $b - a$ is a dispersion estimate.

(c) For a distribution $F$, let $[a, b]$ be the shortest interval with probability 0.5. Find this interval for $N(\mu, \sigma^2)$.
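The definition of the Shorth can be made concrete with a short computation. This is a sketch under an assumption: "at least half of the data" is read as $\lceil n/2 \rceil$ observations, and the function name `shorth` and the example data are hypothetical. Because the shortest interval containing a given number of points must start and end at data points, it is enough to slide a window over the sorted sample.

```python
import math

def shorth(data):
    """Return (a, b, midpoint) for the shortest interval holding at least half the data."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("shorth requires at least one observation")
    h = math.ceil(n / 2)   # assumed reading of "at least half of the data"
    # Slide a window of h consecutive order statistics and keep the shortest one.
    best = min(range(n - h + 1), key=lambda i: xs[i + h - 1] - xs[i])
    a, b = xs[best], xs[best + h - 1]
    return a, b, (a + b) / 2

# Hypothetical data with one gross outlier
a, b, mid = shorth([1.0, 2.0, 2.1, 2.2, 2.3, 9.0, 50.0])
print(a, b, mid)
```

In this example the window of four points from 2.0 to 2.3 is the shortest, so the estimate sits inside the main cluster and ignores the outlier at 50.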
12. Let $\hat\mu$ be a location M-estimator. Show that if the distribution of the $x_i$'s is symmetric about $\mu$ then so is the distribution of $\hat\mu$, and that the same holds for trimmed means.
13. Recall that the Newton-Raphson procedure is a widely used iterative method for numerically solving non-linear equations. To solve $h(t) = 0$, at each iteration $h$ is linearized, i.e. replaced by its Taylor expansion of order 1 about the current approximation. Thus, if at iteration $m$ we have the approximation $t_m$, then the next value $t_{m+1}$ is the solution of
$$h(t_m) + h'(t_m)(t_{m+1} - t_m) = 0 \qquad (12)$$
Geometrically, at every current estimate we draw the tangent to the curve $(t, h(t))$, and the updated estimate is the t-coordinate where this tangent cuts the t-axis. In the context of a location M-estimator, $h(\mu) = \sum_{i=1}^{n} \psi(x_i - \mu)$ and $h'(\mu) = -\sum_{i=1}^{n} \psi'(x_i - \mu)$, so the update is given by
$$\mu_{m+1} = \mu_m + \frac{\sum_{i=1}^{n} \psi(x_i - \mu_m)}{\sum_{i=1}^{n} \psi'(x_i - \mu_m)} \qquad (13)$$
(a) Argue that if the sequence $\{\mu_m\}$ converges then the limit is a solution of
$$\sum_{i=1}^{n} \psi(x_i - \mu) = 0 \qquad (14)$$

(b) Can you find an example of $\psi$ where the sequence does not converge?
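The Newton-Raphson iteration for a location M-estimator can be sketched in a few lines. Writing $h(\mu) = \sum_i \psi(x_i - \mu)$, so that $h'(\mu) = -\sum_i \psi'(x_i - \mu)$, the Newton step moves $\mu_m$ by $+\sum_i \psi(x_i - \mu_m) / \sum_i \psi'(x_i - \mu_m)$. The code below is an illustration under assumptions: it uses Huber's $\psi_k$ with unit scale, and the cutoff 1.345, the median starting value, the tolerance and all names are hypothetical choices rather than anything prescribed by the problem.

```python
import statistics

def huber_psi(r, k):
    """Huber's psi: identity on [-k, k], clipped to +/-k outside."""
    return max(-k, min(k, r))

def huber_psi_prime(r, k):
    """Derivative of Huber's psi (taken as 1 at the kinks |r| = k)."""
    return 1.0 if abs(r) <= k else 0.0

def newton_location(x, k=1.345, mu0=None, tol=1e-10, max_iter=100):
    """Newton-Raphson iteration for sum(psi(x_i - mu)) = 0 with unit scale."""
    mu = statistics.median(x) if mu0 is None else mu0
    for _ in range(max_iter):
        num = sum(huber_psi(xi - mu, k) for xi in x)
        den = sum(huber_psi_prime(xi - mu, k) for xi in x)
        if den == 0:
            # Every residual is clipped, so the tangent is flat and the step is undefined.
            raise ZeroDivisionError("sum of psi' is zero; Newton step undefined")
        step = num / den
        mu += step          # h'(mu) = -sum(psi'), hence the plus sign
        if abs(step) < tol:
            return mu
    raise RuntimeError("no convergence within max_iter iterations")

# Hypothetical data with one gross outlier
data = [0.0, 0.5, 2.0, 3.0, 50.0]
mu_hat = newton_location(data)
print(mu_hat, sum(huber_psi(xi - mu_hat, 1.345) for xi in data))
```

The second printed value is the left-hand side of the estimating equation at the returned point and should be essentially zero. The explicit guards for a zero denominator and for hitting `max_iter` are there because the iteration is not guaranteed to behave well for every $\psi$ and starting value.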
14. Verify that the breakdown points of the standard deviation and of the median absolute deviation about the median are 0 and $\frac{1}{2}$ respectively.
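The contrast between the two dispersion estimates can be seen on a toy dataset. The sketch below is an illustration, not a proof of the breakdown points: the data, the outlier values and the helper name `mad` are hypothetical, and the MAD is computed without any consistency factor. One observation is pushed further and further out while both estimates are recomputed.

```python
import statistics

def mad(x):
    """Median absolute deviation about the median (no consistency factor)."""
    med = statistics.median(x)
    return statistics.median([abs(xi - med) for xi in x])

clean = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]   # hypothetical data
for outlier in (7.0, 1e3, 1e6, 1e9):
    x = clean[:-1] + [outlier]                # replace the largest value
    print(outlier, statistics.stdev(x), mad(x))
```

A single moved observation makes the standard deviation grow without bound, while the MAD of this sample does not change.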
15. Show that the asymptotic breakdown point of α-trimmed mean is α.
16. Show that the breakdown point of equivariant dispersion estimates is ≤ 0.5.
17. Let the density f (x) be a decreasing function of |x|. Show that the shortest interval covering a
given probability is symmetric about zero. Use this result to calculate the influence function of the
Shorth estimate for data with distribution f .
18. For the exponential family given by
$$f_\theta(x) = \frac{1}{\theta} e^{-x/\theta} I\{x \ge 0\} \qquad (15)$$
show that the estimate with smallest gross error sensitivity is $\frac{\mathrm{Med}\{x_i\}}{\log 2}$. Find its efficiency w.r.t. the MLE.
