
1.3 Kolmogorov-Smirnov distance

Only for probability measures on R.

Kolm(µ, ν) := sup_{x∈R} |µ((−∞, x]) − ν((−∞, x])| ≤ TV(µ, ν).
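As a quick numerical aside (a minimal sketch, not part of the notes; it assumes numpy and scipy are available): when µ is the empirical measure of a sample and ν = N(0, 1), the one-sample Kolmogorov-Smirnov statistic computed by scipy.stats.kstest is exactly this supremum.

    # Sketch: Kolm(empirical measure of a sample, N(0,1)) via the KS statistic.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    sample = rng.standard_normal(10_000)      # mu = empirical measure of the sample
    res = stats.kstest(sample, "norm")        # statistic = sup_x |mu((-inf, x]) - Phi(x)|
    print(f"Kolm(mu, N(0,1)) = {res.statistic:.4f}")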

1.4 Facts

• All three distances defined above are stronger than weak convergence (i.e. convergence
in distribution, which is weak* convergence on the space of probability measures, seen
as a dual space). That is, if any of these metrics goes to zero as n → ∞, then we have
weak convergence. But the converse is not true. However, weak convergence is metrizable
(e.g. by the Prokhorov metric).

• Important coupling interpretation of total variation distance:

TV(µ, ν) = inf { P(X ≠ Y) : (X, Y) is a r.v. s.t. X ∼ µ, Y ∼ ν }

(i.e. infimum over all joint distributions with the given marginals.)

• Similarly, for µ, ν on the real line,

Wass(µ, ν) = inf { E|X − Y| : (X, Y) is a r.v. s.t. X ∼ µ, Y ∼ ν }.

(So it is often called Wass_1, because of the L¹ norm.)

• TV is a very strong notion, often too strong to be useful. Suppose X_1, X_2, . . . are
i.i.d. ±1 and S_n = Σ_{i=1}^n X_i. Then

S_n/√n =⇒ N(0, 1),

but TV(S_n/√n, Z) = 1 for all n, since S_n/√n is supported on a finite set. Both the
Wasserstein and Kolmogorov distances go to 0 at rate 1/√n (see the numerical sketch
after this list).
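To see these rates concretely, here is a small numerical sketch (mine, not from the notes; it assumes numpy and scipy). Writing S_n = 2B − n with B ∼ Binomial(n, 1/2), the CDF of S_n/√n is exactly computable, and the supremum defining Kolm is attained at the atoms, checking both sides of each jump. No computation is needed for TV: S_n/√n is supported on a finite set, which has zero Lebesgue measure, so TV(S_n/√n, Z) = 1.

    # Sketch: exact Kolmogorov distance between S_n/sqrt(n) and N(0,1) for the SRW.
    # The printed sqrt(n)*Kolm column stabilizes, illustrating the 1/sqrt(n) rate.
    import numpy as np
    from scipy import stats

    for n in [10, 100, 1000, 10_000]:
        k = np.arange(n + 1)                      # k = number of +1 steps, S_n = 2k - n
        x = (2 * k - n) / np.sqrt(n)              # atoms of S_n/sqrt(n)
        F = stats.binom.cdf(k, n, 0.5)            # CDF of S_n/sqrt(n) at each atom
        F_left = np.concatenate(([0.0], F[:-1]))  # left limits at each atom
        Phi = stats.norm.cdf(x)
        kolm = np.max(np.maximum(np.abs(F - Phi), np.abs(F_left - Phi)))
        print(f"n = {n:6d}   Kolm = {kolm:.5f}   sqrt(n)*Kolm = {np.sqrt(n) * kolm:.3f}")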

Lemma 1 Suppose W, Z are two r.v.'s and Z has a density w.r.t. Lebesgue measure bounded
by a constant C. Then Kolm(W, Z) ≤ 2√(C · Wass(W, Z)).

Proof: Consider a point t, and fix an ε > 0. Define two functions g_1 and g_2 as follows. Let
g_1(x) = 1 on (−∞, t], 0 on [t + ε, ∞), with linear interpolation in between. Let g_2(x) = 1 on
(−∞, t − ε], 0 on [t, ∞), with linear interpolation in between. Then g_1 and g_2 form upper
and lower ‘envelopes’ for 1_{(−∞,t]}. So

P(W ≤ t) − P(Z ≤ t) ≤ E g_1(W) − E g_1(Z) + E g_1(Z) − P(Z ≤ t).

Now E g_1(W) − E g_1(Z) ≤ (1/ε) Wass(W, Z) since g_1 is (1/ε)-Lipschitz, and E g_1(Z) − P(Z ≤ t) ≤ Cε since Z has density bounded by C.

Using g_2, the same bound holds for the other side, P(Z ≤ t) − P(W ≤ t). Hence Kolm(W, Z) ≤ (1/ε) Wass(W, Z) + Cε for every ε > 0; taking ε = √(Wass(W, Z)/C) gives the required bound. □
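A numerical sanity check of Lemma 1 (a sketch under my own choice of example, not from the notes): take W = Exp(1) − 1 and Z ∼ N(0, 1), whose density is bounded by C = 1/√(2π). Both distances can be read off the CDFs, using Kolm = sup |F_W − Φ| and, on the real line, Wass = ∫ |F_W − Φ| dx.

    # Sketch: check Kolm(W, Z) <= 2*sqrt(C * Wass(W, Z)) for W = Exp(1) - 1, Z = N(0,1).
    import numpy as np
    from scipy import stats

    x = np.linspace(-12.0, 12.0, 400_001)   # fine grid; tails beyond are negligible
    gap = np.abs(stats.expon.cdf(x, loc=-1.0) - stats.norm.cdf(x))
    kolm = gap.max()                        # sup_x |F_W(x) - Phi(x)|, approximated on the grid
    wass = gap.sum() * (x[1] - x[0])        # integral of |F_W - Phi| (Riemann sum)
    C = 1.0 / np.sqrt(2.0 * np.pi)          # bound on the N(0,1) density
    print(f"Kolm = {kolm:.4f}   2*sqrt(C*Wass) = {2.0 * np.sqrt(C * wass):.4f}")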

1.5 A stronger notion of distance


Exercise 1: Let S_n be a simple random walk (SRW): S_n = Σ_{i=1}^n X_i, with X_i i.i.d. ±1. Then

S_n/√n =⇒ Z ∼ N(0, 1).

The Berry-Esseen bound: Suppose X_1, X_2, . . . are i.i.d. with E(X_1) = 0, E(X_1²) = 1, and
E|X_1|³ < ∞. Then

Kolm(S_n/√n, Z) ≤ 3 E|X_1|³ / √n.
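A Monte Carlo sanity check of this bound (a sketch, not from the notes; the distribution and sample sizes are my choices): take X_i = E_i − 1 with E_i ∼ Exp(1), so that E X_1 = 0 and E X_1² = 1, and compare an empirical estimate of Kolm(S_n/√n, Z) against 3 E|X_1|³/√n.

    # Sketch: Monte Carlo check of the Berry-Esseen bound for X_i = Exp(1) - 1.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n, m = 50, 100_000                            # n summands, m Monte Carlo replicas
    X = rng.exponential(1.0, size=(m, n)) - 1.0   # E X = 0, E X^2 = 1
    W = X.sum(axis=1) / np.sqrt(n)                # m samples of S_n/sqrt(n)
    kolm = stats.kstest(W, "norm").statistic      # estimate of Kolm(S_n/sqrt(n), Z)
    rho = np.mean(np.abs(X) ** 3)                 # estimate of E|X_1|^3
    bound = 3.0 * rho / np.sqrt(n)                # 3 E|X_1|^3 / sqrt(n)
    print(f"Kolm ≈ {kolm:.4f}   Berry-Esseen bound ≈ {bound:.4f}")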

One can also show that for the SRW,

Wass(S_n/√n, Z) ≤ Const/√n.

This means that it is possible to construct S_n/√n and Z on the same space such that

E|S_n/√n − Z| ≤ C/√n.

Can we do it in the strong sense? That is:

P(|S_n/√n − Z| > t/√n) ≤ C e^{−ct}.

This is known as Tusnády’s Lemma. We will come back to this later.

2 Integration by parts for the Gaussian measure

The following result is sometimes called ‘Stein’s Lemma’.

Lemma 2 If Z ∼ N(0, 1), and f : R → R is an absolutely continuous function such that
E|f′(Z)| < ∞, then E Zf(Z) = E f′(Z).
Proof: First assume f has compact support contained in (a, b). Then the result follows
from integration by parts:

∫_a^b x f(x) e^{−x²/2} dx = [−f(x) e^{−x²/2}]_a^b + ∫_a^b f′(x) e^{−x²/2} dx,

and the boundary term vanishes since f(a) = f(b) = 0.

Now take any f s.t. E|Zf(Z)| < ∞, E|f′(Z)| < ∞, E|f(Z)| < ∞. Take a piecewise linear
function g that takes the value 1 on [−1, 1], 0 outside [−2, 2], and between 0 and 1 elsewhere. Let

f_n(x) := f(x) g(x/n).

Then clearly |f_n(x)| ≤ |f(x)| for all x, and f_n(x) → f(x) pointwise. Similarly, f_n′ → f′
pointwise. The rest follows by DCT, since each f_n has compact support. The last step is to
show that the finiteness of E|f′(Z)| implies the finiteness of the other two expectations.

Suppose E|f′(Z)| < ∞. Since |f(x)| ≤ |f(0)| + ∫_0^x |f′(y)| dy for x ≥ 0,

∫_0^∞ |x f(x)| e^{−x²/2} dx ≤ ∫_0^∞ x (|f(0)| + ∫_0^x |f′(y)| dy) e^{−x²/2} dx
                            = |f(0)| + ∫_0^∞ |f′(y)| (∫_y^∞ x e^{−x²/2} dx) dy
                            = |f(0)| + ∫_0^∞ |f′(y)| e^{−y²/2} dy < ∞,

using Fubini and ∫_y^∞ x e^{−x²/2} dx = e^{−y²/2}. The negative half-line is handled
symmetrically, so E|Zf(Z)| < ∞. Finiteness of E|f(Z)| follows from the inequality
|f(x)| ≤ sup_{|t|≤1} |f(t)| + |x f(x)|. □
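Before the exercise, a quick Monte Carlo illustration of the lemma (a sketch, not from the notes): with f(x) = sin(x), both E[Z f(Z)] and E[f′(Z)] = E[cos(Z)] equal e^{−1/2} ≈ 0.6065.

    # Sketch: Monte Carlo check of Stein's lemma E[Z f(Z)] = E[f'(Z)] for f = sin.
    import numpy as np

    rng = np.random.default_rng(0)
    Z = rng.standard_normal(1_000_000)
    print(f"E[Z sin(Z)] ≈ {np.mean(Z * np.sin(Z)):.4f}")   # should match the next line
    print(f"E[cos(Z)]   ≈ {np.mean(np.cos(Z)):.4f}")       # exact value: e^(-1/2) ≈ 0.6065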

Exercise 2: Find f s.t. E|Zf(Z)| < ∞ but E|f′(Z)| = ∞.

Next time, Stein’s method. Sketch:

Suppose you have a r.v. W and Z ∼ N(0, 1) and you want to bound

sup_{g∈D} |E g(W) − E g(Z)| ≤ sup_{f∈D′} |E[f′(W) − W f(W)]|.

The main difference between Stein’s method and characteristic functions is that Stein’s method
is a local technique: we transfer a global problem to a local problem. This is a theme that is
present in many branches of mathematics.

