
Lecture 12

This document provides an overview of nonparametric methods and kernel density estimation. It discusses how nonparametric methods involve approximation or smoothing techniques controlled by a bandwidth parameter. Kernel density estimation is introduced as a way to estimate an unknown density function from a sample by counting observations within a bandwidth of each point. The kernel function determines the shape of the weighting and common second-order kernels like Epanechnikov and Gaussian are described with their formulas and properties.

Uploaded by

amanmatharu22
Copyright
© Attribution Non-Commercial (BY-NC)

Lecture - 12

January 29, 2012

Introduction
Nonparametric methods typically involve some sort of approximation or smoothing method. Some of the main methods are called kernels, series, and splines. Nonparametric methods are typically indexed by a bandwidth or tuning parameter which controls the degree of complexity. The choice of bandwidth is often critical to implementation. Data-dependent rules for determining the bandwidth are therefore essential for nonparametric methods. Nonparametric methods which require a bandwidth, but lack an explicit data-dependent rule for selecting it, are incomplete. Unfortunately this is quite common, due to the difficulty in developing rigorous rules for bandwidth selection. Often in these cases the bandwidth is selected based on a related statistical problem. This is a feasible yet worrisome compromise. Many nonparametric problems are generalizations of univariate density estimation. We will start with this simple setting, and explore its considerable details.

Kernel Density Estimation


Discrete Estimator
Let $X$ be a random variable with continuous distribution $F(x)$ and density $f(x) = \frac{d}{dx} F(x)$. The goal is to estimate $f(x)$ from a random sample $\{X_1, \ldots, X_n\}$. The distribution function $F(x)$ is naturally estimated by the EDF

$$\hat F(x) = \frac{1}{n} \sum_{i=1}^{n} 1(X_i \le x).$$

It might seem natural to estimate the density $f(x)$ as the derivative of $\hat F(x)$, $\frac{d}{dx} \hat F(x)$, but this estimator would be a set of mass points, and as such is not a useful estimate of $f(x)$. Instead, consider a discrete derivative. For some small $h > 0$, let

$$\hat f(x) = \frac{\hat F(x + h) - \hat F(x - h)}{2h}.$$

We can write this as

$$\frac{1}{2nh} \sum_{i=1}^{n} 1(x - h < X_i \le x + h) = \frac{1}{2nh} \sum_{i=1}^{n} 1\left( \frac{|X_i - x|}{h} \le 1 \right) = \frac{1}{nh} \sum_{i=1}^{n} k\left( \frac{X_i - x}{h} \right),$$

where

$$k(u) = \begin{cases} \frac{1}{2}, & |u| \le 1 \\ 0, & |u| > 1 \end{cases}$$

is the uniform density function on $[-1, 1]$. The estimator $\hat f(x)$ counts the percentage of observations which are close to the point $x$. If many observations are near $x$, then $\hat f(x)$ is large. Conversely, if only a few $X_i$ are near $x$, then $\hat f(x)$ is small. The bandwidth $h$ controls the degree of smoothing. $\hat f(x)$ is a special case of what is called a kernel estimator. The general case is

$$\hat f(x) = \frac{1}{nh} \sum_{i=1}^{n} k\left( \frac{X_i - x}{h} \right),$$

where $k(u)$ is a kernel function.
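The equivalence between the discrete derivative of the EDF and the uniform-kernel estimator can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names (`edf`, `discrete_density`, `uniform_kernel`, `kernel_density`), the simulated normal sample, and the values of `x0` and `h` are illustrative assumptions and do not come from the lecture.

```python
import random

def edf(sample, x):
    """Empirical distribution function: share of observations <= x."""
    return sum(1 for xi in sample if xi <= x) / len(sample)

def discrete_density(sample, x, h):
    """Discrete derivative of the EDF: (F_hat(x+h) - F_hat(x-h)) / (2h)."""
    return (edf(sample, x + h) - edf(sample, x - h)) / (2.0 * h)

def uniform_kernel(u):
    """Uniform density on [-1, 1]: 1/2 inside the interval, 0 outside."""
    return 0.5 if abs(u) <= 1.0 else 0.0

def kernel_density(sample, x, h, kernel=uniform_kernel):
    """General kernel estimator: (1/(n h)) * sum_i k((X_i - x) / h)."""
    n = len(sample)
    return sum(kernel((xi - x) / h) for xi in sample) / (n * h)

# Hypothetical data: 500 standard normal draws (an assumption for illustration).
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(500)]
x0, h = 0.3, 0.4  # evaluation point and bandwidth, chosen arbitrarily

# The two estimators should coincide unless an observation lies exactly
# on the boundary x0 - h, which has probability zero for continuous data.
print(discrete_density(sample, x0, h), kernel_density(sample, x0, h))
```

Passing a different function as `kernel` turns the same routine into the general kernel estimator, which is why the uniform case is described as a special case.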

Kernel Functions
A kernel function $k(u) : \mathbb{R} \to \mathbb{R}$ is any function which satisfies $\int_{-\infty}^{\infty} k(u)\,du = 1$. A non-negative kernel satisfies $k(u) \ge 0$ for all $u$. In this case, $k(u)$ is a probability density function. The moments of a kernel are

$$\kappa_j(k) = \int_{-\infty}^{\infty} u^j k(u)\,du.$$

A symmetric kernel function satisfies $k(u) = k(-u)$ for all $u$. In this case, all odd moments are zero. Most nonparametric estimation uses symmetric kernels, and we focus on this case. The order of a kernel, $\nu$, is defined as the order of the first non-zero moment. For example, if $\kappa_1(k) = 0$ and $\kappa_2(k) > 0$ then $k$ is a second-order kernel and $\nu = 2$. If $\kappa_1(k) = \kappa_2(k) = \kappa_3(k) = 0$ and $\kappa_4(k) > 0$ then $k$ is a fourth-order kernel and $\nu = 4$. The order of a symmetric kernel is always even. Symmetric non-negative kernels are second-order kernels. A kernel is a higher-order kernel if $\nu > 2$. These kernels will have negative parts and are not probability densities. They are also referred to as bias-reducing kernels.

Common Second-order Kernels

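Kernel         Equation                                              $R(k)$                 $\kappa_2(k)$
Uniform        $k_0(u) = \frac{1}{2} 1(|u| \le 1)$                   $1/2$                  $1/3$
Epanechnikov   $k_1(u) = \frac{3}{4} (1 - u^2) 1(|u| \le 1)$         $3/5$                  $1/5$
Biweight       $k_2(u) = \frac{15}{16} (1 - u^2)^2 1(|u| \le 1)$     $5/7$                  $1/7$
Triweight      $k_3(u) = \frac{35}{32} (1 - u^2)^3 1(|u| \le 1)$     $350/429$              $1/9$
Gaussian       $k_4(u) = \frac{1}{\sqrt{2\pi}} e^{-u^2/2}$           $1/(2\sqrt{\pi})$      $1$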

In addition to each kernel's formula, we will discuss its roughness $R(k)$ and second moment $\kappa_2(k)$. The roughness of a function $g$ is

$$R(g) = \int_{-\infty}^{\infty} g^2(u)\,du.$$
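The roughness and second moment of each common second-order kernel can be approximated by numerical integration. The sketch below uses a simple midpoint rule from the Python standard library; the helper names (`integrate`, `roughness`, `moment`), the grid size, and the truncation of the Gaussian to $[-10, 10]$ are assumptions made for illustration, not part of the lecture.

```python
import math

# The five second-order kernels, written as plain Python functions.
def uniform(u):
    return 0.5 if abs(u) <= 1 else 0.0

def epanechnikov(u):
    return 0.75 * (1 - u * u) if abs(u) <= 1 else 0.0

def biweight(u):
    return 15 / 16 * (1 - u * u) ** 2 if abs(u) <= 1 else 0.0

def triweight(u):
    return 35 / 32 * (1 - u * u) ** 3 if abs(u) <= 1 else 0.0

def gaussian(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def integrate(g, a, b, m=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    step = (b - a) / m
    return step * sum(g(a + (i + 0.5) * step) for i in range(m))

def roughness(k, a, b):
    """R(k): integral of k(u)^2 over [a, b]."""
    return integrate(lambda u: k(u) ** 2, a, b)

def moment(k, j, a, b):
    """kappa_j(k): integral of u^j * k(u) over [a, b]."""
    return integrate(lambda u: u ** j * k(u), a, b)

# Integration ranges: the exact support for the compact kernels; the
# Gaussian is truncated to [-10, 10] (an assumption; its tails are negligible).
kernels = {
    "uniform": (uniform, -1.0, 1.0),
    "epanechnikov": (epanechnikov, -1.0, 1.0),
    "biweight": (biweight, -1.0, 1.0),
    "triweight": (triweight, -1.0, 1.0),
    "gaussian": (gaussian, -10.0, 10.0),
}

for name, (k, a, b) in kernels.items():
    print(name, round(roughness(k, a, b), 6), round(moment(k, 2, a, b), 6))
```

The first moment of each kernel, `moment(k, 1, a, b)`, should come out numerically zero, consistent with symmetry, while the positive second moment is what makes each of these a second-order kernel.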

The most commonly used kernels are the Epanechnikov and the Gaussian.
