
Nonparametric Density Estimation

October 1, 2018
Introduction

- If we can't fit a distribution to our data, then we use nonparametric density estimation.
- Start with a histogram.
- But there are problems with using histograms for density estimation.
- A better method is kernel density estimation.
- Let's consider an example in which we predict whether someone has diabetes based on their glucose concentration.
- We can also use kernel density estimation with naive Bayes or other probabilistic learners.
Introduction

- Plot of plasma glucose concentration (GLU) for a population of women who were at least 21 years old, of Pima Indian heritage and living near Phoenix, Arizona, with no evidence of diabetes:

[Figure: histogram of GLU for the no-diabetes group; x-axis GLU (0-250), y-axis Counts (0-14)]
Introduction

- Assume we want to determine if a person's GLU is abnormal.
- The population was tested for diabetes according to World Health Organization criteria.
- The data were collected by the US National Institute of Diabetes and Digestive and Kidney Diseases.
- First, are these data distributed normally?
- No, according to a χ² test of goodness of fit.
Histograms

- A histogram is a first (and rough) approximation to an unknown probability density function.
- We have a sample of n observations, X_1, ..., X_i, ..., X_n.
- An important parameter is the bin width, h.
- Effectively, it determines the width of each bar.
- We can have thick bars or thin bars, obviously.
- h determines how much we smooth the data.
- Another parameter is the origin, x_0.
- x_0 determines where we start binning data.
- This obviously affects the number of points in each bin.
- We can plot a histogram as
  - the number of items in each bin or
  - the proportion of the total for each bin.
Histograms

- We define the bins (intervals) as

    [x_0 + mh, \; x_0 + (m+1)h] \quad \text{for } m \in \mathbb{Z}

  (i.e., the integers).
- But for our purposes, it's best to plot the relative frequency

    \hat{f}(x) = \frac{1}{nh} \, (\text{number of } X_i \text{ in same bin as } x)

- Notice that this is the density estimate for x (a code sketch of this estimator follows below).
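As a minimal sketch of this estimator (the class and method names here are illustrative assumptions, not from the slides), the bin lookup and the count can be coded directly:

import java.util.List;

public class HistogramDensity {
    private final List<Double> X; // the sample X_1, ..., X_n
    private final double x0;      // origin: where binning starts
    private final double h;       // bin width

    public HistogramDensity( List<Double> X, double x0, double h ) {
        this.X = X;
        this.x0 = x0;
        this.h = h;
    }

    // Index m of the bin [x0 + m*h, x0 + (m+1)*h) containing x.
    private long bin( double x ) {
        return (long) Math.floor( (x - this.x0) / this.h );
    }

    // Relative-frequency estimate: (count of X_i in x's bin) / (n * h).
    public double getDensity( double x ) {
        long m = bin( x );
        int count = 0;
        for ( double xi : this.X ) {
            if ( bin( xi ) == m ) count++;
        }
        return count / ( this.X.size() * this.h );
    }
}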
Problems with Histograms

- One problem with using histograms as an estimate of the PDF is that there can be discontinuities.
- For example, if we have a bin with no counts, then its probability is zero.
- This is also a problem "at the tails" of the distribution, the left and right sides of the histogram.
- First off, with real PDFs, there are no impossible events (i.e., events with probability zero).
- There are only events with extremely small probabilities.
- The histogram is discrete, rather than continuous, so depending on the smoothing factor, there could be large jumps in the density with very small changes in x.
- And depending on the bin width, the density may not change at all with reasonably large changes to x.
Kernel Density Estimator: Motivation

- Research has shown that a kernel density estimator for continuous attributes improves the performance of naive Bayes over Gaussian distributions [John and Langley, 1995].
- KDE is more expensive in time and space than a Gaussian estimator, and the result is somewhat intuitive: if the data do not follow the distributional assumptions of your model, then performance can suffer.
- With KDE, we start with a histogram, but when we estimate the density of a value, we smooth the histogram using a kernel function.
- Again, start with the histogram.
- A generalization of the histogram method is to use a function to smooth the histogram.
- We get rid of discontinuities.
- If we do it right, we get a continuous estimate of the PDF.
Kernel Density Estimator
[McLachlan, 1992, Silverman, 1998]

- Given the sample X_i and the observation x,

    \hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left( \frac{x - X_i}{h} \right),

  where h is the window width, smoothing parameter, or bandwidth.
- K is a kernel function, such that

    \int_{-\infty}^{\infty} K(x) \, dx = 1

- One popular choice for K is the Gaussian kernel

    K(t) = \frac{1}{\sqrt{2\pi}} e^{-t^2/2}.

- One of the most important decisions is the bandwidth (h).
- We can just pick a number based on what looks good.
Kernel Density Estimator

[Figure: illustration of kernel density estimation. Source: https://en.wikipedia.org/wiki/Kernel_density_estimation]
Algorithm for KDE

- Representation: The sample X_i for i = 1, ..., n.
- Learning: Add a new sample to the collection.
- Performance:

    \hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left( \frac{x - X_i}{h} \right),

  where h is the window width, smoothing parameter, or bandwidth, and K is a kernel function, such as the Gaussian kernel

    K(t) = \frac{1}{\sqrt{2\pi}} e^{-t^2/2}.
Kernel Density Estimator

public double getProbability( double x ) {
    int n = this.X.size();
    double sum = 0.0;
    for ( int i = 0; i < n; i++ ) {
        // Add the kernel evaluated at the scaled distance from x to sample i.
        sum += Gaussian.pdf( (x - this.X.get(i)) / this.h );
    } // for
    return sum / ( n * this.h );
} // KDE::getProbability
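For illustration, here is a self-contained sketch of how such a KDE class might look end to end; the constructor, the gaussianPdf helper (standing in for the Gaussian.pdf call above), and the sample values are assumptions, not the original code.

import java.util.Arrays;
import java.util.List;

public class KDE {
    private final List<Double> X; // the sample X_1, ..., X_n
    private final double h;       // bandwidth

    public KDE( List<Double> X, double h ) {
        this.X = X;
        this.h = h;
    }

    // Standard normal pdf; stands in for the Gaussian.pdf helper used above.
    static double gaussianPdf( double t ) {
        return Math.exp( -0.5 * t * t ) / Math.sqrt( 2.0 * Math.PI );
    }

    public double getProbability( double x ) {
        double sum = 0.0;
        for ( double xi : this.X ) {
            sum += gaussianPdf( (x - xi) / this.h );
        }
        return sum / ( this.X.size() * this.h );
    }

    public static void main( String[] args ) {
        // Hypothetical GLU readings; h = 7.95 echoes the Silverman value used later.
        KDE kde = new KDE( Arrays.asList( 85.0, 99.0, 105.0, 110.0, 142.0 ), 7.95 );
        System.out.println( kde.getProbability( 100.0 ) ); // density estimate at GLU = 100
    }
}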
Automatic Bandwidth Selection

- Ideally, we'd like to set h based on the data.
- This is called automatic bandwidth selection.
- Silverman's [1998] rule-of-thumb method estimates h as

    \hat{h}_0 = \left( \frac{4 \hat{\sigma}^5}{3n} \right)^{1/5} \approx 1.06 \, \hat{\sigma} \, n^{-1/5},

  where σ̂ is the sample standard deviation and n is the number of samples (see the sketch after this list).
- Silverman's rule of thumb assumes that the kernel is Gaussian and that the underlying distribution is normal.
- This latter assumption may not be true, but we get a simple expression that evaluates in constant time, and it seems to perform well.
- Evaluating in constant time doesn't include the time it takes to compute σ̂, but we can compute σ̂ as we read the samples.
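A minimal sketch of the rule (assuming a List<Double> sample; the class and method names are illustrative assumptions, not from the slides):

import java.util.List;

final class Bandwidth {
    // Silverman's rule of thumb: h0 = (4*sigma^5 / (3n))^(1/5), roughly 1.06 * sigma * n^(-1/5).
    static double silverman( List<Double> X ) {
        int n = X.size();
        double mean = 0.0;
        for ( double xi : X ) mean += xi;
        mean /= n;
        double ss = 0.0;
        for ( double xi : X ) ss += ( xi - mean ) * ( xi - mean );
        double sigma = Math.sqrt( ss / ( n - 1 ) ); // sample standard deviation
        return Math.pow( 4.0 * Math.pow( sigma, 5.0 ) / ( 3.0 * n ), 0.2 );
    }
}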
Automatic Bandwidth Selection

- Sheather and Jones' [1991] solve-the-equation plug-in method is a bit more complicated.
- It's O(n²), and we have to solve a set of equations numerically, which could fail.
- It is regarded, theoretically and empirically, as the best method we have.
Simple KDE Example

- Determine if a person's GLU is abnormal.

[Figure: histogram of GLU for the no-diabetes group; x-axis GLU (0-250), y-axis Counts (0-14)]
Simple KDE Example

- Green line: Fixed value, h = 1
- Magenta line: Sheather and Jones' method, h = 1.5
- Blue line: Silverman's method, h = 7.95

[Figure: observations and estimated densities for the no-diabetes group under the three bandwidths; x-axis GLU (0-250), y-axis Est. Density (0-0.04)]
Simple KDE Example

- Assume h = 7.95.
- f̂(100) = 0.018
- f̂(250) = 3.3 × 10⁻¹⁴
- P(0 ≤ x ≤ 100) = \int_0^{100} \hat{f}(x) \, dx
- We can approximate this numerically by the sum \sum_{x=0}^{100} \hat{f}(x) \, \Delta x with step Δx = 1 (a code sketch follows below).
- P(0 ≤ x ≤ 100) ≈ 0.393
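A hedged sketch of that numerical approximation, reusing the illustrative KDE class from earlier (a left Riemann sum; the method name is an assumption):

// Approximate P(lo <= x <= hi), the integral of fhat(x) dx, by a Riemann sum.
static double probability( KDE kde, double lo, double hi, double dx ) {
    double p = 0.0;
    for ( double x = lo; x < hi; x += dx ) {
        p += kde.getProbability( x ) * dx; // density times interval width
    }
    return p;
}

// Usage: probability( kde, 0.0, 100.0, 1.0 ) gives roughly 0.393 with the GLU data and h = 7.95.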
Naive Bayes with KDEs

- Assume we have GLU measurements for women with and without diabetes.
- Plot of women with diabetes:

[Figure: histogram of GLU for the diabetes group; x-axis GLU (0-250), y-axis Counts (0-6)]
Naive Bayes with KDEs

- Plot of women without:

[Figure: histogram of GLU for the no-diabetes group; x-axis GLU (0-250), y-axis Counts (0-14)]
Naive Bayes with KDEs

- The task is to determine, given a woman's GLU measurement, whether it is more likely that she has diabetes or that she does not.
- For this, we can use Bayes' rule.
- Like before, we build a kernel density estimator for both sets of data.
Naive Bayes with KDEs

- Without diabetes:

[Figure: observations and estimated densities for the no-diabetes group with h = 1, Sheather (h = 1.5), and Silverman (h = 7.95); x-axis GLU (0-250), y-axis Est. Density (0-0.04)]

- Silverman's rule of thumb gives ĥ₀ = 7.95.
Naive Bayes with KDEs

- With diabetes:

[Figure: observations and estimated densities for the diabetes group with h = 1, Sheather (h = 1.5), and Silverman (h = 11.77); x-axis GLU (0-250), y-axis Est. Density (0-0.035)]

- Silverman's rule of thumb gives ĥ₁ = 11.77.
Naive Bayes with KDEs

- All together:

[Figure: the two estimated class-conditional densities on one plot; x-axis GLU (0-250), y-axis Est. Density (0-0.018)]
Naive Bayes with KDEs

- Now that we've built these kernel density estimators, they give us P(GLU | Diabetes = true) and P(GLU | Diabetes = false).
Naive Bayes with KDEs

- We now need to calculate the base rate, or the prior probability, of each class.
- There are 355 samples of women without diabetes and 177 samples of women with diabetes.
- Therefore,

    P(\text{Diabetes} = \text{true}) = \frac{177}{177 + 355} = 0.332

- And,

    P(\text{Diabetes} = \text{false}) = \frac{355}{177 + 355} = 0.668

- Or,

    P(\text{Diabetes} = \text{false}) = 1 - P(\text{Diabetes} = \text{true}) = 1 - 0.332 = 0.668
Naive Bayes with KDEs

- Bayes' rule (a code sketch follows below):

    P(D \mid \text{GLU}) = \frac{P(D) \, P(\text{GLU} \mid D)}{P(D) \, P(\text{GLU} \mid D) + P(\neg D) \, P(\text{GLU} \mid \neg D)}
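To show how the estimators and priors combine, here is a minimal sketch of the posterior computation; it reuses the illustrative KDE class from earlier, and the method name and arguments are assumptions, not the authors' code.

// Posterior probability of diabetes given GLU, via Bayes' rule over two KDEs.
static double posterior( KDE diseased, KDE healthy, double priorD, double glu ) {
    double likeD = diseased.getProbability( glu ); // P(GLU | D)
    double likeH = healthy.getProbability( glu );  // P(GLU | not D)
    double evidence = priorD * likeD + ( 1.0 - priorD ) * likeH;
    return priorD * likeD / evidence;              // P(D | GLU)
}

// Usage: posterior( kdeDiabetes, kdeNoDiabetes, 0.332, 175.0 ) gives roughly 0.854 with the slides' data.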
Naive Bayes with KDEs

- Plot of the posterior distribution:

[Figure: posterior probability of diabetes as a function of GLU; x-axis GLU (0-250), y-axis Probability (0-1)]
Naive Bayes with KDEs

- P(D | GLU = 50)?

    P(D \mid \text{GLU} = 50) = \frac{(0.332)(2.73 \times 10^{-5})}{(0.332)(2.73 \times 10^{-5}) + (0.668)(3.39 \times 10^{-4})} = 0.0385

- P(D | GLU = 175)?

    P(D \mid \text{GLU} = 175) = \frac{(0.332)(0.009)}{(0.332)(0.009) + (0.668)(7.65 \times 10^{-4})} = 0.854
References

G. H. John and P. Langley. Estimating continuous distributions in Bayesian classifiers. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, pages 338-345, San Francisco, CA, 1995. Morgan Kaufmann.

G. J. McLachlan. Discriminant Analysis and Statistical Pattern Recognition. John Wiley & Sons, New York, NY, 1992.

S. J. Sheather and M. C. Jones. A reliable data-based bandwidth selection method for kernel density estimation. Journal of the Royal Statistical Society, Series B (Methodological), 53(3):683-690, 1991.

B. W. Silverman. Density Estimation for Statistics and Data Analysis, volume 26 of Monographs on Statistics and Applied Probability. Chapman & Hall/CRC, Boca Raton, FL, 1998.
