k-nearest-neighbors regression
Here's a basic method to start us off: k-nearest-neighbors regression. We fix an integer $k \geq 1$ and define
$$
\hat{m}(x) = \frac{1}{k} \sum_{i \in N_k(x)} Y_i, \qquad (4)
$$
where $N_k(x)$ contains the indices of the $k$ closest points among $X_1, \ldots, X_n$ to $x$.
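To make (4) concrete, here is a minimal brute-force sketch in Python (the function name knn_regress and the toy data below are illustrative choices, not anything prescribed by the text):
\begin{verbatim}
import numpy as np

def knn_regress(x, X, Y, k):
    """k-nearest-neighbors estimate m_hat(x) as in (4).

    X: (n, d) array of inputs X_1, ..., X_n
    Y: (n,)   array of responses Y_1, ..., Y_n
    x: (d,)   query point
    k: number of neighbors
    """
    dists = np.linalg.norm(X - x, axis=1)   # Euclidean distances to x
    nearest = np.argsort(dists)[:k]         # indices in N_k(x)
    return Y[nearest].mean()                # average the k responses

# Toy usage: noisy observations of a sine curve on [0, 1]
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 1))
Y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(200)
print(knn_regress(np.array([0.25]), X, Y, k=10))   # roughly sin(pi/2) = 1
\end{verbatim}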
This is not at all a bad estimator, and you will find it used in lots of applications, in many cases probably because of its simplicity. By varying the number of neighbors $k$, we can achieve a wide range of flexibility in the estimated function $\hat{m}$, with small $k$ corresponding to a more flexible fit, and large $k$ less flexible.
But it does have its limitations, an apparent one being that the fitted function $\hat{m}$ essentially always looks jagged, especially for small or moderate $k$. Why is this? It helps to write
$$
\hat{m}(x) = \sum_{i=1}^n w_i(x) Y_i, \qquad (5)
$$
where the weights $w_i(x)$, $i = 1, \ldots, n$ are defined as
$$
w_i(x) = \begin{cases} 1/k & \text{if } X_i \text{ is one of the } k \text{ nearest points to } x, \\ 0 & \text{else.} \end{cases}
$$
Note that $w_i(x)$ is discontinuous as a function of $x$, and therefore so is $\hat{m}(x)$.
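As a toy numerical illustration of this discontinuity (the training points and responses below are made up for the example): with $k = 2$, an arbitrarily small move of $x$ across the point where the neighbor set $N_k(x)$ changes produces a jump in $\hat{m}(x)$.
\begin{verbatim}
import numpy as np

# Four training points on the line, with a step in the responses
X = np.array([0.0, 1.0, 2.0, 3.0])
Y = np.array([0.0, 0.0, 10.0, 10.0])

def knn_1d(x, k=2):
    nearest = np.argsort(np.abs(X - x))[:k]   # the k nearest training points
    return Y[nearest].mean()

# Just below x = 1 the two nearest points are {0.0, 1.0}; just above, {1.0, 2.0}
print(knn_1d(0.99))   # 0.0
print(knn_1d(1.01))   # 5.0 -- a tiny move in x changes the fit by 5
\end{verbatim}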
The representation (5) also reveals that the k-nearest-neighbors estimate is in a class of estimates we call linear smoothers, i.e., writing $Y = (Y_1, \ldots, Y_n) \in \mathbb{R}^n$, the vector of fitted values
$$
\hat{\mu} = \big(\hat{m}(X_1), \ldots, \hat{m}(X_n)\big) \in \mathbb{R}^n
$$
can simply be expressed as $\hat{\mu} = SY$, for a matrix $S \in \mathbb{R}^{n \times n}$ that depends on the inputs (and on $k$) but not on $Y$. (To be clear, this means that for fixed inputs $X_1, \ldots, X_n$, the vector of fitted values $\hat{\mu}$ is a linear function of $Y$; it does not mean that $\hat{m}(x)$ need behave linearly as a function of $x$.) This class is quite large, and contains many popular estimators, as we'll see in the coming sections.
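To see the linear-smoother structure explicitly, here is a sketch (the helper name knn_smoother_matrix is an illustrative choice) that builds the matrix $S$ for k-nearest-neighbors: row $i$ of $S$ holds the weights $w_1(X_i), \ldots, w_n(X_i)$ from (5), so that $\hat{\mu} = SY$.
\begin{verbatim}
import numpy as np

def knn_smoother_matrix(X, k):
    """n x n matrix S with S[i, j] = w_j(X_i): 1/k on the k nearest
    training points to X_i, and 0 elsewhere."""
    n = X.shape[0]
    S = np.zeros((n, n))
    for i in range(n):
        dists = np.linalg.norm(X - X[i], axis=1)
        S[i, np.argsort(dists)[:k]] = 1.0 / k
    return S

# For fixed inputs, the fitted values are a linear function of Y
rng = np.random.default_rng(0)
X = rng.uniform(size=(50, 2))
Y = rng.standard_normal(50)
S = knn_smoother_matrix(X, k=5)
mu_hat = S @ Y          # equals (m_hat(X_1), ..., m_hat(X_n))
print(S.sum(axis=1))    # every row sums to 1
\end{verbatim}
Note that $S$ is built from the inputs alone; this is exactly the sense in which the fit is linear in $Y$.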
The k-nearest-neighbors estimator is universally consistent, which means $\mathbb{E}\|\hat{m} - m_0\|_2^2 \to 0$ as $n \to \infty$, with no assumptions other than $\mathbb{E}(Y^2) < \infty$, provided that we take $k = k_n$ such that $k_n \to \infty$ and $k_n / n \to 0$; e.g., $k = \sqrt{n}$ will do. See Chapter 6.2 of Györfi et al. (2002).
Furthermore, assuming the underlying regression function $m_0$ is Lipschitz continuous, the k-nearest-neighbors estimate with $k \asymp n^{2/(2+d)}$ satisfies
$$
\mathbb{E}\|\hat{m} - m_0\|_2^2 \lesssim n^{-2/(2+d)}. \qquad (6)
$$
See Chapter 6.3 of Györfi et al. (2002). Later, we will see that this is optimal.
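The rate in (6) can be checked roughly by simulation. The sketch below is only illustrative: the particular Lipschitz function $m_0$, the noise level, and the sample sizes are assumptions made for the example, not taken from the text. It fits k-nearest-neighbors with $k \approx n^{2/(2+d)}$ for increasing $n$ and compares the empirical test error to $n^{-2/(2+d)}$.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
d = 2
m0 = lambda x: np.abs(x).sum(axis=-1)   # a Lipschitz function on [0, 1]^d

def knn_predict(X, Y, Xtest, k):
    # Brute-force k-NN regression at each test point
    dists = np.linalg.norm(Xtest[:, None, :] - X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return Y[nearest].mean(axis=1)

for n in [500, 2000, 8000]:
    k = max(1, round(n ** (2 / (2 + d))))   # k ~ n^{2/(2+d)}
    X = rng.uniform(size=(n, d))
    Y = m0(X) + 0.5 * rng.standard_normal(n)
    Xtest = rng.uniform(size=(500, d))
    err = np.mean((knn_predict(X, Y, Xtest, k) - m0(Xtest)) ** 2)
    print(n, k, err, n ** (-2 / (2 + d)))   # err should shrink at roughly this rate
\end{verbatim}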
Proof sketch: assume that $\mathrm{Var}(Y \mid X = x) = \sigma^2$, a constant, for simplicity, and fix (condition on) the training points. Using the bias-variance tradeoff,
$$
\begin{aligned}
\mathbb{E}\Big[\big(\hat{m}(x) - m_0(x)\big)^2\Big]
&= \underbrace{\big(\mathbb{E}[\hat{m}(x)] - m_0(x)\big)^2}_{\mathrm{Bias}^2(\hat{m}(x))}
 + \underbrace{\mathbb{E}\Big[\big(\hat{m}(x) - \mathbb{E}[\hat{m}(x)]\big)^2\Big]}_{\mathrm{Var}(\hat{m}(x))} \\
&= \bigg(\frac{1}{k} \sum_{i \in N_k(x)} \big(m_0(X_i) - m_0(x)\big)\bigg)^2 + \frac{\sigma^2}{k} \\
&\leq \bigg(\frac{L}{k} \sum_{i \in N_k(x)} \|X_i - x\|_2\bigg)^2 + \frac{\sigma^2}{k}.
\end{aligned}
$$
In the last line we used the Lipschitz property $|m_0(x) - m_0(z)| \leq L \|x - z\|_2$, for some constant $L > 0$. Now for ``most'' of the points we'll have $\|X_i - x\|_2 \leq C (k/n)^{1/d}$, for a constant $C > 0$. (Think of having input points $X_i$, $i = 1, \ldots, n$ spaced equally over (say) $[0,1]^d$.) Then our bias-variance upper bound becomes
$$
(CL)^2 \Big(\frac{k}{n}\Big)^{2/d} + \frac{\sigma^2}{k}.
$$
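A quick heuristic (not part of the formal argument) for where the $(k/n)^{1/d}$ spacing comes from, under the roughly-uniform picture above: a ball around $x$ containing about $k$ of the $n$ points must cover about a $k/n$ fraction of the volume of $[0,1]^d$, so its radius $r$ satisfies
$$
c_d\, r^d \approx \frac{k}{n}
\qquad \Longrightarrow \qquad
r \approx c_d^{-1/d} \Big(\frac{k}{n}\Big)^{1/d},
$$
where $c_d$ denotes the volume of the unit ball in $\mathbb{R}^d$ (a constant that gets absorbed into $C$).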
We can minimize this by balancing the two terms so that they are equal, giving $k^{1+2/d} \asymp n^{2/d}$, i.e., $k \asymp n^{2/(2+d)}$. Plugging this choice back in gives the error bound of $n^{-2/(2+d)}$, as claimed.
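Spelling out the balancing step (dropping the constants $C$, $L$, $\sigma$, which do not affect the rate):
$$
\Big(\frac{k}{n}\Big)^{2/d} \asymp \frac{1}{k}
\quad \Longleftrightarrow \quad
k^{1 + 2/d} \asymp n^{2/d}
\quad \Longleftrightarrow \quad
k \asymp n^{\frac{2/d}{1 + 2/d}} = n^{2/(2+d)},
$$
and with this choice both terms are of order $1/k \asymp n^{-2/(2+d)}$.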