K-Nearest Neighbor Algorithm
Presented by:
Saqlain Younas – Reg # SU-15-01-002-015
Bakhti Rehman – Reg # SU-15-01-002-017
Syed Tanveer Haider – Reg # SU-15-01-002-015
PRESENTATION OUTLINE
Introduction
Classification Approach
Working
How to Choose “K”?
Advantages
Disadvantages
Applications
References
Sarhad University
Peshawar
INTRODUCTION
KNN is a powerful classification algorithm used in machine learning.
It belongs to the supervised learning domain.
K-nearest neighbors stores all available cases (existing labeled data) and classifies new cases (test data) based on a similarity measure.
[1]
KNN: CLASSIFICATION APPROACH
Given M training vectors and a new point to classify, the KNN algorithm identifies the new point's “K” nearest neighbors among the training vectors, regardless of their labels.
The distance between points is measured using a distance function.
KNN: DISTANCE FUNCTIONS
To measure the distance between points A and B in a feature space, two major functions are used:
Euclidean Distance Function
Cosine Similarity Measure
Let “A” and “B” be represented by the feature vectors A = (x1, x2, x3, …, xm) and B = (y1, y2, y3, …, ym), where m is the dimensionality of the feature space.
KNN: DISTANCE FUNCTIONS (CONT…)
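Euclidean Distance Function: d(A, B) = sqrt((x1 - y1)^2 + (x2 - y2)^2 + … + (xm - ym)^2)
Cosine Similarity Measure: cos(A, B) = (x1·y1 + x2·y2 + … + xm·ym) / (||A|| · ||B||)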
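As a minimal sketch of the two measures, assuming feature vectors are plain Python tuples of equal length, both can be computed with the standard library alone. The function names and the sample vectors A and B are illustrative and do not come from the slides.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 2-dimensional feature vectors (m = 2).
A = (3.0, 4.0)
B = (6.0, 8.0)

dist_ab = euclidean_distance(A, B)  # B is 5 units away from A
sim_ab = cosine_similarity(A, B)    # B points in the same direction as A

print(dist_ab, sim_ab)
```

Note that a smaller Euclidean distance means "closer", while a larger cosine similarity means "more similar", so the two are ranked in opposite directions when picking neighbors.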
KNN: OTHER DISTANCE FUNCTIONS
Minkowski
Correlation
Chi-square
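Of the functions listed, the Minkowski distance is the easiest to illustrate because it generalizes the Euclidean distance through an order parameter p. The sketch below assumes tuple feature vectors and p >= 1; the function name and sample points are illustrative only.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p (p >= 1) between equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

# Hypothetical points used only for illustration.
A = (0.0, 0.0)
B = (3.0, 4.0)

manhattan = minkowski_distance(A, B, 1)  # p = 1: sum of absolute differences
euclid = minkowski_distance(A, B, 2)     # p = 2: the Euclidean distance

print(manhattan, euclid)
```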
WORKING (CLASSIFICATION)
Example
Let K = 3
Find the class of the new tumor.
[Figure: tumors plotted by Diameter (D) and Weight (W)]
Benign tumor (lower diameter and lower weight).
Malignant tumor (higher diameter and higher weight).
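The example can be sketched in code as follows. This is a minimal illustration, assuming Euclidean distance and a simple majority vote; the (diameter, weight) values are made up, since the slide's figure data is not available, and the function name knn_classify is hypothetical.

```python
import math
from collections import Counter

def knn_classify(training, query, k):
    """Return the majority label among the k training points nearest to query.

    training is a list of (feature_vector, label) pairs.
    """
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Made-up (diameter, weight) measurements, for illustration only.
tumors = [
    ((1.0, 1.2), "benign"),
    ((1.5, 1.0), "benign"),
    ((2.0, 1.8), "benign"),
    ((5.0, 5.5), "malignant"),
    ((5.5, 6.0), "malignant"),
    ((6.0, 5.0), "malignant"),
]

new_tumor = (2.2, 2.0)
predicted = knn_classify(tumors, new_tumor, k=3)
print(predicted)
```

With this data, the three nearest neighbors of the new tumor are all benign, so it is classified as benign.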
WORKING (CLASSIFICATION)
Example
Let K = 5
Find the class of the new tumor.
[Figure: tumors plotted by Diameter (D) and Weight (W)]
Benign tumor (lower diameter and lower weight).
Malignant tumor (higher diameter and higher weight).
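Changing K can change the predicted class for the same point. The sketch below counts the votes for K = 3 and K = 5 on a made-up data set in which the new tumor lies near the boundary between the two groups; the data values and the helper name neighbor_votes are assumptions for illustration.

```python
import math
from collections import Counter

def neighbor_votes(training, query, k):
    """Count the labels of the k training points closest to query."""
    ranked = sorted(training, key=lambda item: math.dist(item[0], query))
    return Counter(label for _, label in ranked[:k])

# Made-up (diameter, weight) measurements near the class boundary.
tumors = [
    ((2.8, 2.9), "benign"),
    ((3.3, 3.1), "benign"),
    ((1.0, 1.0), "benign"),
    ((3.5, 3.4), "malignant"),
    ((3.6, 3.6), "malignant"),
    ((3.8, 3.5), "malignant"),
]

new_tumor = (3.0, 3.0)
for k in (3, 5):
    votes = neighbor_votes(tumors, new_tumor, k)
    winner = votes.most_common(1)[0][0]
    print(f"K = {k}: votes = {dict(votes)}, class = {winner}")
```

Here K = 3 yields two benign votes against one malignant vote, while K = 5 yields two benign votes against three malignant votes, so the predicted class flips.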
HOW TO CHOOSE “K”?
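“K” must be a positive integer; an odd value avoids tied votes in a 2-class problem.
“K” should be large enough to reduce the effect of noisy points, but not so large that neighbors from other classes dominate the vote.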
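One common way to compare candidate values of K, which the slides do not describe, is leave-one-out evaluation: each labeled point is held out in turn, classified from the remaining points, and the fraction of correct predictions is recorded for each K. The sketch below assumes Euclidean distance and majority voting; the data and all function names are hypothetical.

```python
import math
from collections import Counter

def knn_predict(training, query, k):
    """Majority label among the k training points nearest to query."""
    ranked = sorted(training, key=lambda item: math.dist(item[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

def leave_one_out_accuracy(data, k):
    """Fraction of points classified correctly when each is held out in turn."""
    correct = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # every point except the held-out one
        if knn_predict(rest, point, k) == label:
            correct += 1
    return correct / len(data)

# Made-up (diameter, weight) data with one noisy benign point at (5.2, 5.1).
data = [
    ((1.0, 1.0), "benign"), ((1.5, 1.2), "benign"), ((1.2, 1.8), "benign"),
    ((2.0, 1.5), "benign"), ((5.2, 5.1), "benign"),
    ((5.0, 5.0), "malignant"), ((5.5, 5.5), "malignant"),
    ((6.0, 5.2), "malignant"), ((5.4, 4.6), "malignant"),
    ((4.8, 5.6), "malignant"),
]

# Odd candidates only, to avoid tied votes between the two classes.
scores = {k: leave_one_out_accuracy(data, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

When several values of K score equally, this sketch simply keeps the first one; a real study would usually evaluate on a larger data set before settling on K.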
ADVANTAGES OF KNN ALGORITHM
Simple to understand and easy to implement.
Can be applied to different types of data.
Classifies well when the number of samples is large enough.
A good first choice for a classification study when there is little or no prior knowledge about the distribution of the data.
[2]
DISADVANTAGES
Takes more time to classify a new example (the distance from the new example to every training vector must be calculated and compared).
Choosing K may be tricky.
Needs a large number of samples for accuracy.
Computationally expensive to find the K nearest neighbors when the dataset is very large.
[3]
APPLICATIONS
Text Mining
Agriculture
Finance
Medicine
Pattern Recognition
[6]
CONCLUSION
KNN is very easy to implement and effective.
K-nearest neighbor classification is a general-purpose classification technique.
K-NN classification is based on measuring the distance between the test data and each of the training data; the chosen distance function can affect the classification accuracy.
However, the classification process can be very expensive because it needs to compute the similarity value between the test example and each training example individually.
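The effect of the distance function can be seen even with K = 1. In the sketch below, assuming two made-up training points labeled "A" and "B", the Euclidean distance and the cosine similarity pick different nearest neighbors for the same query, so the predicted class differs. All names and values are illustrative.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Hypothetical training points: "A" lies far away but in the same direction
# as the query; "B" lies close by but in a different direction.
training = [((4.0, 4.0), "A"), ((1.5, 0.5), "B")]
query = (1.0, 1.0)

# Euclidean: the nearest neighbor has the SMALLEST distance.
nearest_euclid = min(training, key=lambda t: euclidean_distance(t[0], query))[1]
# Cosine: the nearest neighbor has the LARGEST similarity.
nearest_cosine = max(training, key=lambda t: cosine_similarity(t[0], query))[1]

print(nearest_euclid, nearest_cosine)
```

With this data the Euclidean distance selects "B" and the cosine similarity selects "A", which is why the distance function should be chosen to suit the features being compared.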
[4]
REFERENCES
[1] https://www.google.com/url?sa=i&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwiCxpb6xP7fAhXtMewKHW27BXAQjRx6BAgBEAU&url=https%3A%2F%2Fwww.researchgate.net%2Ffigure%2FKNN-algorithm-diagram_fig8_301299690&psig=AOvVaw0pp5LA9867vjhc54IN5--Y&ust=1548148225663608
[2] https://www.google.com/url?sa=i&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwibhIDBxf7fAhXIzKQKHWAsCZ0QjRx6BAgBEAU&url=http%3A%2F%2Fsystemprotection.in%2Fadvantages-high-efficiency-transformers%2F&psig=AOvVaw0OAawbEalwrI2PPmfPhkpy&ust=1548146639323306
[3] https://www.google.com/url?sa=i&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwj9w5PAxf7fAhUOyaQKHdhMAuYQjRx6BAgBEAU&url=https%3A%2F%2Fwww.dreamstime.com%2Fdisadvantage-red-stamp-isolated-white-background-disadvantage-red-stamp-image105120526&psig=AOvVaw2C2NyqQ6IrBqERPjwhXvwP&ust=1548146887293667
[4] https://www.google.com/url?sa=i&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwjs1JqWxv7fAhULKewKHXVkDe8QjRx6BAgBEAU&url=http%3A%2F%2Fmassagroup.co%2Fclip-art-data-and-conclusions.html&psig=AOvVaw3QwDp-4pZi11GkRJH_zGPI&ust=1548148875739266
[5] Sadegh Bafandeh Imandoust and Mohammad Bolandraftar, "Application of K-Nearest Neighbor (KNN) Approach for Predicting Economic Events: Theoretical Background", Department of Economics, Payame Noor University, Tehran, Iran. https://www.ijera.com/papers/Vol3_issue5/DI35605610.pdf
[6] https://familybuildingblocks.org/media/filer_public_thumbnails/filer_public/33/cc/33cca25c-c499-40b9-ac79-176fcb9029cd/applications.png__900x300_q85_crop_subsampling-2_upscale.png
REFERENCES (CONT…)
https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
https://machinelearningmastery.com/k-nearest-neighbors-for-machine-learning/
https://saravananthirumuruganathan.wordpress.com/2010/05/17/a-detailed-introduction-to-k-nearest-neighbor-knn-algorithm/
https://kevinzakka.github.io/2016/07/13/k-nearest-neighbor/
https://slideplayer.com/slide/7364467/?fbclid=IwAR3AHAvGXxInp-TvvZGvFQmHNP0yv92p96N6IQasXmLVcYIMnwjloWTenVM
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4978658/
Questions?
THANK YOU!