Kernels and the Kernel Trick
Martin Hofmann
Reading Club "Support Vector Machines"
Optimization Problem
maximize:

W(\alpha) = \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{m} \alpha_i \alpha_j y_i y_j \langle x_i \cdot x_j \rangle

subject to \alpha_i \geq 0, \; i = 1, \dots, m, \quad and \quad \sum_{i=1}^{m} \alpha_i y_i = 0
data not linearly separable in input space
map into some feature space where the data is linearly separable
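As a concrete reference, here is a minimal numpy sketch of evaluating the dual objective W(\alpha) above; the names dual_objective, alpha, X, y are illustrative, not from the slides.

```python
import numpy as np

def dual_objective(alpha, X, y):
    """Evaluate W(alpha) for data X (m x d) and labels y in {-1, +1}."""
    G = X @ X.T                  # inner products <x_i . x_j>
    return alpha.sum() - 0.5 * np.sum(np.outer(alpha, alpha) * np.outer(y, y) * G)
```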
Mapping Example
map data points into feature space with some function \Phi
e.g.:

\Phi: \mathbb{R}^2 \to \mathbb{R}^3
(x_1, x_2) \mapsto (z_1, z_2, z_3) := (x_1^2, \sqrt{2}\, x_1 x_2, x_2^2)

hyperplane \langle w \cdot z \rangle = 0, as a function of x:

w_1 x_1^2 + w_2 \sqrt{2}\, x_1 x_2 + w_3 x_2^2 = 0
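A minimal numpy sketch of this example (phi and w are illustrative names): a hyperplane in feature space evaluates to a quadratic function of the original coordinates.

```python
import numpy as np

def phi(x):
    """Feature map (x1, x2) -> (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

w = np.array([1.0, -0.5, 2.0])   # some weight vector in feature space (illustrative)
x = np.array([3.0, -1.0])

linear_in_feature_space = w @ phi(x)
quadratic_in_input_space = w[0] * x[0]**2 + w[1] * np.sqrt(2) * x[0] * x[1] + w[2] * x[1]**2
assert np.isclose(linear_in_feature_space, quadratic_in_input_space)
```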
Kernel Trick
solve maximisation problem using mapped data points
W(\alpha) = \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{m} \alpha_i \alpha_j y_i y_j \langle \Phi(x_i) \cdot \Phi(x_j) \rangle
Dual Representation of Hyperplane (from the primal Lagrangian):

f(x) = \langle w \cdot x \rangle + b = \sum_{i=1}^{m} \alpha_i y_i \langle x_i \cdot x \rangle + b, \quad with \; w = \sum_{i=1}^{m} \alpha_i y_i x_i

weight vector represented only by the data points
only inner products of data points are necessary, no coordinates
kernel function K(x_i, x_j) = \langle \Phi(x_i) \cdot \Phi(x_j) \rangle
the explicit mapping \Phi is not necessary any more
possible to operate in any n-dimensional feature space
complexity independent of the feature-space dimension (a minimal sketch of the resulting decision function follows below)
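A minimal sketch of the kernelized decision function; alpha and b are assumed to come from an already solved dual problem, and all names are illustrative.

```python
import numpy as np

def decision_function(x, X, y, alpha, b, K):
    """f(x) = sum_i alpha_i * y_i * K(x_i, x) + b, using only kernel evaluations."""
    return sum(a_i * y_i * K(x_i, x) for a_i, y_i, x_i in zip(alpha, y, X)) + b

# example kernel: the quadratic kernel from the next slide, K(x, z) = <x . z>^2
quadratic = lambda x, z: np.dot(x, z) ** 2
```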
Example Kernel Trick
\vec{x} = (x_1, x_2), \quad \vec{z} = (z_1, z_2), \quad K(x, z) = \langle \vec{x} \cdot \vec{z} \rangle^2

K(x, z) = \langle \vec{x} \cdot \vec{z} \rangle^2
= (x_1 z_1 + x_2 z_2)^2
= x_1^2 z_1^2 + 2 x_1 z_1 x_2 z_2 + x_2^2 z_2^2
= \langle (x_1^2, \sqrt{2}\, x_1 x_2, x_2^2) \cdot (z_1^2, \sqrt{2}\, z_1 z_2, z_2^2) \rangle
= \langle \Phi(\vec{x}) \cdot \Phi(\vec{z}) \rangle

mapping function fused in K
implicit \Phi(\vec{x}) = (x_1^2, \sqrt{2}\, x_1 x_2, x_2^2)
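A quick numeric check of this identity (a minimal sketch; phi is the implicit mapping named above):

```python
import numpy as np

phi = lambda v: np.array([v[0]**2, np.sqrt(2) * v[0] * v[1], v[1]**2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

k_direct = np.dot(x, z) ** 2        # kernel evaluated in input space
k_mapped = np.dot(phi(x), phi(z))   # inner product of explicitly mapped points
assert np.isclose(k_direct, k_mapped)
```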
Typical Kernels
Polynomial Kernel

K(x, z) = (\langle x \cdot z \rangle + c)^d, \quad for \; d > 0

Radial Basis Function (Gaussian Kernel)

K(x, z) = e^{-\frac{\|x - z\|^2}{2\sigma^2}}, \quad with \; \|x\| := \sqrt{\langle x \cdot x \rangle}

(Sigmoid Kernel)

K(x, z) = \tanh(\kappa \langle x \cdot z \rangle + \theta)

Inverse multi-quadric

K(x, z) = \frac{1}{\sqrt{\|x - z\|^2 / (2\sigma^2) + c^2}}
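Minimal numpy implementations of these kernels; the parameter names (c, d, sigma, kappa, theta) follow the formulas above and the defaults are purely illustrative.

```python
import numpy as np

def polynomial(x, z, c=1.0, d=2):
    return (np.dot(x, z) + c) ** d

def rbf(x, z, sigma=1.0):
    return np.exp(-np.sum((x - z) ** 2) / (2 * sigma ** 2))

def sigmoid(x, z, kappa=1.0, theta=0.0):
    return np.tanh(kappa * np.dot(x, z) + theta)

def inverse_multiquadric(x, z, sigma=1.0, c=1.0):
    return 1.0 / np.sqrt(np.sum((x - z) ** 2) / (2 * sigma ** 2) + c ** 2)
```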
Typical Kernels Cont.
Kernels for Sets A, A':

K_s(A, A') = \sum_{i=1}^{N} \sum_{j=1}^{N'} k(x_i, x'_j)

where k(x_i, x'_j) is a kernel on the elements of A and A' (see the sketch below)
Kernels for strings (Spectral Kernels) and trees
no one-size-fits-all kernel
in practice: model search and cross-validation
a low-degree polynomial or an RBF kernel is a good initial try
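A minimal sketch of the set kernel above, summing a base kernel over all element pairs (set_kernel and base are illustrative names):

```python
import numpy as np

def set_kernel(A, B, base):
    """K_s(A, B) = sum over all pairs (a in A, b in B) of base(a, b)."""
    return sum(base(a, b) for a in A for b in B)

# usage: compare two sets of 2-d points with a quadratic base kernel
quadratic = lambda a, b: np.dot(a, b) ** 2
A = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
B = [np.array([1.0, 1.0])]
print(set_kernel(A, B, quadratic))   # 1.0 + 1.0 = 2.0
```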
Kernel Properties
Symmetry

K(x, z) = \langle \Phi(x) \cdot \Phi(z) \rangle = \langle \Phi(z) \cdot \Phi(x) \rangle = K(z, x)

Cauchy-Schwarz Inequality

K(x, z)^2 = \langle \Phi(x) \cdot \Phi(z) \rangle^2 \leq \|\Phi(x)\|^2 \|\Phi(z)\|^2 = \langle \Phi(x) \cdot \Phi(x) \rangle \langle \Phi(z) \cdot \Phi(z) \rangle = K(x, x) K(z, z)
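A quick numeric sanity check of both properties, here for the Gaussian kernel (a minimal sketch):

```python
import numpy as np

rbf = lambda x, z, sigma=1.0: np.exp(-np.sum((x - z) ** 2) / (2 * sigma ** 2))

x = np.array([1.0, 2.0])
z = np.array([-0.5, 0.3])

assert np.isclose(rbf(x, z), rbf(z, x))            # symmetry
assert rbf(x, z) ** 2 <= rbf(x, x) * rbf(z, z)     # Cauchy-Schwarz
```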
Making Kernels from Kernels
create complex Kernels by combining simpler ones
Closure Properties:
K(x, z) = c \cdot K_1(x, z)
K(x, z) = c + K_1(x, z)
K(x, z) = K_1(x, z) + K_2(x, z)
K(x, z) = K_1(x, z) \cdot K_2(x, z)
K(x, z) = f(x) f(z)

if K_1 and K_2 are kernels, f : X \to \mathbb{R}, and c > 0
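These closure properties translate directly into small kernel combinators; a minimal sketch (function names are illustrative):

```python
def scale(k1, c):
    """c * K1(x, z), for c > 0."""
    return lambda x, z: c * k1(x, z)

def shift(k1, c):
    """c + K1(x, z), for c > 0."""
    return lambda x, z: c + k1(x, z)

def add(k1, k2):
    """K1(x, z) + K2(x, z)."""
    return lambda x, z: k1(x, z) + k2(x, z)

def multiply(k1, k2):
    """K1(x, z) * K2(x, z)."""
    return lambda x, z: k1(x, z) * k2(x, z)

def from_function(f):
    """f(x) * f(z) for a real-valued function f."""
    return lambda x, z: f(x) * f(z)
```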
Gram Matrix
Kernel function as similarity measure between input objects
Gram Matrix (Similarity/Kernel Matrix) represents similarities between
input vectors
let V = \{\vec{v}_1, \dots, \vec{v}_n\} be a set of input vectors; then the Gram Matrix K is defined as:

K = \begin{pmatrix}
\langle \Phi(\vec{v}_1) \cdot \Phi(\vec{v}_1) \rangle & \cdots & \langle \Phi(\vec{v}_1) \cdot \Phi(\vec{v}_n) \rangle \\
\vdots & \ddots & \vdots \\
\langle \Phi(\vec{v}_n) \cdot \Phi(\vec{v}_1) \rangle & \cdots & \langle \Phi(\vec{v}_n) \cdot \Phi(\vec{v}_n) \rangle
\end{pmatrix}
K is symmetric and positive semi-definite (non-negative eigenvalues)
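A minimal sketch that computes a Gram Matrix and checks both properties (gram_matrix is an illustrative name):

```python
import numpy as np

def gram_matrix(V, kernel):
    """K[i, j] = kernel(v_i, v_j) for all pairs of input vectors in V."""
    n = len(V)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(V[i], V[j])
    return K

rbf = lambda x, z, sigma=1.0: np.exp(-np.sum((x - z) ** 2) / (2 * sigma ** 2))
V = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 2.0])]

K = gram_matrix(V, rbf)
assert np.allclose(K, K.T)                      # symmetric
assert np.all(np.linalg.eigvalsh(K) >= -1e-10)  # positive semi-definite
```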
Mercer's Theorem
assume:
finite input space X = \{x_1, \dots, x_n\}
symmetric function K(x, z) on X
Gram Matrix K = (K(x_i, x_j))_{i,j=1}^{n}

since K is symmetric, there exists an orthogonal matrix V s.t. K = V \Lambda V'
\Lambda diagonal, containing the eigenvalues \lambda_t of K,
and the eigenvectors v_t = (v_{ti})_{i=1}^{n} as columns of V

all eigenvalues are non-negative; let the feature mapping be

\Phi: x_i \mapsto \left( \sqrt{\lambda_t}\, v_{ti} \right)_{t=1}^{n} \in \mathbb{R}^n, \quad i = 1, \dots, n

then

\langle \Phi(x_i) \cdot \Phi(x_j) \rangle = \sum_{t=1}^{n} \lambda_t v_{ti} v_{tj} = (V \Lambda V')_{ij} = K_{ij} = K(x_i, x_j)
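A minimal numpy sketch of this construction: eigendecompose a Gram Matrix and rebuild it from the explicit feature vectors (the matrix K here is an arbitrary illustrative spsd matrix):

```python
import numpy as np

K = np.array([[2.0, 1.0, 0.0],      # any symmetric, positive semi-definite matrix
              [1.0, 2.0, 1.0],      # serves as a Gram Matrix on a finite input space
              [0.0, 1.0, 2.0]])

lam, V = np.linalg.eigh(K)          # K = V diag(lam) V'
assert np.all(lam >= -1e-10)        # non-negative eigenvalues

Phi = V * np.sqrt(np.clip(lam, 0, None))   # row i is the feature vector Phi(x_i)
assert np.allclose(Phi @ Phi.T, K)         # <Phi(x_i) . Phi(x_j)> = K_ij
```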
Mercer's Theorem Cont.
every Gram Matrix is symmetric and positive semi-definite
every spsd (symmetric, positive semi-definite) matrix can be regarded as a Kernel Matrix, i.e. as an inner product matrix in some space
a diagonal matrix satisfies Mercer's criteria, but is not a good Gram Matrix
self-similarity dominates between-sample similarity
it represents mutually orthogonal samples
generalization exists for infinite input spaces
eigenvectors of the data in feature space can be used to detect directions of maximum variance
kernel principal component analysis (kernel PCA), sketched below
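A minimal kernel-PCA sketch using the usual centering of the Gram Matrix (all names illustrative):

```python
import numpy as np

def kernel_pca(K, n_components=2):
    """Project the n samples behind the Gram Matrix K onto the top
    principal directions in feature space."""
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one   # center the kernel matrix
    lam, V = np.linalg.eigh(Kc)                  # eigenvalues in ascending order
    lam, V = lam[::-1], V[:, ::-1]               # sort descending
    lam = np.clip(lam[:n_components], 1e-12, None)
    return V[:, :n_components] * np.sqrt(lam)    # one row of projections per sample
```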
Summary
a kernel calculates the dot product of mapped data points without an explicit mapping function
it is represented by a symmetric, positive semi-definite Gram Matrix
the Gram Matrix fuses information about the data and the kernel
standard kernels are chosen in practice via cross-validation
every similarity matrix satisfying Mercer's criteria can be used as a kernel
ongoing research on estimating the Kernel Matrix from the available data