
A complete and in-depth guide on Support Vector Machines (SVM) for the PG-DBDA Practical Machine Learning module. These notes cover everything from basic concepts to exam-ready technical details, with highlights for MCQs, theory, and lab viva questions.

🧠 SUPPORT VECTOR MACHINE (SVM) – COMPLETE NOTES

✅ 1. What is SVM?
• A supervised learning algorithm used for both classification and regression.
• Finds the separating hyperplane that maximizes the margin between classes.
• Effective in high-dimensional spaces and for non-linear classification (with the kernel trick).

✅ 2. Key Concepts
• Hyperplane: the decision boundary that separates the classes.
• Margin: the distance between the hyperplane and the nearest points (the support vectors).
• Support Vectors: the data points closest to the hyperplane; they determine its position and orientation (illustrated in the sketch below).
• Maximum Margin: SVM chooses the hyperplane with the largest margin.
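A minimal sketch (using scikit-learn and a toy dataset, not part of the original notes) showing how a fitted SVC exposes its support vectors:

from sklearn.svm import SVC
from sklearn.datasets import make_blobs

# Two well-separated blobs -> a linearly separable toy problem
X, y = make_blobs(n_samples=100, centers=2, random_state=42)

model = SVC(kernel='linear', C=1.0)
model.fit(X, y)

# Only these points pin down the hyperplane; moving any other
# training point (outside the margin) would not change the boundary.
print(model.support_vectors_)         # coordinates of the support vectors
print(model.n_support_)               # count of support vectors per class
print(model.coef_, model.intercept_)  # w and b of the learned hyperplane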

✅ 3. SVM Objective (Hard Margin)
For linearly separable data:

\text{Minimize } \frac{1}{2}\|w\|^2 \quad \text{subject to } y_i(w \cdot x_i + b) \geq 1 \ \ \forall i

Where:
• w: weight vector
• b: bias
• x_i: data points
• y_i ∈ {−1, +1}: class labels
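One step the notes leave implicit: why minimizing \frac{1}{2}\|w\|^2 maximizes the margin. The two margin boundaries are w \cdot x + b = +1 and w \cdot x + b = -1, and the perpendicular distance between them is

\text{margin width} = \frac{(+1) - (-1)}{\|w\|} = \frac{2}{\|w\|}

so a smaller \|w\| means a wider margin; the \frac{1}{2} and the square are there only to make the optimization convenient.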

✅ 4. Soft Margin SVM (C parameter)
Used when the data is not perfectly separable:

\text{Minimize } \frac{1}{2}\|w\|^2 + C \sum_i \xi_i \quad \text{subject to } y_i(w \cdot x_i + b) \geq 1 - \xi_i, \ \xi_i \geq 0

• ξ_i: slack variables (one per point, measuring its margin violation)
• C: regularization parameter
• Large C → low bias, high variance (risk of overfitting)
• Small C → high bias, low variance (risk of underfitting)
🧠 MCQ Tip: C is the penalty for misclassification (see the demo below).
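A small illustration (toy setup assumed, not from the original notes) of the bias-variance effect of C: smaller C tolerates more margin violations and typically retains more support vectors.

from sklearn.svm import SVC
from sklearn.datasets import make_blobs

# Overlapping blobs, so the data is not perfectly separable
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=0)

for C in (0.01, 1.0, 100.0):
    model = SVC(kernel='linear', C=C).fit(X, y)
    # Small C -> wide margin, many tolerated violations, many support vectors.
    # Large C -> narrow margin, violations punished hard, fewer support vectors.
    print(f"C={C}: {model.n_support_.sum()} support vectors")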

✅ 5. Kernel Trick
SVM handles non-linear data by using kernels to implicitly map it into a higher-dimensional space.
Common kernels:
• Linear: K(x, x') = x \cdot x'. Use when the data is linearly separable.
• Polynomial: K(x, x') = (x \cdot x' + c)^d. Use when feature interactions are important.
• RBF (Gaussian): K(x, x') = \exp(-\gamma \|x - x'\|^2). The most commonly used.
• Sigmoid: K(x, x') = \tanh(\alpha x \cdot x' + c). Rare; inspired by neural nets.

γ (gamma) controls the shape of the decision boundary in the RBF kernel:
• High γ → overfitting (tight boundaries)
• Low γ → underfitting (loose boundaries)
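A hedged sketch of the kernel trick in action: on a non-linear dataset such as make_moons, a linear SVM plateaus while an RBF SVM separates the classes (exact scores depend on the split).

from sklearn.svm import SVC
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for kernel in ('linear', 'rbf'):
    model = SVC(kernel=kernel, gamma='scale').fit(X_train, y_train)
    # The RBF kernel corresponds to an implicit infinite-dimensional mapping,
    # yet no explicit feature transformation is ever computed.
    print(kernel, model.score(X_test, y_test))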

✅ 6. SVM for Regression (SVR)
• Predicts a continuous value within an ε-margin of tolerance:

|y_i - f(x_i)| \leq \epsilon

• Only points outside the ε-tube contribute to the loss function.
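A minimal SVR sketch (toy sine-wave data assumed) showing the ε-tube in use: points inside the tube contribute no loss.

import numpy as np
from sklearn.svm import SVR

rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(80, 1), axis=0)     # 80 points in [0, 5)
y = np.sin(X).ravel() + 0.1 * rng.randn(80)  # noisy sine targets

# epsilon sets the half-width of the no-penalty tube around the prediction
model = SVR(kernel='rbf', C=1.0, epsilon=0.1)
model.fit(X, y)
print(model.predict(X[:5]))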
✅ 7. Advantages
✅ Effective in high-dimensional spaces
✅ Works well when there is a clear margin of separation
✅ Memory efficient (uses only the support vectors)
✅ Versatile (supports kernels)

✅ 8. Disadvantages
❌ Not suitable for very large datasets (slow training)
❌ Requires feature scaling
❌ Choosing the right kernel and parameters is tricky
❌ Performs poorly on noisy data and overlapping classes

✅ 9. SVM vs Logistic Regression

Feature                  | SVM                        | Logistic Regression
Type                     | Maximum-margin classifier  | Probabilistic classifier
Output                   | Class label only           | Class probability
Works on linear data     | ✅ Yes                     | ✅ Yes
Works on non-linear data | ✅ With kernels            | ❌ Needs manual transformation
Feature scaling required | ✅ Yes                     | ✅ Yes
Handles outliers         | ❌ Sensitive               | ✅ Moderately robust

✅ 10. Feature Scaling Required
• SVM is distance-based → scale features with StandardScaler or MinMaxScaler.
• Scaling matters especially for the RBF kernel (see the pipeline sketch below).
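A common pattern (one sketch among several valid ones) is to bundle the scaler and the SVM in a Pipeline, so the scaler is fitted on training data only and applied identically at predict time:

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# The pipeline fits the scaler on the training data only, which
# prevents test-set information from leaking into the scaling step.
clf = make_pipeline(StandardScaler(), SVC(kernel='rbf', gamma='scale'))
# clf.fit(X_train, y_train); clf.predict(X_test)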

✅ 11. Parameter Tuning

Parameter | Role
C         | Regularization strength
γ (gamma) | Kernel coefficient (RBF and other non-linear kernels)
kernel    | Type of kernel ('linear', 'rbf', 'poly', 'sigmoid')

Use GridSearchCV or RandomizedSearchCV for tuning, as in the sketch below.
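A minimal GridSearchCV sketch over C, γ, and kernel (the grid values here are illustrative, not prescriptive):

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': ['scale', 0.01, 0.1, 1],  # ignored by the linear kernel
    'kernel': ['linear', 'rbf'],
}
# 5-fold cross-validation over every combination in the grid
search = GridSearchCV(SVC(), param_grid, cv=5)
# search.fit(X_scaled, y); print(search.best_params_, search.best_score_)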


✅ 12. Scikit-learn SVM Example

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # any feature matrix X and labels y work here
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Fit the scaler on the training data only, then reuse it for the test set
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = SVC(kernel='rbf', C=1.0, gamma='scale')
model.fit(X_train_scaled, y_train)
predictions = model.predict(X_test_scaled)

✅ 13. SVM Works Best When
• The number of features exceeds the number of samples (text/NLP, genomics)
• There is a clear margin between the classes
• High accuracy is needed in binary classification

✅ 14. SVM Struggles When
• The dataset is large (training is slow)
• The data is noisy or the classes overlap
• Output probabilities are needed
• There are many irrelevant features (feature selection needed)

✅ 15. SVM Evaluation Metrics
• Classification: Accuracy, Precision, Recall, F1, ROC-AUC (see the example below)
• Regression (SVR): MSE, RMSE, MAE, R²
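A short example (reusing y_test and predictions from the Section 12 example) of computing the classification metrics with scikit-learn:

from sklearn.metrics import classification_report

# Precision, recall, and F1 for every class in one report
print(classification_report(y_test, predictions))
# ROC-AUC needs scores rather than labels; for a binary SVC use
# roc_auc_score(y_test, model.decision_function(X_test_scaled))
# (multiclass problems need per-class scores and multi_class='ovr')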

✅ 16. Common MCQ Points
• Support vectors are the data points that determine the hyperplane.
• C controls regularization.
• The kernel trick avoids computing high-dimensional features explicitly.
• The RBF kernel is the most widely used.
• γ applies to the RBF, polynomial, and sigmoid kernels, not the linear kernel.
• SVM needs feature scaling (unlike decision trees).

✅ 17. SVM vs Decision Tree

Feature          | SVM            | Decision Tree
Non-linear data  | ✅ With kernel | ✅ Yes
Feature scaling  | ✅ Needed      | ❌ Not needed
Interpretability | ❌ Low         | ✅ High
Overfitting risk | Medium         | High (if unpruned)
Handles outliers | ❌ Poor        | ✅ Good

✅ 18. Applications of SVM
• Handwritten digit recognition (MNIST)
• Text classification (spam vs ham)
• Bioinformatics (gene classification)
• Image classification
• Fraud detection

📝 Summary: When to Use SVM

Scenario                     | Use SVM?
Binary classification        | ✅ Yes
Non-linear separation        | ✅ Yes (with kernel)
Text classification (sparse) | ✅ Yes
Large dataset                | ❌ No (training is slow)
Need probabilities           | ❌ Use logistic regression
Interpretability needed      | ❌ Poor
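One caveat on the "need probabilities" row: scikit-learn's SVC can emit calibrated probabilities via probability=True (Platt scaling), at the cost of noticeably slower training. A sketch:

from sklearn.svm import SVC

# probability=True fits an internal cross-validated calibrator; the resulting
# probabilities may rank points slightly differently than decision_function.
model = SVC(kernel='rbf', probability=True)
# model.fit(X_train, y_train); model.predict_proba(X_test)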
