Aim of The Experiment-Software Required - Theory

Experiment-

Aim of the Experiment- To implement the support vector machine (SVM) algorithm in MATLAB.
Software Required- MATLAB
Theory-
A support vector machine (SVM) is a powerful supervised learning algorithm that works
best on smaller datasets, including complex ones. SVMs can be used for both regression
and classification tasks, but they are generally most effective in classification problems.
They gained widespread popularity during the 1990s and remain a popular choice for
building high-performing models with some fine-tuning. Both SVM and logistic regression
attempt to find an optimal separating hyperplane; the main difference is that logistic
regression takes a probabilistic approach, while SVM takes a geometric one based on
maximizing the margin between the classes.

The objective of the SVM algorithm is to find a hyperplane in an N-dimensional feature
space that separates the data points into distinct classes. The dimension of the hyperplane
depends on the number of features: with two input features the hyperplane is a line, and
with three it is a 2-D plane. Beyond three features it becomes difficult to visualize.

Consider two independent variables, x1 and x2, and one dependent variable that is either
a blue circle or a red circle. Since there are only two input features, the hyperplane is a
line, and there are multiple lines that separate the data points into red and blue circles.
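For reference, the separating line described above can be written out as an equation. The weight symbols w1, w2 and the offset b are not used in the text and are introduced here only as assumed notation; the second line gives the same form for N features.

```latex
\[
  w_1 x_1 + w_2 x_2 + b = 0,
  \qquad
  \hat{y} = \operatorname{sign}\left(w_1 x_1 + w_2 x_2 + b\right)
\]
\[
  \mathbf{w}^{\top}\mathbf{x} + b = 0, \qquad \mathbf{x} \in \mathbb{R}^{N}
\]
```

Each choice of (w1, w2, b) gives a different candidate line; points on one side are assigned to one class and points on the other side to the other class.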
Working Principle of SVM-
A reasonable choice for the best hyperplane is the one that gives the largest separation,
or margin, between the two classes. SVM therefore chooses the hyperplane that maximizes
the distance to the nearest data point on each side. If such a hyperplane exists, it is
referred to as the maximum-margin hyperplane, and the margin is called a hard margin.
In the figure provided, L2 is selected as the hyperplane. Now, let us consider a scenario as
illustrated below.

In this scenario, there is a blue ball located on the boundary of the red balls. SVM
handles this case in a straightforward way: the blue ball on the boundary of the red balls
is treated as an outlier of the blue class. The SVM algorithm can disregard such outliers
and still find the hyperplane that maximizes the margin, which makes SVM fairly robust
to outliers.

For this type of data, SVM finds the maximum margin as it does for the previous
datasets, but it also adds a penalty each time a point crosses the margin. The margins in
such cases are referred to as soft margins. With a soft margin, SVM attempts to minimize
(1/margin + λ(∑penalty)), where λ weights the penalty term. Hinge loss is a frequently
used penalty in this context: if no violations occur, no hinge loss is incurred; if violations
do occur, the hinge loss is proportional to the distance of the violation.
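The soft-margin quantity above can be evaluated by hand on a few points. The following is a minimal sketch, not the solver that fitcsvm uses: the data, the hyperplane (w, b), the weight lambda, and the convention that the margin width equals 2/norm(w) are all assumptions made for illustration.

```matlab
% Toy 2-D data: rows are points, labels are +1 / -1 (hypothetical values).
X = [ 2.0  2.0;    % class +1, well outside the margin
      1.0  0.0;    % class +1, exactly on the margin
      0.5  0.5;    % class +1, inside the margin (a violation)
     -2.0 -1.0];   % class -1, well outside the margin
y = [1; 1; 1; -1];

% Hypothetical hyperplane w'*x + b = 0 and penalty weight lambda.
w = [1; 0];
b = 0;
lambda = 1;

% Signed score of each point; a value >= 1 means the margin is respected.
scores = y .* (X*w + b);

% Hinge loss: zero without a violation, growing linearly with its size.
hinge = max(0, 1 - scores);

% Objective in the form used in the text: 1/margin + lambda*sum(penalty),
% assuming the margin width is 2/norm(w).
margin = 2 / norm(w);
objective = 1/margin + lambda*sum(hinge);

disp(hinge.');
fprintf('objective = %.2f\n', objective);
```

Only the third point contributes to the penalty here (hinge loss 0.5), so the objective is 1/2 + 0.5 = 1.00. Training an SVM amounts to searching for the w and b that make this quantity as small as possible.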

Suppose the data are not linearly separable, as illustrated in the figure presented above.
To address this, SVM generates a new variable using a kernel. We take a point xi on the
line and create a new variable yi as a function of its distance from the origin, o. If we plot
this data, it appears as depicted below.

In this case, the new variable y is created as a function of distance from the origin. A non-
linear function that creates a new variable is referred to as a kernel.
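The idea of creating a new variable from the distance to the origin can be sketched on a small 1-D example. The points, the labels, the choice y = x^2, and the threshold below are assumptions for illustration. Note also that this is an explicit mapping; a kernel SVM obtains a comparable effect implicitly, without forming the new variable.

```matlab
% 1-D points on a line: inner points belong to class 1, outer points to class 2.
x = [-3 -2 -1 0 1 2 3].';
labels = [2 2 1 1 1 2 2].';          % hypothetical class labels

% New variable as a function of distance from the origin (assumed choice).
y_new = x.^2;

% No single cut on x separates the classes, but a cut on y_new
% anywhere between 1 and 4 does.
threshold = 2.5;
predicted = 1 + (y_new > threshold); % 1 below the threshold, 2 above it

% Columns: x, new variable, true label, predicted label.
disp([x y_new labels predicted]);
```

In the (x, y_new) plane the two classes lie on opposite sides of the horizontal line y_new = 2.5, which is a linear boundary in the new space.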
Advantages of SVM-

• Effective in high-dimensional cases.
• Memory efficient, as the decision function uses only a subset of the training points,
called support vectors.
• Different kernel functions can be specified for the decision function, and it is possible
to specify custom kernels.
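The memory-efficiency point can be checked by counting how many training points a fitted model keeps as support vectors. This is a minimal sketch assuming the Statistics and Machine Learning Toolbox is installed; the choice of the two petal features and the default linear kernel is only for illustration.

```matlab
load fisheriris                    % provides meas and species
X = meas(1:100, 3:4);              % setosa and versicolor, petal length and width
y = species(1:100);

Mdl = fitcsvm(X, y);               % linear kernel by default

nTrain = size(X, 1);
nSV = sum(Mdl.IsSupportVector);    % training points kept in the decision function
fprintf('%d of %d training points are support vectors\n', nSV, nTrain);
```

Only the rows flagged by IsSupportVector are needed to evaluate the decision function at prediction time.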

Coding-
clear; close all; clc;

%% preparing dataset

load fisheriris

species_num = grp2idx(species);
%%

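% binary classification: keep only setosa and versicolor (first 100 rows)
X = randn(100,10);              % 10 columns, initially all random noise
X(:,[1,3,5,7]) = meas(1:100,:); % real features go in columns 1, 3, 5, 7; the rest stay noise
y = species_num(1:100);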

rand_num = randperm(size(X,1));
X_train = X(rand_num(1:round(0.8*length(rand_num))),:);
y_train = y(rand_num(1:round(0.8*length(rand_num))),:);

X_test = X(rand_num(round(0.8*length(rand_num))+1:end),:);
y_test = y(rand_num(round(0.8*length(rand_num))+1:end),:);
%% CV partition

c = cvpartition(y_train,'KFold',5);
%% feature selection

opts = statset('display','iter');
% criterion: number of misclassified validation points for an RBF-kernel SVM
classf = @(train_data, train_labels, test_data, test_labels)...
    sum(predict(fitcsvm(train_data, train_labels,'KernelFunction','rbf'), ...
    test_data) ~= test_labels);

[fs, history] = sequentialfs(classf, X_train, y_train, 'cv', c, 'options', ...
    opts,'nfeatures',2);
%% Best hyperparameter

X_train_w_best_feature = X_train(:,fs);

Md1 = fitcsvm(X_train_w_best_feature,y_train,'KernelFunction','rbf',...
    'OptimizeHyperparameters','auto',...
    'HyperparameterOptimizationOptions',struct('AcquisitionFunctionName',...
    'expected-improvement-plus','ShowPlots',true)); % Bayesian optimization

%% Final test with test set


X_test_w_best_feature = X_test(:,fs);
test_accuracy_for_iter = sum(predict(Md1,X_test_w_best_feature) == y_test) ...
    /length(y_test)*100

%% hyperplane
figure;
hgscatter = gscatter(X_train_w_best_feature(:,1),X_train_w_best_feature(:,2),y_train);
hold on;
h_sv=plot(Md1.SupportVectors(:,1),Md1.SupportVectors(:,2),'ko','markersize',8);

% test set data

gscatter(X_test_w_best_feature(:,1),X_test_w_best_feature(:,2),y_test,'rb','xx')

% decision plane
XLIMs = get(gca,'xlim');
YLIMs = get(gca,'ylim');
[xi,yi] = meshgrid([XLIMs(1):0.01:XLIMs(2)],[YLIMs(1):0.01:YLIMs(2)]);
dd = [xi(:), yi(:)];
pred_mesh = predict(Md1, dd);
redcolor = [1, 0.8, 0.8];
bluecolor = [0.8, 0.8, 1];
pos = find(pred_mesh == 1);
h1 = plot(dd(pos,1), dd(pos,2),'s','color',redcolor,'Markersize',5,...
    'MarkerEdgeColor',redcolor,'MarkerFaceColor',redcolor);
pos = find(pred_mesh == 2);
h2 = plot(dd(pos,1), dd(pos,2),'s','color',bluecolor,'Markersize',5,...
    'MarkerEdgeColor',bluecolor,'MarkerFaceColor',bluecolor);
uistack(h1,'bottom');
uistack(h2,'bottom');
legend([hgscatter;h_sv],{'setosa','versicolor','support vectors'})
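A single accuracy figure hides which class the errors fall in, so a confusion matrix can be a useful companion to the final test step. The following is a self-contained sketch rather than part of the listing above: it assumes the Statistics and Machine Learning Toolbox, uses all four iris features, and picks a hold-out split, seed, and 'KernelScale' setting purely for illustration.

```matlab
load fisheriris
X = meas(1:100, :);                    % setosa and versicolor only
y = grp2idx(species(1:100));           % 1 = setosa, 2 = versicolor

rng(1);                                % fixed seed so the split is repeatable
cv = cvpartition(y, 'HoldOut', 0.2);   % stratified 80/20 split
Mdl = fitcsvm(X(training(cv),:), y(training(cv)), ...
    'KernelFunction', 'rbf', 'KernelScale', 'auto');

y_pred = predict(Mdl, X(test(cv),:));
C = confusionmat(y(test(cv)), y_pred)  % rows: true class, columns: predicted class
accuracy = 100 * sum(diag(C)) / sum(C(:))
```

The diagonal of C counts correct predictions per class, and the off-diagonal entries show which class was mistaken for which.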

Output-

Conclusion- The support vector machine (SVM) is a powerful machine learning algorithm
that has proven to be effective in a variety of classification and regression tasks. Through
our experimentation, we have demonstrated that SVMs are capable of achieving high levels
of accuracy, even when working with complex and high-dimensional datasets. Our results
indicate that SVMs are particularly effective when dealing with non-linearly separable
data, as they are able to use kernel functions to transform the data into a
higher-dimensional space where linear separation is possible. We have also observed that
SVMs are less prone to overfitting than many other machine learning algorithms, making
them a robust choice for real-world applications.

One potential limitation of SVMs is that they can be computationally expensive,
particularly when working with large datasets. However, this can be mitigated through
the use of optimization techniques such as stochastic gradient descent.

Submitted by- Sourav Pal


Regd No.- 2205070004
Course: M.Tech
Semester- 2nd
Branch: ETC (CSE)
