Pattern Recog - Manual
Name :
Register No. :
Day / Session :
Venue :
Title of Experiment :
Date of Conduction :
Date of Submission :
REPORT VERIFICATION
Date :
Staff Name :
Signature :
1. DIGITIZATION OF ANALOG SIGNAL
1.1 Objective
To convert the analog signal into a digital signal
1.2 Tasks
I. Generate a continuous time signal
II. Convert the continuous time signal into a discrete time signal by sampling
III. Convert the discrete time signal into a digital signal using the rounding and truncation processes of quantization.
1.3 Theory
A signal is defined as any physical quantity that varies with time, space, or any other
independent variable or variables. Digitization of the signal is shown in the following figure.
The sampler converts the analog signal into a discrete time signal. After sampling, the
amplitude values are infinite between the two limits. The quantizer is used to map the infinite
amplitude values onto a finite set of known values. The encoder converts each quantized value into a
digital signal (a sequence of 0s and 1s).
Sampling
In sampling, the analog signal is sampled every Ts seconds; Ts is referred to as the sampling interval,
and fs = 1/Ts is called the sampling rate or sampling frequency.
Sampling Theorem: According to the Nyquist theorem, the sampling rate must be at least 2
times the highest frequency contained in the signal: fs >= 2 fmax.
Quantizer Sampling results in a series of pulses of varying amplitude values ranging between
two limits: a min and a max. The amplitude values are infinite between the two limits. We
need to map the infinite amplitude values onto a finite set of known values.
Assume we have a voltage signal with amplitudes Vmin = -20 V and Vmax = +20 V. We want to
use 3 bits (b), so the number of levels is L = 2^3 = 8 quantization levels. The discretized
value (d) is converted to a quantized value (q) as follows:
q1 = (Vmax - Vmin)/(L - 1)   (the levels run from 0 to L-1)
q = (d - Vmin)/q1
The value of q is then converted to an integer level nq in one of two ways:
(i) Rounding
(ii) Truncation
Rounding: convert the value of q into the nearest integer (nq) by rounding. Truncation: convert q into an integer (nq) by discarding its fractional part.
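As a worked check of these formulas (the sample value d = 7.3 V is assumed here only for illustration): q1 = (20 - (-20))/(8 - 1) = 40/7 ≈ 5.71 V, so q = (7.3 + 20)/5.71 ≈ 4.78; rounding gives the level nq = 5 (binary code 101) while truncation gives nq = 4 (binary code 100).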
1.4 Algorithm
Step-1:
Step-2:
1.5 Programme
// Experiment number-1: Digitization of an analog signal
clc ;
close ;
clear ;
//Part-1 --Continuous time signal (a cosine wave is assumed)
F=100;                // signal frequency in Hz (assumed value)
A=2;                  // peak amplitude (assumed value)
t=0:0.0001:0.5;       // fine time grid approximating continuous time
y=A*cos(2*%pi*F*t);
subplot(2,1,1);
plot (t, y );
xlabel ( "Time" ) ;
ylabel ( "Amplitude" ) ;
//Part-2 --Sampling
// x(n)=x(Ts . n)
Fs=600;               // sampling frequency in Hz (Fs >= 2*F)
Ts=1/Fs;
t1=0:Ts:0.5;
n=1:length(t1);
y1=A*cos(2*%pi*F*t1); // discrete time signal
subplot(2,1,2);
plot2d3 ( y1 );
xlabel ( "Sample number" ) ;
ylabel ( "Amplitude" ) ;
// Part-3 ---Quantization
Amax=max(y1);
Amin=min(y1);
b=3;                  // number of bits (as in the theory section)
L=2^b;                // number of quantization levels
quant1=(Amax-Amin)/(L-1);
quant=(y1-Amin)/quant1;
// rounding
y2=round(quant);
// truncation
y3=floor(quant);
// Part-4 ---Encoding into binary codes
out1=dec2bin(y2,b);   // codes obtained with rounding
out2=dec2bin(y3,b);   // codes obtained with truncation
out3=[out1;out2];
disp(out3);
1.6 Results
Pre-lab Answers
1.
2.
3.
4.
2. Sample the generated cosine wave with a sampling frequency of 600 Hz. Plot the sampled
signal.
3. Sample the generated cosine wave with a sampling frequency of 300 Hz. Plot the sampled
signal. Point out the difference between this plot and the plot of question 2.
4. Generate the binary code using the rounding operation for number of bits = 4. What is the
binary value at sample number 10?
5. Generate the binary code using the truncation operation for number of bits = 4. What is the
binary value at sample number 10?
Post-lab Answers
1.
2.
3.
4.
5.
1.10 Conclusion
2. PROGRAM TO COUNT THE WHITE PIXELS FROM THE IMAGE
2.1 Objective
Program to count the white pixels from the image
2.2 Tasks
I. Read the image file.
II. Convert the image into a gray scale image.
III. Count the number of white pixels and black pixels in the image.
2.3 Theory
Basic about Pixels of image
An image consists of a rectangular array of dots called pixels. The size of the image is
usually specified as width X height, in numbers of pixels. The physical size of the image, in
inches or centimeters or whatever, depends on the resolution of the device on which the
image is displayed. Resolution is usually measured in terms of DPI, which stands for dots
per inch. An image will appear smaller (and generally sharper) on a device with a higher
resolution than on one with a lower resolution.
For a white pixel the pixel value is 255; for a black pixel the pixel value is 0. There are 3 methods
that can be used to detect white pixels.
Method-1: Work directly on the colour channels.
Count the pixels for which all 3 channels have the value 255; this count gives the number of
white pixels.
Method-2:
Convert the image into a gray scale image: b11=rgb2gray(A) converts an RGB image to gray
scale. After that, simply count the number of pixels whose value is 255 (white pixels) or 0
(black pixels).
Method-3:
Convert the gray scale image to the range [0, 1] using a=im2double(b11). In the resulting
image a pixel value of 1 is white and a pixel value of 0 is black.
2.4 Algorithm
Step-1:
Step-2:
2.5 Programme
//Experiment number-2
//Program to count the white pixels from the image
clc
clear;
//atomsInstall("IPCV")
x=imread("cameraman.jpg");
figure; imshow(x)
// Method-1: a pixel is white when all 3 channels equal 255
whitePixels=bool2s((x(:,:,1)==255)&(x(:,:,2)==255)&(x(:,:,3)==255));
count = sum(whitePixels(:));
// Method-2: convert to gray scale, then count values 255 (white) and 0 (black)
b11=rgb2gray(x);
figure;
imshow(b11)
whitePixels=bool2s(b11==255);
blackPixels=bool2s(b11==0);
count1 = sum(whitePixels(:));
count2= sum(blackPixels(:));
//Method-3: rescale to [0,1]; a value of 1 is white and 0 is black
a=im2double(b11);
// im2double rescales the output from integer data types to the range [0, 1].
count3=sum(bool2s(a(:)==1));
count4=sum(bool2s(a(:)==0));
disp("Number of white pixels (method 1)", count);
disp("Number of white pixels (method 2)", count1);
disp("Number of black pixels (method 2)", count2);
disp("Number of white pixels (method 3)", count3);
disp("Number of black pixels (method 3)", count4);
2.6 Results
Pre-lab Answers
1.
2.
3.
4.
3.Q. Convert the image into gray scale and display it.
4.Q. Use all the 3 methods to calculate the number of white pixels and black pixels, and display
the counts.
Post-lab Answers
1.
2.
3.
4.
2.10 Conclusion
3. ANALYSIS OF DATA SET WITH CLASSIFIERS
3.1 Objective
To analyse a data set with classifiers
3.2 Tasks
I. Create datasets.
II. Create datasets and assign label to datasets
III. Compute conditional probability density function using Gaussian probabilistic density
function
IV. Compute posterior probability
V. Design Bayes classifier
VI. Computation of accuracy and classification error using Bayes classifier
3.3 Theory
Some basic notations
Bayes Classifier
Let f_i(X) denote the class-conditional probability density of class i and p_i its prior probability.
Bayes theorem states that q_i(X) = f_i(X) p_i / Z, where Z = f_0(X) p_0 + f_1(X) p_1 is the normalizing
constant.
Consider the classifier given by
h(X) = 0 if q_0(X)/q_1(X) > 1, and 1 otherwise.
This is called the binary Bayes classifier.
The Bayes classifier is optimal in the sense that it minimizes the probability of error.
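As a small numerical illustration (assumed values): if f_0(X) p_0 = 0.06 and f_1(X) p_1 = 0.02, then Z = 0.08, q_0(X) = 0.75 and q_1(X) = 0.25; since q_0(X)/q_1(X) = 3 > 1, the classifier outputs h(X) = 0.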
3.5 Programme
Programme-1: Create datasets
clc;
//
//% Generate and plot the first dataset (case #1)
rand('seed',0);
m=[0 0]';
S=[1 0;0 1];
N=500;
X= grand(N, "mn", m, S);
figure(1),
plot(X(1,:),X(2,:),'.');
//
//% Generate and plot the fifth dataset (case #5)
m=[0 0]';
S=[2 0;0 0.2];
N=500;
X= grand(N, "mn", m, S)
figure(5),
plot(X(1,:),X(2,:),'.');
rand('seed',0);
// Generate the dataset X1 as well as the vector containing the class labels of
// the points in X1
N=[100 100]; // 100 vectors per class
l=2; // Dimensionality of the input space
x=[3 3]';
//x=[2 2]'; //for X2
// x=[0 2]'; for X3
// x=[1 1]'; for X4
X1=[2*rand(l,N(1)) 2*rand(l,N(2))+x*ones(1,N(2))];
X1=[X1; ones(1,sum(N))];
y1=[-ones(1,N(1)) ones(1,N(2))];
// 1. Plot X1, where points of different classes are denoted by different colors,
figure(1),
plot(X1(1,y1==1),X1(2,y1==1),'bo',X1(1,y1==-1),X1(2,y1==-1),'r.')
function [z]=comp_gauss_dens_val(m, S, x)
//% FUNCTION
//% [z]=comp_gauss_dens_val(m,S,x)
//% Computes the value of a Gaussian distribution, N(m,S), at a specific point
//%
//% INPUT ARGUMENTS:
//% m: l-dimensional column vector corresponding to the mean vector of the
//% gaussian distribution.
//% S: lxl matrix that corresponds to the covariance matrix of the
//% gaussian distribution.
//% x: l-dimensional column vector where the value of the gaussian
//% distribution will be evaluated.
//%
//% OUTPUT ARGUMENTS:
//% z: the value of the gaussian distribution at x.
//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[l,c]=size(m);
z=(1/( (2*%pi)^(l/2)*det(S)^0.5) )*exp(-0.5*(x-m)'*inv(S)*(x-m));
end
//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
// FUNCTION
// [z]=bayes_classifier(m,S,P,X)
// Bayesian classification rule for c classes, modeled by Gaussian
//% distributions (also used in Chapter 2).
//%
//% INPUT ARGUMENTS:
//% m: lxc matrix, whose j-th column is the mean of the j-th class.
//% S: lxlxc matrix, where S(:,:,j) corresponds to
//% the covariance matrix of the normal distribution of the j-th
//% class.
//% P: c-dimensional vector, whose j-th component is the a priori
//% probability of the j-th class.
//% X: lxN matrix, whose columns are the data vectors to be
//% classified.
//%
//% OUTPUT ARGUMENTS:
//% z: N-dimensional vector, whose i-th element is the label
//% of the class where the i-th data vector is classified.
function [z]=bayes_classifier(m, S, P, X)
[l,c]=size(m);
[l,N]=size(X);
for i=1:N
for j=1:c
t(j)=P(j)*comp_gauss_dens_val(m(:,j),S(:,:,j),X(:,i));
end
[num,z(i)]=max(t);
end
end
x=[3 3]';
//x=[2 2]'; //for X2
// x=[0 2]'; for X3
// x=[1 1]'; for X4
X1=[2*rand(l,N(1)) 2*rand(l,N(2))+x*ones(1,N(2))];
X1=[X1; ones(1,sum(N))];
y1=[-ones(1,N(1)) ones(1,N(2))];
// 1. Plot X1, where points of different classes are denoted by different colors,
figure(1),
plot(X1(1,y1==1),X1(2,y1==1),'bo',X1(1,y1==-1),X1(2,y1==-1),'r.')
m=[0 0 0; 1 2 2; 3 3 4]';
S1=0.8*eye(3);
S(:,:,1)=S1;S(:,:,2)=S1;S(:,:,3)=S1;
P=[1/3 1/3 1/3]';
z_bayesian=bayes_classifier(m,S,P,X1);
for i=1:length(z_bayesian)
if (z_bayesian(i)==3)
z11(i)=1;
else
z11(i)=-1;
end
end
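The following lines are a minimal sketch (not part of the original listing) of task VI: the accuracy and classification error are obtained by comparing the Bayes-classifier labels z11 with the true labels y1.
// Accuracy and classification error of the Bayes classifier (assumed sketch)
correct=length(find(z11'==y1)); // z11 is a column of predicted labels, y1 the row of true labels
accuracy=correct/length(y1);
class_error=1-accuracy;
disp("Accuracy of the Bayes classifier", accuracy);
disp("Classification error", class_error);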
3.6 Results
Pre-lab Answers
1.
2.
3.
Post-lab Answers
1.
2.
3.
4.
3.10 Conclusion
4. PROGRAMS ON ESTIMATION
4.1 Objective
To estimate the parameters of a normal distribution function
4.2 Tasks
I. Estimation of parameters using maximum likelihood method
4.3 Theory
Parametric Method
One of the most straightforward approaches is to model the conditional probability density p(x)
in terms of a specific functional form which contains a number of adjustable parameters. The
values of the parameters can then be optimized to give the best fit to the data. The simplest and
most widely used parametric model is the normal or Gaussian distribution, which has a
number of convenient analytical and statistical properties.
The normal density function for the case of a single variable can be written in the form
p(x) = (1/(2π σ²)^(1/2)) exp{ -(x - μ)² / (2σ²) }        (1)
where μ = mean and σ² = variance; the square root of the variance is called the standard deviation.
The density integrates to one: ∫ p(x) dx = 1 (integral from -∞ to ∞).
The mean and variance of the one-dimensional distribution satisfy
μ = E[x] = ∫ x p(x) dx
σ² = E[(x - μ)²] = ∫ (x - μ)² p(x) dx
For d-dimensional data the multivariate normal density is
p(X) = (1/((2π)^(d/2) |S|^(1/2))) exp{ -(1/2)(x - μ)ᵀ S⁻¹ (x - μ) }        (2)
with ∫ p(X) dX = 1, mean vector μ = E[x] and covariance matrix S = E[(x - μ)(x - μ)ᵀ].
The quantity Δ² = (x - μ)ᵀ S⁻¹ (x - μ) is called the Mahalanobis distance from x to μ.
If the covariance matrix is diagonal, (S)ij = δij σj², the density factorizes as
p(X) = Π (i = 1 to d) p(xi)
Further simplification can be obtained by choosing
σj = σ
For simplicity, we use the normal density function as the class-conditional density. Suppose we
consider a conditional density function p(X) which depends on a set of parameters
θ = (θ1, ..., θM)ᵀ. In a classification problem we would take one such function for each class;
here we omit the class labels for simplicity, but essentially the same steps are performed
separately for each class in the problem. To make the dependence on the parameters explicit,
the density function is written in the form p(X|θ). We also have a dataset of N vectors,
χ = {X1, ..., XN}. If these vectors are drawn independently from the distribution p(X|θ), then
the joint probability density of the whole dataset χ is given by
p(χ|θ) = Π (n = 1 to N) p(Xn|θ) = L(θ)
where L(θ) can be viewed as a function of θ for fixed χ, in which case it is referred to as the
likelihood of θ for the given χ. The technique of maximum likelihood then sets the value of
θ by maximizing L(θ): the idea is to choose the θ which is most likely to give rise to the
observed data.
In practice it is more convenient to minimize the negative log-likelihood
E = -ln L(θ)        (3)
Some straightforward but rather involved matrix algebra then leads to the following results:
μ̂ = (1/N) Σ (n = 1 to N) Xn        (4)
Ŝ = (1/N) Σ (n = 1 to N) (Xn - μ̂)(Xn - μ̂)ᵀ        (5)
Equation (4) shows that the maximum likelihood estimate μ̂ of the mean vector μ is given by the
sample average (i.e. the average with respect to the given dataset). Similarly, the maximum
likelihood estimate Ŝ of the covariance matrix S is given by equation (5).
For maximum posterior estimation a prior probability over the parameters is assumed and the
same procedure as in the likelihood method is continued: the posterior probability is evaluated
using Bayes theorem by combining the Gaussian likelihood with the prior. The parameter value
that best fits this posterior (i.e. maximizes it) is taken as the estimated parameter.
4.4 Algorithm
Step-1:
Step-2:
4.5 Programme
Programme-1: Estimation of parameters using Maximum likelihood estimation
function [m_hat, S_hat]=Gaussian_ML_estimate(X)
[l,N]=size(X);
m_hat=(1/N)*sum(X')';
S_hat=zeros(l);
for k=1:N
S_hat=S_hat+(X(:,k)-m_hat)*(X(:,k)-m_hat)';
end
S_hat=(1/N)*S_hat;
end
clc;
// Generate dataset X
rand('seed',0);
m = [2 -2]'; //mean
S= [0.9 0.2; 0.2 .3]; // covariance
X=grand(50, "mn", m,S);
figure;
plot(X(1,:), X(2,:),'.r');
title ("Dataset")
// Compute the ML estimates of m and S
[m_hat_1, S_hat_1]=Gaussian_ML_estimate(X);
disp( "Estimate mean by maximum likelihood estimation", m_hat_1);
disp( "Estimate mean by maximum likelihood estimation",S_hat_1);
//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
//% FUNCTION
//% [X,y]=generate_gauss_classes(m,S,P,N)
//% Generates data vectors X from c Gaussian classes with means m(:,j),
//% covariances S(:,:,j) and a priori probabilities P, plus the class labels y.
function [X, y]=generate_gauss_classes(m, S, P, N)
[l,c]=size(m);
X=[];
y=[];
for j=1:c
//% Generating the [p(j)*N] vectors from each distribution
// t=mvnrnd(m(:,j),S(:,:,j),fix(P(j)*N))';
t=grand(fix(P(j)*N),"mn",m(:,j),S(:,:,j));
// % The total number of data vectors may be slightly less than N due to
// % the fix operator
X=[X t];
y=[y ones(1,fix(P(j)*N))*j];
end
end
clc;
//To generate X, utilize the function generate_gauss_classes
//m=[0 0 0; 1 2 2; 3 3 4]';
m=[1 3 4];      // 1-D data: class means 1, 3 and 4
S1=0.8;         // common variance (use 0.8*eye(3) with the commented 3-D means above)
S(:,:,1)=S1;S(:,:,2)=S1;S(:,:,3)=S1;
P=[1/3 1/3 1/3]';
N=1000;
rand('seed',0);
[X,y]=generate_gauss_classes(m,S,P,N);
disp( "Estimate mean by maximum likelihood estimation for Gaussian class", m_hat);
function [z]=gauss(x, m, s)
//
//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
//% FUNCTION (auxiliary)
//% [z]=gauss(x,m,s)
//% Takes as input the mean values and the variances of a number of Gaussian
//% distributions and a vector x and computes the value of each
//% Gaussian at x.
//%
//% NOTE: It is assumed that the covariance matrices of the gaussian
//% distributions are diagonal with equal diagonal elements, i.e. it has the
//% form sigma^2*I, where I is the identity matrix.
//%
//% INPUT ARGUMENTS:
//% x: l-dimensional row vector, on which the values of the J
//% gaussian distributions will be calculated
//% m: Jxl matrix, whose j-th row corresponds to the
//% mean of the j-th gaussian distribution
//% s: J-dimensional row vector whose j-th component corresponds to
//% the variance for the j-th gaussian distribution (it is assumed
//% that the covariance matrices of the distributions are of the
//% form sigma^2*I, where I is the lxl identity matrix)
//%
//% OUTPUT ARGUMENTS:
//% z: J-dimensional vector whose j-th component is the value of the
//% j-th gaussian distribution at x.
//%
//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[J,l]=size(m);
[p,l]=size(x);
z=[];
for j=1:J
t=(x-m(j,:))*(x-m(j,:))';
c=1/(2*%pi*s(j))^(l/2);
z=[z c*exp(-t/(2*s(j)))];
end
end
//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
//% FUNCTION
//% [m,s,Pa,iter,Q_tot,e_tot]=em_alg_function(x,m,s,Pa,e_min)
//% EM algorithm for estimating the parameters of a mixture of normal
//% distributions, with diagonal covariance matrices.
//% WARNING: IT ONLY RUNS FOR THE CASE WHERE THE COVARIANCE MATRICES
//% ARE OF THE FORM sigma^2*I. IN ADDITION, IF sigma_i^2=0 FOR SOME
//% DISTRIBUTION AT AN ITERATION, IT IS ARBITRARILY SET EQUAL TO 0.001.
//%
//% INPUT ARGUMENTS:
//% x: lxN matrix, each column of which is a feature vector.
//% m: lxJ matrix, whos j-th column is the initial
//% estimate for the mean of the j-th distribution.
//% s: 1xJ vector, whose j-th element is the variance
//% for the j-th distribution.
//% Pa: J-dimensional vector, whose j-th element is the initial
//% estimate of the a priori probability of the j-th distribution.
//% e_min: threshold used in the termination condition of the EM
//% algorithm.
//%
//% OUTPUT ARGUMENTS:
//% m: it has the same structure with input argument m and contains
//% the final estimates of the means of the normal distributions.
//% s: it has the same structure with input argument s and contains
//% the final estimates of the variances of the normal
//% distributions.
//% Pa: J-dimensional vector, whose j-th element is the final estimate
//% of the a priori probability of the j-th distribution.
//% iter: the number of iterations required for the convergence of the
//% EM algorithm.
//% Q_tot: vector containing the likelihood value at each iteration.
//% e_tot: vector containing the error value at each itertion.
//%
//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [m, s, Pa, iter, Q_tot, e_tot]=em_alg_function(x, m, s, Pa, e_min)
x=x'; // rows of x are now the data points
m=m'; // rows of m are now the component means
[p,n]=size(x);
[J,n]=size(m);
e=e_min+1;
Q_tot=[];
e_tot=[];
iter=0;
//while (e>e_min)
while (iter<20)
iter=iter+1;
P_old=Pa;
m_old=m;
s_old=s;
// E-step: determine P(j|x_k) for every point x_k
for k=1:p
temp=gauss(x(k,:),m,s);
P_tot=temp*Pa';
for j=1:J
P(j,k)=temp(j)*Pa(j)/P_tot;
end
end
// M-step: determine the means
for j=1:J
a=zeros(1,n);
for k=1:p
a=a+P(j,k)*x(k,:);
end
la=sum(P(j,:));
m(j,:)=a/la;
end
// M-step: determine the variances
for j=1:J
b=0;
for k=1:p
b=b+P(j,k)*((x(k,:)-m(j,:))*(x(k,:)-m(j,:))');
end
s(j)=b/(n*sum(P(j,:)));
if(s(j)<10^(-10))
s(j)=0.001;
end
end
// % Determine the a priori probabilities
for j=1:J
a=0;
for k=1:p
a=a+P(j,k);
end
Pa(j)=a/p;
end
ru1=isnan(m);
m(ru1)=0;
ru1=isnan(s);
s(ru1)=0;
e=sum(abs(Pa-P_old))+sum(sum(abs(m-m_old)))+sum(abs(s-s_old));
e_tot=[e_tot e];
end
end
clc;
// Generate dataset X (2-dimensional, 50 points)
rand('seed',0);
m = [2 -2]'; //mean
S= [0.2 0; 0 0.2]; // covariance
X=grand(50, "mn", m,S);
// Initial estimates for the EM algorithm (assumed 2-D initial means;
// the original listing only gave the scalar values 2 and -1)
m1_ini=[2; 2]; m2_ini=[-1; -1];
m_ini=[m1_ini m2_ini];
s_ini=[0.1 0.1]; // 1xJ vector of initial variances (covariances of the form sigma^2*I)
Pa_ini=[1/2 1/2];
e_min=10^(-5);
[m_hat,s_hat,Pa,iter,Q_tot,e_tot]=em_alg_function(X,m_ini,s_ini,Pa_ini,e_min);
disp("Estimate mean by maximum posterior estimation",m_hat);
disp("Estimate variance by maximum posterior estimation",s_hat);
4.6 Results
6. Write the formula for estimation of mean and covariance in maximum likelihood method
for one dimensional data.
Pre-lab Answers
1.
2.
3.
5. Generate a dataset with a normal distribution function having mean [-1 1] and
covariance [0.8 0.2; 0.2 0.4].
7. Generate a dataset with a normal distribution function having mean [2 3] and
covariance [0.6 0; 0 0.6].
8. Estimate the mean and covariance by using maximum posterior estimation for the dataset
generated in post-lab question (4).
Post-lab Answers
1.
2.
3.
4.
4.10 Conclusion
5. LOADING A DATA SET AND SELECTING PREDICTIVE FEATURES
5.1 Objective
To load a data set and select predictive features
5.2 Tasks
I. Load the datasets
II. Interpret the datasets
III. Extract the general features from the dataset
IV. Histogram plot of dataset
V. Estimation of best feature selection using correlation method
5.3 Theory
Pattern recognition is formally defined as the process whereby a received pattern/signal is
assigned to one of a prescribed number of classes. If the groups are identified using known
class labels, the task is known as supervised classification; otherwise the classification is known
as unsupervised classification.
Features depend on the problem: they are measurements of 'relevant' quantities. Some techniques
are available to extract 'more relevant' quantities from the initial measurements (e.g. PCA, principal
component analysis). After feature extraction each pattern is a vector. A classifier is a function
that maps such a vector onto a class label. Many general techniques for classifier design are
available, and the final system needs to be tested and validated.
Using all of the features may not give good performance. For example, if you read the
entire content of a textbook for a test, you may not score a very good mark, because effort is
wasted studying unwanted material. You can assist your algorithm by feeding in only those features
that are really important for the predictive model. So you have to choose the best combination of
possible features for training the machine learning algorithm. This process is known as
feature selection.
Main reasons to use feature selection are:
It enables the machine learning algorithm to train faster.
It reduces the complexity of a model and makes it easier to interpret.
It improves the accuracy of a model if the right subset is chosen.
It reduces overfitting.
There are three methods for feature selection are available
Filter Methods
Wrapper Methods
Embedded Methods
Filter method
In this method the selection of features is independent of any machine learning algorithms.
The features are selected on the basis of their scores in various statistical tests for their
correlation with the outcome variable.
Pearson’s Correlation:
It quantifies linear dependence between two continuous variables X and Y. It varies from -1
to +1.
Pearson's correlation is given as: ρ(X, Y) = cov(X, Y) / (σX σY)
In this experiment Pearson’s correlation is used to select the best features.
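As a small illustration (an assumed sketch, not taken from the manual's listings), Pearson's correlation between two vectors can be computed in Scilab directly from this definition:
x=[1 2 3 4 5]; // assumed example data
y=[2 4 5 4 6];
rho=sum((x-mean(x)).*(y-mean(y)))/(sqrt(sum((x-mean(x)).^2))*sqrt(sum((y-mean(y)).^2)));
disp("Pearson correlation coefficient", rho);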
Database used:
There are 3 databases used.
(i) diabetes database,
(ii) lung cancer database.
(iii) Shape database
The first two databases are one-dimensional (tabular) databases; the shape database is a
two-dimensional (image) database. For the two-dimensional database an Excel sheet has been
created to store the names of the files present in the database.
Diabetes database:
It consists of 9 columns (9 feature vectors). The description is as follows
Pregnancies: Number of times pregnant
Glucose: Plasma glucose concentration a 2 hours in an oral glucose tolerance test
BloodPressure: Diastolic blood pressure (mm Hg)
SkinThickness: Triceps skin fold thickness (mm)
Insulin: 2-Hour serum insulin (mu U/ml)
BMI: Body mass index (weight in kg/(height in m)^2)
DiabetesPedigreeFunction: Diabetes pedigree function
Age: Age (years)
Outcome: Class variable (0 or 1)
5.4 Algorithm
Step-1:
Step-2:
5.5 Programme
Programme-1: Load the diabetes dataset and selection of best predictive feature using 1-D
database
//Experiment number-5
//Loading the dataset and selection of best features using 1-D data
clc
clear ;
// load the database
a=read_csv('diabetes.csv'); // It converts all the content to string
b=csvRead('diabetes.csv'); // It only shows numerical value
[m,n]=size(b); // number of rows (records) and columns (features)
count=zeros(1,n);
for i=1:n
c=b(:,i);
for j=1:m
if ~isnan(c(j))
count(1,i)=count(1,i)+1;
end
end
loc1=find(~isnan(c))
d=c(loc1)
mean1(1,i)=mean(d);
std1(1,i)=stdev(d);
min1(1,i)=min(d);
max1(1,i)=max(d);
e(:,i)=gsort(d,'g','i');
ad1=round(length(d)*0.25);
ad2=round(length(d)*0.5);
ad3=round(length(d)*0.75);
f25(1,i)=e(ad1,i)
f50(1,i)=e(ad2,i)
f75(1,i)=e(ad3,i)
end
features=[count; mean1; std1; min1; max1; f25; f50; f75];
disp(features);
figure(1);
histplot(20,b(2:m,1), style=2)
title("pregnencies histogram plot")
figure(2);
histplot(20,b(2:m,2),style=2)
title("Glucose histogram plot")
figure(3);
histplot(20,b(2:m,3),style=2)
title("BloodPressure histogram plot")
figure(4);
histplot(20,b(2:m,4),style=2)
title("SkinThickness histogram plot")
figure(5);
histplot(20,b(2:m,5),style=2)
title("Insulin histogram plot")
figure(6);
histplot(20,b(2:m,6),style=2)
title("BMI histogram plot")
figure(7);
histplot(20,b(2:m,7),style=2)
title("DiabetesPedigreeFunction histogram plot")
figure(8);
histplot(20,b(2:m,8),style=2)
title("Age histogram plot")
figure(9);
histplot(20,b(2:m,9),style=2)
title("Outcome histogram plot")
best_features=[bestfr' bestfc'];
disp(best_features);
Programme-2: Load the diabetes dataset and selection of best predictive feature using 2-D
database
//Experiment number-5
//Loading the dataset and selection of best features using 2-D data
clc;
clear
[fd,SST,Sheetnames,Sheetpos]=xls_open('imagedataset.xls');
for l=1:30
a11=SST(l)
a22=strcat([a11,".png"])
a=imread(a22);
b=rgb2gray(a);
// figure
//imshow (b);
a33=im2double(b);
mean1(l,1)=mean(a33);
std1(l,1)=stdev(a33);
min1(l,1)=min(a33);
max1(l,1)=max(a33);
// h11(l,1)=histc(a33);
end
features=[mean1,std1,min1,max1];
disp("The features are", features)
best_features=[bestfr' bestfc'];
disp(best_features);
5.6 Results
Pre-lab Answers
1.
2.
3.
Post-lab Answers
1.
2.
3.
4.
5.10 Conclusion
6.1 Objective
To write a program on clustering the dataset
6.2 Tasks
I. Create datasets.
III. Apply the K-means clustering algorithm to cluster the dataset into k clusters
6.3 Theory
Clustering is a technique for finding similarity groups in data, called clusters. I.e., it groups
data instances that are similar to (near) each other in one cluster and data instances that are
very different (far away) from each other into different clusters. Clustering is often called an
unsupervised learning task as no class values denoting an a priori grouping of the data
instances are given, which is the case in supervised learning. Due to historical reasons,
clustering is often considered synonymous with unsupervised learning. In fact, association
rule mining is also unsupervised.
K-means clustering: K-means is a partitional clustering algorithm. Let the set of data points
(or instances) D be {x1, x2, ..., xn},
where xi = (xi1, xi2, ..., xir) is a vector in a real-valued space X ⊆ R^r, and r is the number of
attributes (dimensions) in the data.
For example: For diabetic database, It consists of 9 columns (9 feature vectors).(Pregnancies,
Glucose, BloodPressure, Skin Thickness, Insulin, BMI, DiabetesPedigreeFunction, Age,
Outcome) and 763 rows (patients)
The k-means algorithm partitions the given data into k clusters. Each cluster has a cluster
center, called centroid. k is specified by the user.
Step-1: Randomly choose k data points (seeds) to be the initial centroids (cluster centres).
Step-2: Assign each data point to the closest centroid (the closest centroid is found by
calculating distances).
Step-3: Re-compute each centroid as the mean of the points assigned to it, and repeat from
Step-2 until the assignments no longer change.
Here Cj is the jth cluster, mj is the centroid of cluster Cj (the mean vector of all the data points in
Cj), and dist(x, mj) is the distance between data point x and centroid mj.
6.4 Algorithm
Step-1:
Step-2:
6.5 Program
Programme-1: k-mean clustering algorithm
//%% K-means
function [CENTS, DAL]=km_fun(F, K, KMI)
CENTS = F( ceil(rand(K,1)*size(F,1)) ,:); // Cluster Centers
DAL = zeros(size(F,1),K+2); //Distances and Labels
for n = 1:KMI
for i = 1:size(F,1)
for j = 1:K
DAL(i,j) = norm(F(i,:) - CENTS(j,:));
end
[Distance, CN] = min(DAL(i,1:K)); // columns 1:K are the distances from the cluster centers
DAL(i,K+1) = CN; // K+1 is Cluster Label
DAL(i,K+2) = Distance; // K+2 is Minimum Distance
end
for i = 1:K
A = (DAL(:,K+1) == i); // points assigned to cluster i
CENTS(i,:) = mean(F(A,:), 'r'); // new cluster center (column-wise mean)
if or(isnan(CENTS(:))) // if CENTS(i,:) is NaN then replace it with a random point
NC = find(isnan(CENTS(:,1)) == 1); // find NaN centers
for Ind = 1:length(NC)
CENTS(NC(Ind),:) = F(ceil(rand()*size(F,1)),:); // pick a random data point (Scilab has no randi)
end
end
end
end
end
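// Assumed driver (not given in the manual listing): generate a 2-D dataset,
// run km_fun and collect the variables used by the plotting code below.
number_of_clusters = 4; // k, chosen for this sketch
KMI = 50; // number of k-means iterations
feature_vector = grand(500, "mn", [0;0], [4 0;0 4])'; // 500 two-dimensional points (assumed data)
[cluster_centers, data] = km_fun(feature_vector, number_of_clusters, KMI);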
figure;
PT = feature_vector(data(:,number_of_clusters+1) == 1, :);
plot(PT(:, 1),PT(:, 2),'bo', 'LineWidth', 2);
set(gca(),"auto_clear","off")
PT = feature_vector(data(:,number_of_clusters+1) == 2, :);
plot(PT(:, 1),PT(:, 2),'ro', 'LineWidth', 2);
set(gca(),"auto_clear","off")
PT = feature_vector(data(:,number_of_clusters+1) == 3, :);
plot(PT(:, 1),PT(:, 2),'go', 'LineWidth', 2);
set(gca(),"auto_clear","off") //% Plot points with determined color and shape
PT = feature_vector(data(:,number_of_clusters+1) == 4, :);
plot(PT(:, 1),PT(:, 2),'yo', 'LineWidth', 2);
set(gca(),"auto_clear","off") //% Plot points with determined color and shape
plot(cluster_centers(:, 1), cluster_centers(:, 2), '*k', 'LineWidth', 7); //% Plot cluster centers
6.6 Results
Pre-lab Answers
1.
2.
3.
4.
6.9 Post-lab Questions
1.Q. Develop a programme to generate 1000 random data points using a multivariate normal
distribution.
2.Q. Cluster the data of post-lab question-1 into 3 clusters using k-means clustering
with number of iterations = 50, and plot the clusters with their centroids.
3.Q. Develop a programme to generate 1000 random data points using a multivariate normal
distribution.
4.Q. Cluster the data of post-lab question-3 into 5 clusters using k-means clustering
with number of iterations = 60, and plot the clusters with their centroids.
Post-lab Answers
1.
2.
3.
4.
6.10 Conclusion
7. LOGIC GATE FUNCTION DESCRIPTION WITH HEBB RULE
7.1 Objective
Logic gate function description with Hebb rule
7.2 Tasks
I. Develop a programme to implement a Hebb network to classify two-dimensional patterns
II. Develop program to compute weight and bias using Hebb rule with target created by AND
logic gate
7.3 Theory
Definition Of neural network: neural network is a massively parallel distributed processor
made up of simple processing units that has a natural propensity (natural tendency to behave
in a particular way) for storing experiential knowledge and making it available for use. It
resembles the brain in two respects:
1. Knowledge is acquired by the network from its environment through a learning process.
2. Interneuron connection strengths, known as synaptic weights, are used to store the
acquired knowledge.
The procedure used to perform the learning process is called a learning algorithm, the
function of which is to modify the synaptic weights of the network in an orderly fashion to
attain a desired design objective. The modification of synaptic weights provides the
traditional method for the design of neural networks.
MODELS OF A NEURON
A neuron is an information-processing unit that is fundamental to the operation of a neural
network. The block diagram of figure is shown below shows the model of a neuron, which
forms the basis for designing a large family of neural networks. Here, we identify three basic
elements of the neural model:
1. A set of synapses, or connecting links, each of which is characterized by a weight of its own.
A signal xj at the input of synapse j connected to neuron k is multiplied by the synaptic weight
wkj. The first subscript in wkj refers to the neuron in question, and the second subscript refers to
the input end of the synapse to which the weight refers. Unlike the weight of a synapse in the
brain, the synaptic weight of an artificial neuron may lie in a range that includes negative as
well as positive values.
2. An adder for summing the input signals, weighted by the respective synaptic strengths of
the neuron; the operations described here constitute a linear combiner.
3. An activation function for limiting the amplitude of the output of a neuron. The activation
function is also referred to as a squashing function, in that it squashes (limits) the permissible
amplitude range of the output signal to some finite value.
Updating the weights and bias according to the error is known as the learning process in a
neural network.
The Hebb network algorithm (Hebb learning rule) is used to update the weight and bias.
Step-1: Initialize all weights and the bias to zero.
Step-2: Weight and bias adjustments are performed for each training pair as follows, where xi is the
input and y is the target: wi(new) = wi(old) + xi y and b(new) = b(old) + y.
Example: Using the Hebb rule, find the weights required to perform the following classification of the
given input patterns. The + symbol represents the value 1 and an empty square indicates -1. Consider
that 'I' belongs to the class (target value 1) and 'O' does not belong to the class (target value -1).
Use the manual method to calculate the new weights and bias.
+ + +     + + +
  +       +   +
+ + +     + + +
  I         O
Input/class x1 x2 x3 x4 x5 x6 x7 x8 x9 y
I 1 1 1 -1 1 -1 1 1 1 1
O 1 1 1 1 -1 1 1 1 1 -1
wi(new) = wi(old) + xi(I)*(1) + xi(O)*(-1), which gives
w(new) = [0 0 0 -2 2 -2 0 0 0]
b(new) = 0 + 1 + (-1) = 0
b = 0 (ans)
AND gate
x1 x2 y
1 1 1
1 0 0
0 1 0
0 0 0
Here 0 is considered as -1
x1 x2 y
1 1 1
1 -1 -1
-1 1 -1
-1 -1 -1
x1 x2 B y
1 1 1 1
1 -1 1 -1
-1 1 1 -1
-1 -1 1 -1
∆w1 = x1 y, ∆w2 = x2 y, ∆b = y
x1 x2 b y ∆ w1 ∆ w2 ∆b
1 1 1 1 1 1 1
1 -1 1 -1 -1 1 -1
-1 1 1 -1 1 -1 -1
-1 -1 1 -1 1 1 -1
x1 x2 b y ∆ w1 ∆ w2 ∆b w1 w2 b
1 1 1 1 1 1 1 1 1 1
1 -1 1 -1 -1 1 -1 0 2 0
-1 1 1 -1 1 -1 -1 1 1 -1
-1 -1 1 -1 1 1 -1 2 2 -2
7.4 Algorithm
Step-1:
Step-2:
7.5 Programme
Programme-1: Develop programme to Implement of Hebb network to classify 2
dimensional patterns
// Experiment number-7
//Hebb net to classify two dimensional input patterns
clear ;
clc ;
// Input patterns
E =[1 1 1 1 1 -1 -1 -1 1 1 1 1 1 -1 -1 -1 1 1 1 1];
F =[1 1 1 1 1 -1 -1 -1 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1];
x (1 ,1:20) =E;
x (2 ,1:20) =F;
w (1:20) =0;
w=w'
t =[1 -1];
b =0;
for i =1:2
w=w+x(i ,1:20) *t(i);
b=b+t(i);
end
disp ( 'Weight matrix' );
disp (w);
disp ( 'Bias' );
disp (b);
Programme-2: Develop program to compute weight and bias using Hebb rule with
target created by AND logic gate
//Computation of bias and weight using the Hebb rule with the target created by the AND function
clear ;
clc ;
E=[1,1,-1,-1];
F=[1,-1,1,-1];
x (1 ,1:4) =E;
x (2 ,1:4) =F;
[m,n]=size(E)
B=[1 1 1 1]
for i=1:n
if x(1,i)==1 & x(2,i)==1
y(1,i)=1
elseif x(1,i)==-1 & x(2,i)==1
y(1,i)=-1
elseif x(1,i)==1 & x(2,i)==-1
y(1,i)=-1
else
y(1,i)=-1
end
end
disp("input is", x);
disp("Target is",y)
for i=1:n
delw1(1,i)=x(1,i)*y(1,i)
delw2(1,i)=x(2,i)*y(1,i)
delb(1,i)=y(1,i)
end
disp("Del w1 is",delw1);
disp("Del w2 is",delw2 );
disp("Del bias is", delb );
w1old=0;
w2old=0;
bold=0;
for i=1:n
w1new(1,i)=w1old+delw1(1,i);
w2new(1,i)=w2old+delw2(1,i);
bnew(1,i)=bold+delb(1,i);
w1old=w1new(1,i);
w2old=w2new(1,i);
bold=bnew(1,i)
end
disp(" w1 newis",w1new);
disp("w2 new is",w2new );
disp("new bias is", bnew );
7.6 Results
Pre-lab Answers
1.
2.
3.
4.
+ +
+ + + + +
+ +
I + + O
2.Q. Write the programme for the input given in post-lab Q1 and compare the results of
Question-1 and Question-2.
3.Q. Using the Hebb rule, find the weights required to perform the following classification of the
given input patterns: the + symbol represents the value 1 and an empty square indicates -1.
Consider that 'I' belongs to the class (target value 1) and 'O' does not belong to the class
(target value -1). Use the manual method to calculate the new weights and bias.
+ + + + +
+ + + + +
+ +
I O
4.Q. Write the programme for the input given in post-lab Q3 and compare the results of
Question-3 and Question-4.
Post-lab Answers
1.
2.
3.
4.
7.10 Conclusion
8.1 Objective
To compute the output of a neuron model and to update its weight and bias using error-correction learning
8.2 Tasks
I. Develop program to compute the output of a neuron using threshold and sigmoid activation functions
II. Develop program to update weight and bias by using correction learning for AND function
8.3 Theory
MODELS OF A NEURON
2. An adder for summing the input signals, weighted by the respective synaptic strengths of
the neuron; the operations described here constitute a linear combiner.
3. An activation function for limiting the amplitude of the output of a neuron. The activation
function is also referred to as a squashing function, in that it squashes (limits) the permissible
amplitude range of the output signal to some finite value.
Typically, the normalized amplitude range of the output of a neuron is written as the
closed unit interval [0,1], or, alternatively, [-1,1].
The neural model of Fig. 5 also includes an externally applied bias, denoted by b k. The
bias b k has the effect of increasing or lowering the net input of the activation function,
depending on whether it is positive or negative, respectively. In mathematical terms, we
may describe the neuron k depicted in Fig. 5 by writing the pair of equations:
uk = Σ (j = 1 to m) wkj xj
yk = φ(uk + bk)
vk = uk + bk
where x1, x2, ..., xm are the input signals, wk1, wk2, ..., wkm are the respective synaptic weights of
neuron k, uk is the linear combiner output due to the input signals, bk is the bias, φ is the
activation function and yk is the output signal of the neuron.
Threshold function: φ(v) = 1 if v ≥ 0, and 0 if v < 0.
Sigmoid function: φ(v) = 1 / (1 + exp(-a v)), where a is the slope parameter.
Fig 9.2 (a) Threshold Function (b) Sigmoid Function for varying slope parameter a
Definition of learning
Learning is a process by which the free parameters of a neural network are adapted through a
process of stimulation by the environment in which the network is embedded. The type of the
learning is determined by the manner in which the parameter changes take place. (Mendel &
McClaren 1970).
Error-correction learning
The error signal ek(n) = dk(n) - yk(n) actuates a control mechanism that makes the output signal
yk(n) come closer to the desired response dk(n) in a step-by-step manner.
A cost function E(n) = ½ ek²(n) is the instantaneous value of the error energy; the step-by-step
adjustments are continued until the system reaches a steady state. The weights are adjusted with the
delta rule: wkj(n+1) = wkj(n) + η ek(n) xj(n), where η is the learning rate.
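For example (with assumed values): if dk(n) = 1, yk(n) = 0, xj(n) = 1 and η = 0.5, then ek(n) = 1 and the weight change is Δwkj(n) = η ek(n) xj(n) = 0.5 × 1 × 1 = 0.5.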
Fig: A single neuron with inputs x1 and x2, weights w1 and w2, bias weight w0, a summing
junction and an activation function producing the output y.
AND Function
x1 x2 y
0 0 0
0 1 0
1 0 0
1 1 1
AND-NOT Function
x1 x2 y
0 0 0
0 1 0
1 0 1
1 1 0
OR Function
x1 x2 y
0 0 0
0 1 1
1 0 1
1 1 1
8.4 Algorithm
Step-1:
Step-2:
8.5 Programme
clear;
clc;
//number of inputs
n=input("Enter number of inputs for neural network");
// read the inputs, the weights and the bias from the user
x=zeros(1,n);
w=zeros(1,n);
for i=1:n
x(i)=input("Enter input x("+string(i)+"): ");
w(i)=input("Enter weight w("+string(i)+"): ");
end
b=input("Enter bias: ");
// linear combiner
u=0;
for i=1:n
u=u+(x(i)*w(i));
end
v=u+b;
// Activation function
// Threshold activation function
if v>=0
y1=1
else
y1=0
end
disp("Output considering activation function as threshold function", y1)
// sigmoid activation function
a=1
y2=(1/(1+exp(-a*v)))
if y2>=0.5
y3=1
else
y3=0
end
disp("Output considering activation function as sigmoid function", y3)
Programme-2: Develop program to update weight and bias by using correction learning
for AND function
// Initialization (assumed values: inputs with a bias column, AND targets,
// random starting weights, learning rate and threshold)
x=[0 0 1; 0 1 1; 1 0 1; 1 1 1]; // columns: x1, x2, bias input
yd=[0;0;0;1]; // desired AND outputs
w=rand(1,3); // initial weights (w(1,3) acts as the bias weight)
w1=w; // keep a copy of the initial random weights
lr=0.5; // learning rate (assumed)
thresh=0.5; // threshold of the activation function (assumed)
net=0;
epoch=0;
flag=0;
ya=rand(4,1);
while flag==0 do
for i=1:4
for j=1:3
net=net+w(1,j)*x(i,j);
end;
if net >= thresh then // threshold activation function
ya(i,1)=1;
else
ya(i,1)=0;
end;
err=yd(i,1)-ya(i,1);
for j=1:3
w(1,j)=w(1,j)+ (lr*x(i,j)*err); // error correction learning
end;
net=0.00; //Reset net for next iteration
end
disp(ya,"Actual Output");
disp(yd,"Desired Output");
epoch=epoch+1;
disp("End of Epoch No:");
disp(epoch);
disp("************************************************************’");
if epoch > 1000 then
disp("Learning Attempt Failed !")
break
end;
if yd(1,1) == ya(1,1)& yd(2,1) == ya(2,1) & yd(3,1) == ya(3,1) & yd(4,1) == ya(4,1) then
flag=1;
else
flag=0;
end
end
disp("Initial Random Weights -");
disp(w1);
disp("Final Adjusted Weights -");
disp(w);
disp(lr,"Learning rate is – ")
disp("***********************************’")
plot(yd,ya);
8.6 Results
Pre-lab Answers
1.
2.
3.
4.
1. Q. Develop program to update weight and bias by using correction learning for OR
function. With learning rate 0.3.
2. Q. Develop program to update weight and bias by using correction learning for OR
function. With learning rate 0.7.
3. Q. Develop program to update weight and bias by using correction learning for AND-
NOT function with learning rate 0.4.
4. Q. Develop program to update weight and bias by using correction learning for AND-
NOT function with learning rate 0.6.
Post-lab Answers
1.
2.
3.
4.
8.11 Conclusion
9. XOR problem with Perceptron network
9.1 Objective
To implement the XOR function using a multilayer perceptron network
9.2 Tasks
9.3 Theory
The multilayer perceptron network consists of input layer, output layer and hidden layers.
Fig: A multilayer perceptron consisting of an input layer, hidden layers and an output layer.
9.4 Algorithm
Step-1:
Step-2:
9.5 Programme
// Getting weights and threshold value
disp ( 'Enter weights' );
w11 = input ( 'Weight w11=' );
w12 = input ( 'Weight w12=' );
w21 = input ( 'Weight w21=' );
w22 = input ( 'Weight w22=' );
v1= input ( 'Weight v1=' );
v2= input ( 'Weight v2=' );
disp ( 'Enter Threshold Value' );
theta = input ( 'theta=' );
x1 =[0 0 1 1];
x2 =[0 1 0 1];
z =[0;1;1;0];
con =1;
while con
zin1 =x1*w11 +x2*w21;
zin2 =x1*w12 +x2*w22;
for i =1:4
if zin1 (i) >= theta
y1(i)=1;
else
y1(i)=0;
end
if zin2 (i) >= theta
y2(i)=1;
else
y2(i)=0;
end
end
yin =y1*v1+y2*v2;
for i =1:4
if yin (i) >= theta ;
y(i)=1;
else
y(i)=0;
end
end
disp ( 'Output of Net' );
disp (y);
if y == z
con =0;
else
disp ( 'Net is not learning, enter another set of weights and threshold value' );
end
end
disp ( 'McCulloch-Pitts Net for XOR function' );
disp ( 'Weights of Neuron Z1' );
disp (w11);
disp (w21);
disp ( 'Weights of Neuron Z2' );
disp (w12);
disp (w22);
disp ( 'Weights of Neuron Y' );
disp (v1);
disp (v2);
disp ( 'Threshold value' );
disp ( theta );
//For this programme w11=1, w21=-1, w12=-1, w22=1, v1=1, v2=1, theta=1
clc ;
clear ;
x =[1 1 -1 -1;1 -1 1 -1]; // input
t=[ -1 1 1 -1]; // target
// assuming initial weight matrix and bias
w =[0.05 0.1;0.2 0.2];
b1 =[0.3 0.15];
disp("Initial weight",w);
disp("bias",b1);
v =[0.5 0.5];
b2 =0.5;
con =1;
alpha =0.5; // learning parameter
epoch =0;
while con
con =0;
for i =1:4
for j =1:2
zin (j)=b1(j)+x(1,i)*w(1,j)+x(2,i)*w(2,j); // net input of hidden neuron j
if zin (j) >=0 then
z(j)=1;
else
z(j)= -1;
end
end
yin =b2+z(1)*v (1) +z(2)*v (2) ;
if yin >=0 then
y =1;
else
y= -1;
end
if y~=t(i) then
con =1;
if t(i) ==1 then
if abs ( zin (1))>abs (zin (2) ) then
k =2;
else
k =1;
end
b1(k)=b1(k)+ alpha *(1 - zin (k)); // updating bias
w(1:2 ,k)=w(1:2 ,k)+ alpha *(1 - zin(k))*x(1:2 ,i); // updating weight
else
for k =1:2
if zin (k) >0 then
b1(k)=b1(k)+ alpha *( -1 - zin (k));
w(1:2 ,k)=w(1:2 ,k)+ alpha *( -1 - zin(k))*x(1:2 ,i);
end
end
end
end
end
epoch=epoch+1;
end
disp("Final weights of hidden layer", w);
disp("Final bias of hidden layer", b1);
disp("Number of epochs", epoch);
9.6 Results
2.Q. Draw the diagram for implementation of XOR gate using multi-layer perceptron
network
Pre-lab Answers
1.
2.
Post-lab Answers
1.
2.
9.10 Conclusion
10.1 Objective
To train and test a discrete Hopfield network
10.2 Tasks
10.3 Theory
Hopfield neural network was invented by Dr. John J. Hopfield in 1982. It consists of a single
layer which contains one or more fully connected recurrent neurons. The Hopfield network
is commonly used for auto-association and optimization tasks.
A discrete Hopfield network operates in a discrete fashion; in other words, the input and output
patterns are discrete vectors, which can be either binary (0, 1) or bipolar (+1, -1) in nature. The
network has symmetrical weights with no self-connections,
i.e., wij = wji and wii = 0.
Architecture
Following are some important points to keep in mind about discrete Hopfield network −
This model consists of neurons with one inverting and one non-inverting output.
The output of each neuron should be the input of other neurons but not the input of
self.
The output from Y1 going to Y2, Yi and Yn has the weights w12, w1i and w1n
respectively. Similarly, other arcs have the weights on them.
Training Algorithm
During training of discrete Hopfield network, weights will be updated. As we know that we
can have the binary input vectors as well as bipolar input vectors. Hence, in both the cases,
weight updates can be done with the following relation
wij = Σ (p = 1 to P) [2 si(p) - 1][2 sj(p) - 1]   for i ≠ j   (binary input patterns)
Step 1 − Initialize the weights, which are obtained from training algorithm by using Hebbian
principle.
Step 2 − Perform steps 3-9, if the activations of the network is not consolidated.
Step 4 − Make the initial activation of the network equal to the external input vector X as
follows: yi = xi for i = 1, 2, 3, ..., n.
Step 6 − Calculate the net input of each unit Yi: y_in,i = xi + Σ (over j) yj wji
Step 7 − Apply the activation as follows over the net input to calculate the output:
yi = 1 if y_in,i > θi; yi (unchanged) if y_in,i = θi; 0 if y_in,i < θi
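As a worked example (using the pattern [1 1 1 0] that is stored in the programme below): the bipolar form is 2s - 1 = [1 1 1 -1], its outer product is
[1 1 1 -1; 1 1 1 -1; 1 1 1 -1; -1 -1 -1 1]
and setting the diagonal to zero (no self-connections) gives the weight matrix
W = [0 1 1 -1; 1 0 1 -1; 1 1 0 -1; -1 -1 -1 0].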
10.4 Algorithm
Step-1:
Step-2:
10.5 Programme: Develop a programme to train and test a discrete Hopfield network
// Discrete Hopfield net
clear ;
clc ;
x =[1 1 1 0]; // target
tx =[0 0 1 0]; //input
//training
w1 =(2*x ' -1);
w2 =(2*x -1) ;
w=w1*w2; // creation of weights
for i =1:4
w(i,i)=0;
end
// testing
con =1;
y =tx; //assignment of input
epoch=0;
while con
up =[4 2 1 3];
epoch=epoch+1;
for i =1:4
yin (up(i))=tx(up(i))+y*w(1:4 , up(i)); //calculate net input
if yin (up(i)) >0
y(up(i)) =1; // application of activation function
end
end
disp("The epoch number", epoch);
disp ("ouput in loop",y);
if y==x
disp ( 'Convergence has been obtained' );
disp ( 'The Converged Output' );
disp (y);
disp("epoch number", epoch);
con =0;
end
end
10.6 Results
10.7 Screen Shots
Pre-lab Answers
1.
2.
3.
1. Q. develop a programme to train and test discrete Hopfield network with binary input
pattern, where input is [0 0 1 0] and target is [0 1 1 0];
2. Q. develop a programme to train and test discrete Hopfield network with bipolar input
pattern, where input is [-1 -1 1 -1] and target is [-1 1 1 -1];
Post-lab Answers
1.
2.
10.10 Conclusion
11.1 Objective
To implement auto-associative and hetero-associative memory networks
11.2 Tasks
11.3 Theory
Associative memory network:
These kinds of neural networks work on the basis of pattern association, which means they
can store different patterns and at the time of giving an output they can produce one of the
stored patterns by matching them with the given input pattern. These types of memories are
also called Content-Addressable Memory CAM. Associative memory makes a parallel
search with the stored patterns as data files.
This is a single layer neural network in which the input training vector and the output target
vectors are the same. The weights are determined so that the network stores a set of patterns.
Architecture
As shown in the following figure, the architecture of Auto Associative memory network
has ‘n’ number of input training vectors and similar ‘n’ number of output target vectors
Training Algorithm
For training, this network is using the Hebb or Delta learning rule.
xi = si (for i = 1 to n)
Testing Algorithm
Step 1 − Set the weights obtained during training for Hebb’s rule.
Step 3 − Set the activation of the input units equal to that of the input vector.
y_inj = Σ (i = 1 to n) xi wij
yj = f(y_inj) = +1 if y_inj > 0, and -1 if y_inj ≤ 0
Similar to Auto Associative Memory network, this is also a single layer neural network.
However, in this network the input training vector and the output target vectors are not the
same. The weights are determined so that the network stores a set of patterns. Hetero
associative network is static in nature, hence, there would be no non-linear and delay
operations.
Architecture
Training Algorithm
For training, this network is using the Hebb or Delta learning rule.
xi = si (for i = 1 to n)
yj = sj (for j = 1 to m)
Testing Algorithm
Step 1 − Set the weights obtained during training for Hebb’s rule.
Step 3 − Set the activation of the input units equal to that of the input vector.
y_inj = Σ (i = 1 to n) xi wij
11.4 Algorithm
Step-1:
Step-2:
11.5 Programme
clear ;
clc ;
// Auto-associative net to store the vector
x =[1 1 -1 -1];
xv =[1;1; -1; -1];
//Training
w= zeros (4 ,4);
w=x '*x;
//Testing
yin =x*w;
for i =1:4
if yin (i) >0
y(i)=1;
else
y(i)= -1;
end
end
disp('---Auto associative network---');
disp ( 'Weight matrix' );
disp (w);
disp('output',y);
if xv ==y
disp ( 'The vector is a Known Vector' );
else
disp ( 'The vector is an Unknown Vector' );
end
// Hetero-associative neural net
x =[1 1 0 0;1 0 1 0;1 1 1 0;0 1 1 0];
t =[1 0;1 0;0 1;0 1];
//Training
w= zeros (4 ,2);
for i =1:4
w=w+x(i ,1:4) '*t(i ,1:2) ;
end
disp('---Hetero associative memory network---')
disp ( 'Weight matrix' );
disp (w);
11.6 Results
11.7 Screen Shots
Pre-lab Answers
1.
2.
3.
4.
5.
Q. Train a hetero-associative memory network by using the input patterns
[1 1 0 0; 1 1 0 1; 1 0 1 1; 0 1 0 0] and the corresponding targets [1 0; 0 1; 1 1; 0 0].
Post-lab Answers
1.
2.
11.10 Conclusion
12.1 Objective
To evaluate error in Back propagation network (BPN)
12.2 Tasks
12.3 Theory
Back propagation algorithm
• Backward pass: start at the output layer, and pass the errors backwards
through the network, layer by layer, by recursively computing the local
gradient of each neuron.
Step-1: Initialize weights and learning rate (random small values are taken; in a problem they
will be provided).
Step-2: Calculate the net input to each hidden layer unit Zj: z_inj = w0j + Σ xi wij.
Step-3: Calculate the output of each hidden layer unit Zj by applying the activation function
(binary sigmoid with a = 1): zj = 1/(1 + e^(-z_inj)).
Step-4: Calculate the net input to the output unit: y_in = w0 + Σ (i = 1 to p) zi wi, and apply the
same activation function to obtain the output y.
The error term of the output unit is δ = (d - y) y (1 - y), where d is the target; it is propagated
back to compute the weight changes of the output and hidden layers.
Problem
1.Q. Calculate the new weight of the multilayer perceptron neural network. If x 1=0 ,
x 2=1 , w01=0.3, w 02=0.5, w 11=0.6 , w21=−0.1, w 12=−0.3 ,
w 22=0.4 , w 0=−0.2 , w 1=0.4 , w 2=0.1. Target output=1, Learning rate=0.25, use binary
sigmoid activation function.
Fig: A 2-2-1 network with inputs x1 and x2, hidden units Z1 and Z2 (weights w11, w21, w12, w22
and biases w01, w02) and an output unit y (weights w1, w2 and bias w0).
Solution
Step-2: Calculate the net input to each hidden layer unit Zj for j = 1, 2, ..., n: z_inj = w0j + Σ xi wij
z_in1 = w01 + x1 w11 + x2 w21 = 0.3 + 0(0.6) + 1(-0.1) = 0.2
z_in2 = w02 + x1 w12 + x2 w22 = 0.5 + 0(-0.3) + 1(0.4) = 0.9
Step-3: Calculate the output of each hidden layer unit Zj by applying the activation function
(a = 1, binary sigmoid):
z1 = 1/(1 + e^(-z_in1)) = 1/(1 + e^(-0.2)) = 0.5498
z2 = 1/(1 + e^(-z_in2)) = 1/(1 + e^(-0.9)) = 0.7109
Step-4: Calculate the net input to the output unit: y_in = w0 + Σ (i = 1 to p) zi wi
y_in = w0 + z1 w1 + z2 w2 = -0.2 + (0.5498)(0.4) + (0.7109)(0.1) = 0.09101
Step-5: Apply the activation function to obtain the output:
y = 1/(1 + e^(-y_in)) = 1/(1 + e^(-0.09101)) = 0.5227
Here the target is d = 1.
Error term of the output unit: δ1 = (d - y) y (1 - y) = (1 - 0.5227)(0.5227)(1 - 0.5227) = 0.1191
Step-8: Compute the change in weights of the output layer: Δwi = η δ1 zi
Δw1 = η δ1 z1 = 0.25 × 0.1191 × 0.5498 = 0.01637
Δw2 = η δ1 z2 = 0.25 × 0.1191 × 0.7109 = 0.02117
Δw0 = η δ1 = 0.25 × 0.1191 = 0.02978
Step-9: Compute the error terms of the hidden units: δ_inj = δ1 wj, δjj = δ_inj zj (1 - zj)
δ11 = (0.1191 × 0.4) × 0.5498 × (1 - 0.5498) = 0.0118
δ22 = (0.1191 × 0.1) × 0.7109 × (1 - 0.7109) = 0.00245
Step-10: Compute the change in weights of the hidden layer: Δwij = η δjj xi
Δw11 = η δ11 x1 = 0.25 × 0.0118 × 0 = 0
Δw21 = η δ11 x2 = 0.25 × 0.0118 × 1 = 0.00295
Δw01 = η δ11 = 0.25 × 0.0118 = 0.00295
Δw12 = η δ22 x1 = 0.25 × 0.00245 × 0 = 0
Δw22 = η δ22 x2 = 0.25 × 0.00245 × 1 = 0.00061
Δw02 = η δ22 = 0.25 × 0.00245 = 0.00061
After updating weights calculate output (step-2 to step-6). After calculating the output,
subtract from the target to get the error.
12.4 Algorithm
Step-1:
Step-2:
12.5 Programme
clc;
clear;
//Step-1: inputs, target, learning rate and initial weights (values taken from the worked problem)
x=[0 1]; // inputs x1=0, x2=1
d=1; // target output
n=0.25; // learning rate
wh=[0.3 0.6 -0.1 -0.3 0.4 0.5]; // hidden layer weights [w01 w11 w21 w12 w22 w02]
wo=[-0.2 0.4 0.1]; // output layer weights [w0 w1 w2]
//Step-2: net input to each hidden unit
z1in=wh(1)+(x(1)*wh(2))+(x(2)*wh(3));
z2in=wh(6)+(x(1)*wh(4))+(x(2)*wh(5));
//Step-3
z1=1/(1+exp(-z1in));
z2=1/(1+exp(-z2in));
disp("z1",z1);
disp("z2",z2);
//step-4 and 5
yin=wo(1)+(z1*wo(2))+(z2*wo(3));
//step-6
y=1/(1+exp(-yin));
disp("Output",y);
// step-7: error term of the output unit
del1=(d-y)*y*(1-y);
// step-8: change in weights of the output layer
delw0=n*del1;
delw1=n*del1*z1;
delw2=n*del1*z2;
delwo=[delw0 delw1 delw2];
// step-9: error terms of the hidden units
del11=(del1*wo(2))*z1*(1-z1);
del22=(del1*wo(3))*z2*(1-z2);
// step-10
delw01=n*del11;
delw11=n*del11*x(1);
delw21=n*del11*x(2);
delw02=n*del22;
delw12=n*del22*x(1);
delw22=n*del22*x(2);
disp("---Change in weight of hidden layer---")
disp(delw01);
disp(delw11);
disp(delw21);
disp(delw12);
disp(delw22);
disp(delw02);
delwh=[delw01 delw11 delw21 delw12 delw22 delw02];
//new weight (updataion of weight)
wh=wh+delwh;
wo=wo+delwo;
disp("new weight of hidden layer",wh);
disp("new weight of output layer",wo);
// Calculation of output after weight updation
//step-2
// estimation of Zjin
z1in=wh(1)+(x(1)*wh(2))+(x(2)*wh(3));
z2in=wh(6)+(x(1)*wh(4))+(x(2)*wh(5));
//Step-3
z1=1/(1+exp(-z1in));
z2=1/(1+exp(-z2in));
disp(z1);
disp(z2);
//step-4 and 5
yin=wo(1)+(z1*wo(2))+(z2*wo(3));
//step-6
y=1/(1+exp(-yin));
disp(y);
// error
err=d-y;
disp("Error is",err);
12.6 Results
Pre-lab Answers
1.
2.
Post-lab Answers
1.
2.
12.10 Conclusion
13.1 Objective
Checking input matrix orthogonal or not, if orthogonal then estimate weight
13.2 Tasks
13.3 Theory
If the sum of the element-wise product of two input row vectors (a single row matrix) is zero, then the vectors are called orthogonal.
More than one vector can be stored in an auto-associative net by simply adding the weights
needed for each vector. Assume we have two vectors that we want to store in an auto-
associative net. Assume we want to store (1 1 -1 -1) and (-1 1 1 -1) in an auto-associative net.
We obtain the weight matrices for each input and add them up.
This is a single layer neural network in which the input training vector and the output target
vectors are the same. The weights are determined so that the network stores a set of patterns.
Architecture
As shown in the following figure, the architecture of Auto Associative memory network
has ‘n’ number of input training vectors and similar ‘n’ number of output target vectors
Training Algorithm
For training, this network is using the Hebb or Delta learning rule.
y j=s j ¿=1 to n)
Testing Algorithm
Step 1 − Set the weights obtained during training for Hebb’s rule.
Step 3 − Set the activation of the input units equal to that of the input vector.
n
y inj =∑ x i wij
i=1
y j=f ¿ )=
{+1 if y inj > 0
−1 if y inj ≤ 0
13.4 Algorithm
Step-1:
Step-2:
13.5 Programme
clc
clear;
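// The listing below is an assumed sketch: two bipolar vectors (taken from the
// theory section above) are checked for orthogonality via their dot product and,
// if orthogonal, stored in an auto-associative net by adding their outer products.
x1=[1 1 -1 -1];
x2=[-1 1 1 -1];
if sum(x1.*x2)==0 then
disp('The input vectors are orthogonal');
w=x1'*x1 + x2'*x2; // add the outer-product weight matrices of the two vectors
disp('Estimated weight matrix');
disp(w);
else
disp('The input vectors are not orthogonal');
end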
13.6 Results
1. Explain the difference between an auto-associative network and a hetero-associative network.
2. How will you estimate the orthogonality of inputs given as single-row matrices?
Pre-lab Answers
1.
2.
1. Give 3 inputs(matrixes) having 1 row and 4 columns and input bipolar values (1 or -
1). Check whether all the input vectors are orthogonal or not.
Post-lab Answers
1.
13.10 Conclusion