Master in Electronics Engineering (Communication) (Third
Semester)
Winter – 2022 (Resit)
End Semester Examination
Course Name: MOOCs (Microprocessor and Microcontroller)
Course Code: ____________
Time: 3 Hours        Max. Marks: 75
Instructions to Candidates:
1. All questions carry marks as indicated.
2. All the questions in Section A and Section B are compulsory.
3. Section A has 25 questions with a weightage of 1 mark each, and Section B has 25
questions with a weightage of 2 marks each.
Paper setters are requested to provide the answer sheet separately.
Roll No of Students: ______________________
Group-A
A Tick the correct option Marks
Que.1 1
Que.2
Que.3
Que.4
Que.5 There are 5 black and 7 white balls. Two balls are drawn randomly,
one by one, without replacement. What is the probability that both
balls are black? 1
a) 20/132 b) 25/144 c) 20/144 d) 25/132
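A minimal Python sketch (illustrative only, not part of the question) that checks the sequential-draw arithmetic:

```python
from fractions import Fraction

# Drawing two balls without replacement from 5 black and 7 white (12 total):
# P(both black) = (5/12) * (4/11).
p_both_black = Fraction(5, 12) * Fraction(4, 11)
print(p_both_black)                        # 5/33
print(p_both_black == Fraction(20, 132))   # True: 20/132 reduces to 5/33
```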
Que.6 The inverse of a square matrix A exists if:
a) Determinant of A, det(A) = 0 b) Eigen values of A are non-zero 1
c) Sum of eigen values is non-zero d) None of the above
Que.7 For a two-class problem, the linear discriminant function is given by g(x) =
aᵀy, where y is the augmented feature vector. What is the update rule for
finding the weight vector a? 1
a. a(k+1) = a(k) + η Σy
b. a(k+1) = a(k) - η Σy
c. a(k+1) = a(k-1) - η a(k)
d. a(k+1) = a(k-1) + η a(k)
Que.8 You are given some data points for two different classes.
Class 1 points: {(11, 11), (13, 11), (8, 10), (9, 9), (7, 7), (7, 5), (15, 3)} 1
Class 2 points: {(7, 11), (15, 9), (15, 7), (13, 5), (14, 4), (9, 3), (11, 3)}
What will be the nature of the decision boundary between these two classes?
a. Linear
b. Quadratic
c. Cubic
d. None of the above
Que.9 In the gradient descent algorithm, we move in the direction of which of the
following?
1
a. Negative of the gradient
b. Same as the direction of gradient
c. Negative of absolute error difference
d. None of the above
Que.10 Which of the following statements is/are true in the case of k-NN?
1
1. For a very large value of k, we may include points from other classes in the
neighborhood.
2. For a very small value of k, the algorithm is very sensitive to noise.
a. 1
b. 2
c. 1 and 2
d. None of these
Que.11 You are given some data points for two different classes.
Class 1 points: {(11,11), (13,11), (8,10), (9,9), (7,7), (7,5), (15,3)} 1
Class 2 points: {(7,11), (15,9), (15,7), (13,5), (14,4), (9,3), (11,3)}
Classify the following two new samples (A = (6,11), B = (14,3)) using K-nearest
neighbor. (Use Manhattan distance as the distance function.)
a. A belongs to class 1 and B belongs to class 2 if K=1.
b. A belongs to class 2 and B belongs to class 2 if K=1.
c. A belongs to class 1 and B belongs to class 2 if K=3.
d. A belongs to class 2 and B belongs to class 2 if K=3.
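A short Python sketch (illustrative only) that lists the Manhattan distances so the K nearest neighbours of A and B can be read off directly:

```python
# Training points for the two classes as given in Que.11.
class1 = [(11, 11), (13, 11), (8, 10), (9, 9), (7, 7), (7, 5), (15, 3)]
class2 = [(7, 11), (15, 9), (15, 7), (13, 5), (14, 4), (9, 3), (11, 3)]

def manhattan(p, q):
    # Manhattan (L1) distance between two 2-D points.
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

for name, query in [("A", (6, 11)), ("B", (14, 3))]:
    dists = sorted((manhattan(query, p), label)
                   for label, pts in (("class 1", class1), ("class 2", class2))
                   for p in pts)
    print(name, dists[:3])  # the three nearest (distance, class) pairs
```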
Que.12 What is the shape of the loss landscape during optimization of SVM?
a. Linear 1
b. Paraboloid
c. Ellipsoidal
d. Non-convex with multiple possible local minima
Que.13 How many local minima can be encountered while solving the
optimization for maximizing the margin of an SVM?
1
a. 1
b. 2
c. ∞ (infinite)
d. 0
Que.14 Which of the following classifiers can be replaced by a linear SVM?
a. Logistic Regression 1
b. Neural Networks
c. Decision Trees
d. None of the above
Que.15 For a 2-class problem, what is the minimum possible number of support
vectors? Assume there are more than 4 examples from each class.
1
a. 4
b. 1
c. 2
d. 8
Que.16 Which one of the following is a valid representation of the hinge loss (with
margin = 1) for a two-class problem?
1
y = class label (+1 or -1)
p = predicted (raw, not normalized to a probability) value for a class
a. L(y, p) = max(0, 1- yp)
b. L(y, p) = min(0, 1- yp)
c. L(y, p) = max(0, 1 + yp)
d. None of the above
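For reference, the standard hinge loss with unit margin can be evaluated with a small sketch like the following (illustrative only):

```python
# Hinge loss with margin 1: y is the class label (+1 or -1),
# p is the raw (unnormalized) predicted score.
def hinge_loss(y, p):
    return max(0.0, 1.0 - y * p)

print(hinge_loss(+1, 2.5))   # 0.0 -> correct side, outside the margin
print(hinge_loss(-1, 0.3))   # 1.3 -> penalized, wrong side of the boundary
```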
Que.17 What will happen to the margin of a max-margin linear SVM if one of the
non-support-vector training examples is removed?
1
a. Margin will be scaled down by the magnitude of that vector
b. Margin will be scaled up by the magnitude of that vector
c. Margin will be unaltered
d. Cannot be determined from the information provided
Que.18 Which of the following options can be used to curb the problem of
overfitting?
1
a. Regularization
b. Training the network for longer time
c. Introducing more complex model architecture
d. Modifying the cost function that enhances the weights of the model
parameters by a constant value
Que.19 Which of the following options gives the range of the logistic function?
a. -1 to 1 1
b. 1 to 0
c. 0 to 1
d. 0 to infinity
Que.20 An artificial neuron receives n inputs x1, x2, x3, ..., xn with weights w1,
w2, w3, ..., wn attached to the input links. The weighted
sum ______________ is computed and passed on to a non-linear filter φ, 1
called the activation function, to release the output. Fill in the blank by
choosing one option from the following.
a. Σᵢ wᵢ
b. Σᵢ xᵢ
c. Σᵢ wᵢ + Σᵢ xᵢ
d. Σᵢ wᵢ xᵢ
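A minimal sketch (illustrative only; the weights, inputs and sigmoid filter are made-up values) of the weighted sum a neuron computes before the activation function:

```python
import math

def neuron_output(x, w):
    # Weighted sum sum_i(w_i * x_i), then a non-linear filter (sigmoid here).
    s = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-s))

print(neuron_output([1.0, 2.0, 3.0], [0.5, -0.25, 0.1]))  # sigmoid(0.3)
```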
Que.21 The activation function which is analytically differentiable for all real
values of the given input is
1
a. Sigmoid
b. tanh
c. ReLU
d. Both a & b
Que.22 Which of the following is a Co-occurrence matrix-based descriptor?
a) Entropy b) Uniformity c) Inverse Element difference moment d ) 1
All of the above
Que.23 What is the main benefit of stacking multiple layers of neurons with non-
linear activation functions over a single-layer perceptron?
1
a. Reduces complexity of the network
b. Reduce inference time during testing
c. Allows creating non-linear decision boundaries
d. All of the above
Que.24 Suppose a neural network has 3 input nodes: x, y, z. There are 2 neurons,
Q and F, with Q = x + y and F = Q * z. What is the gradient of F with respect to x,
y and z? Assume (x, y, z) = (-2, 5, -4). 1
a. (-4, 3, -3)
b. (-4, -4, 3)
c. (4, 4, -3)
d. (3, 3, 4)
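A numerical check (illustrative only) of the gradient of F = (x + y) * z at the given point, using central differences:

```python
def F(x, y, z):
    # Q = x + y feeds into F = Q * z, as described in Que.24.
    return (x + y) * z

def numerical_gradient(f, point, h=1e-6):
    grad = []
    for i in range(len(point)):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        grad.append((f(*plus) - f(*minus)) / (2 * h))
    return grad

print(numerical_gradient(F, [-2.0, 5.0, -4.0]))  # close to [-4, -4, 3]
```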
Que.25 Which of the following is a boundary descriptor?
a) Polygonal Representation b) Fourier descriptor c) Signature d ) All 1
of the above
Group-B
B Tick the correct option Marks
Que.26 A single card is drawn from a standard deck of playing cards. What is the 2
probability that a queen is drawn, given that the card is a face card?
(Hint: face cards are the cards having a face, i.e. Jack, Queen, King.)
a) 3/13 b) 1/13 c) 4/13 d) 1/52
Que.27 The texture of a region provides a measure of which of the following 2
properties?
a) Smoothness alone b) Coarseness alone c) Regularity alone
d ) Smoothness, coarseness and regularity
Que.28 Why are convolutional neural networks taking off quickly in recent times? 2
(Check the options that are true.)
a) Access to large amount of digitized data b) Integration of feature
extraction within the training process. c) Availability of more
computational power d ) All of the above
Que.29 The plot of the distance of the different boundary points from the centroid of the 2
shape, taken at various directions, is known as
a) Signature descriptor b) Polygonal descriptor c) Fourier
descriptor d ) Convex Hull
Que.30 An activation function having which of the following properties CANNOT 2
be used in a neural network?
a. The function is periodic
b. The function is monotonic
c. The function is unbounded
d. Both a and b
Que.31 Suppose we have a 10-hidden-layer neural network. The activation function 2
for each node is f(x) = x. What is the minimum number of hidden layers
required to realize the above network so that we get the same output for a given
input? Consider that there are NO bias nodes in any layer.
a. 2
b. 1
c. 9
d. 4
Que.32 Suppose a fully-connected neural network has a single hidden layer with 15 2
nodes. The input is represented by a 5-D feature vector and the number of
classes is 3. Calculate the number of parameters of the network. Consider that
there are NO bias nodes in the network.
a. 225
b. 75
c. 78
d. 120
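A one-line check (illustrative only) of the weight count for a bias-free fully-connected network with layer sizes 5, 15 and 3:

```python
# Weights between consecutive layers: 5*15 (input -> hidden) + 15*3 (hidden -> output).
layers = [5, 15, 3]
n_params = sum(a * b for a, b in zip(layers, layers[1:]))
print(n_params)  # 120
```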
Que.33 For a 2-class classification problem, what is the minimum number of nodes 2
required for the output layer of a multi-layered neural network?
a. 2
b. 1
c. 3
d. None of the above
Que.34 Suppose the input layer of a fully-connected neural network has 4 nodes. 2
The value of a node in the first hidden layer, before applying the sigmoid
nonlinearity, is V. Now, each of the input layer's nodes is scaled up by 8
times. What will be the value of that neuron with the updated input layer?
a. 8V
b. 4V
c. 32V
d. Remains the same, since scaling of the input layer does not affect the hidden
layers
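A small sketch (illustrative only; the weights and inputs are made-up numbers) showing that a bias-free pre-activation value is linear in the inputs, so scaling every input scales that value by the same factor:

```python
# Pre-activation value of one hidden node (no bias) for the original
# and the 8x-scaled inputs.
w = [0.1, -0.3, 0.7, 0.2]
x = [1.0, 2.0, 3.0, 4.0]

v = sum(wi * xi for wi, xi in zip(w, x))
v_scaled = sum(wi * (8 * xi) for wi, xi in zip(w, x))
print(v, v_scaled, v_scaled / v)  # the ratio is exactly 8
```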
Que.35 Which of the following are potential benefits of using ReLU activation over 2
sigmoid activation?
a. ReLU helps in creating dense (most of the neurons are active)
representations
b. ReLU helps in creating sparse (most of the neurons are non-active)
representations
c. ReLU helps in mitigating the vanishing gradient effect
d. Both (b) and (c)
Que.36 Given an input x and a linear autoencoder (no bias) with random weights (W for 2
the encoder and W' for the decoder), which mathematical form is minimized to
achieve the optimal weights?
a. ||x - W'·W·x||
b. ||x - W·W'·x||
c. ||x - W·W·x||
d. ||x - W'·W'·x||
Que.37 During back-propagation through max pooling with stride, the gradients are 2
a. Evenly distributed
b. Sparse gradients are generated with non-zero gradient at the max response
location
c. Differentiated with respect to responses
d. None of the above
Que.38 What is the output range of the sigmoid function for an input with dynamic 2
range [0, ∞)?
a. [0,1]
b. [-1,1]
c. [0.5,1]
d. [0.25, 1]
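A quick sketch (illustrative only) of sigmoid values over non-negative inputs; sigmoid(0) = 0.5 and the output approaches 1 as the input grows:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

for x in [0, 1, 5, 20]:
    print(x, sigmoid(x))  # 0.5, ~0.73, ~0.993, ~1.0
```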
Que.39 What will happen when the learning rate is set to zero? 2
a. Weight update will be very slow
b. Weights will be zero
c. Weight update will tend to zero but not exactly zero
d. Weights will not be updated
Que.40 What is the similarity between an autoencoder and Principal Component 2
Analysis (PCA)?
a. Both assume nonlinear systems
b. Subspace of weight matrices
c. Both can assume linear systems
d. All of these
Que.41 Which of the following is necessary for a supervised learning problem? 2
a. Big Dataset
b. Pre-trained deep learning model
c. Ground Truth / Label
d. None of the above is necessary
Que.42 Which of the following criteria are essential while designing a neural 2
network?
A - Derivative of the cost function should exist.
B - Derivative of the non-linear transfer function should exist.
C - L2 loss should be used as the cost function.
a. Only A
b. Only C
c. Both A and B
d. Only B
Que.43 What are the advantages of initializing an MLP with pre-trained autoencoder 2
weights?
a. Faster Convergence & Avoid overfitting
b. Faster Convergence & Simpler hypothesis
c. Faster Convergence, Avoid overfitting & Simpler hypothesis
d. None of these
Que.44 An autoencoder with no biases consists of 100 input neurons and 50 hidden 2
neurons. If the network weights are represented using single-precision
floating-point numbers, then what will be the size of the weight matrix?
a. 10,000 Bytes
b. 10,150 Bits
c. 40,000 Bytes
d. 40,600 Bytes
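A back-of-the-envelope check (illustrative only, assuming both the 100x50 encoder and 50x100 decoder matrices are stored and single precision means 4 bytes per weight):

```python
n_input, n_hidden = 100, 50
bytes_per_weight = 4  # single-precision float

n_weights = n_input * n_hidden + n_hidden * n_input  # encoder + decoder
print(n_weights, "weights =", n_weights * bytes_per_weight, "bytes")
```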
Que.45 Identify the techniques which can be used for training autoencoders 2
1. Training one layer at a time
2. Training the encoder first and then the decoder
3. End-to-end training
a. 1 & 2
b. 2 & 3
c. 1 & 3
d. 1, 2 & 3
Que.46 Which of the following autoencoders is not a regularization autoencoder? 2
a. Sparse autoencoder
b. Denoising autoencoder
c. Both a and b
d. Stacked autoencoder
Que.47 The regularization of a contractive autoencoder is imposed on 2
a. Activations
b. Weights
c. Weights and Activations
d. Does not use regularization
Que.48 Which of the following is not a purpose of the cost function in training 2
denoising autoencoders?
a. Dimension reduction
b. Error minimization
c. Weight Regularization
d. Image denoising
Que.49 What is the KL Divergence between two equal distributions? 2
a. 1
b. +∞
c. -∞
d. 0
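A short sketch (illustrative only; the distribution values are made up) confirming what the KL divergence of a distribution with itself evaluates to:

```python
import math

p = [0.2, 0.5, 0.3]
kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, p))  # KL(p || p)
print(kl)  # 0.0
```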
Que.50 What is the role of sparsity constraint in a sparse autoencoder? 2
a. Control the number of active nodes in a hidden layer
b. Control the noise level in a hidden layer
c. Control the hidden layer length
d. Not related to sparse autoencoder
Name of Paper Setter: Prof. Suraj Mahajan