NPTEL Online Certification Courses
Indian Institute of Technology Kharagpur
Deep Learning
Assignment- Week 8
TYPE OF QUESTION: MCQ/MSQ
Number of questions: 10    Total marks: 10 × 1 = 10
______________________________________________________________________________
QUESTION 1:
A softmax layer is applied to an activation vector x ∈ Rⁿ:
y = softmax(x)
A constant vector c·1ₙ is subtracted from x (c ∈ R is a scalar, 1ₙ ∈ Rⁿ is the n-component vector of all 1s), so that the new output ŷ is:
ŷ = softmax(x − c·1ₙ)
Select the right option
a. KL-Divergence(y ‖ ŷ) = 0
b. KL-Divergence(y ‖ ŷ) = 1
c. KL-Divergence(y ‖ ŷ) = ∞
d. y and ŷ cannot be compared
Correct Answer: a
Detailed Solution:
The softmax function is translation invariant: softmax(x − c·1ₙ) = softmax(x), since the common factor e⁻ᶜ cancels in the numerator and denominator. Hence y = ŷ and the KL divergence between them is 0.
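As a quick sanity check, the invariance can be verified numerically (a plain-Python sketch, not part of the assignment):

```python
import math

def softmax(x):
    # Subtract the max for numerical stability; this itself exploits
    # the translation invariance being demonstrated.
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

x = [2.0, 1.0, 0.5]
c = 3.0
y = softmax(x)
y_hat = softmax([v - c for v in x])
# y and y_hat agree up to floating-point error, so KL(y || y_hat) = 0
```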
______________________________________________________________________________
QUESTION 2:
An RGB input image has been converted into a matrix of size 257 X 257 X 3 and a kernel/filter of
size 7 X 7 X 3 with a stride of 2 and padding = 3 is used for 2D convolution. What will be the size
of the output of convolution?
a. 129x129x1
b. 128x128x1
c. 254x254x3
d. 256x256x1
Correct Answer: a
Detailed Solution:
The size of the convolved output is C × C, where C = ((I − F + 2P)/S) + 1; here I is the size of the input matrix, F the size of the filter, P the padding applied to the input, and S the stride. With I = 257, F = 7, P = 3, and S = 2, we get C = ((257 − 7 + 6)/2) + 1 = 129. A single 7 × 7 × 3 kernel spans all three input channels, so the output depth is 1. Therefore the answer is 129 × 129 × 1.
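The formula above can be checked with a one-line helper (an illustrative sketch, assuming integer division since the sizes divide evenly here):

```python
def conv_out_size(i, f, p, s):
    """Output spatial size for one dimension: C = ((I - F + 2P) / S) + 1."""
    return (i - f + 2 * p) // s + 1

# Question 2: 257x257 input, 7x7 filter, padding 3, stride 2
print(conv_out_size(257, 7, 3, 2))  # 129
```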
______________________________________________________________________________
QUESTION 3:
What is the primary reason for adding pooling layers?
a. Promote small shift invariance
b. Reduce computations for subsequent layers
c. To produce activations that summarize filter response in local windows.
d. Both b and c
Correct Answer: d
Detailed Solution:
A pooling layer reduces the compute requirements for subsequent layers and locally summarizes the filter responses in its input. In contrast to earlier wisdom, recent research has indicated that pooling can make networks less robust to small spatial shifts (refer to BlurPool).
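Both effects in options (b) and (c) are visible in a minimal max-pooling sketch (plain Python, 2×2 window with stride 2; a toy illustration, not a library implementation):

```python
def max_pool2x2(img):
    """Max-pool a 2D list with a 2x2 window and stride 2.
    Each output value summarizes a local window; the output has
    4x fewer activations, reducing compute for subsequent layers."""
    h, w = len(img), len(img[0])
    return [[max(img[i][j], img[i][j + 1], img[i + 1][j], img[i + 1][j + 1])
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]

x = [[1, 3, 2, 0],
     [4, 2, 1, 1],
     [0, 1, 5, 6],
     [2, 2, 7, 8]]
print(max_pool2x2(x))  # [[4, 2], [2, 8]]
```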
______________________________________________________________________________
QUESTION 4:
The figure below shows an image of a face given as input to a convolutional neural network; the other three images show features extracted at different levels of the network. Identify the correct option:
a. Label 3: Low-level features, Label 2: High-level features, Label 1: Mid-level
features
b. Label 1: Low-level features, Label 3: High-level features, Label 2: Mid-level
features
c. Label 2: Low-level features, Label 1: High-level features, Label 3: Mid-level
features
d. Label 3: Low-level features, Label 1: High-level features, Label 2: Mid-level
features
Correct Answer: b
Detailed Solution:
A convolutional neural network learns low-level features such as edges and lines in its early layers, then parts of faces in the middle layers, and finally a high-level representation of a face in the deepest layers.
______________________________________________________________________________
QUESTION 5:
Suppose you have 8 convolutional kernels of size 5 × 5 with no padding and stride 1 in the first layer of a convolutional neural network. You pass an input of dimension 228 × 228 × 3 through this layer. What are the dimensions of the data the next layer will receive?
a. 224 x 224 x 3
b. 224 x 224 x 8
c. 226 x 226 x 8
d. 225 x 225 x 3
Correct Answer: b
Detailed Solution:
The layer accepts a volume of size W1 × H1 × D1 (here 228 × 228 × 3) and requires four hyperparameters: the number of filters K = 8, their spatial extent F = 5, the stride S = 1, and the padding P = 0.
It produces a volume of size W2 × H2 × D2, where:
W2 = (W1 − F + 2P)/S + 1 = (228 − 5)/1 + 1 = 224
H2 = (H1 − F + 2P)/S + 1 = (228 − 5)/1 + 1 = 224 (width and height are computed identically by symmetry)
D2 = K = 8
So the output volume is 224 × 224 × 8.
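The full output volume, including the channel dimension, can be computed with a small helper (an illustrative sketch; the filter depth is assumed to match the input depth D1, as in the question):

```python
def conv_output_shape(w1, h1, d1, k, f, s=1, p=0):
    """Output volume (W2, H2, D2) of a conv layer with K filters of
    spatial extent F, stride S, and padding P. Each filter spans the
    full input depth d1, so the output depth equals K."""
    w2 = (w1 - f + 2 * p) // s + 1
    h2 = (h1 - f + 2 * p) // s + 1
    return (w2, h2, k)

# Question 5: 228x228x3 input, 8 filters of size 5x5, stride 1, no padding
print(conv_output_shape(228, 228, 3, k=8, f=5))  # (224, 224, 8)
```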
____________________________________________________________________________
QUESTION 6:
Choose the correct statement in context of transfer learning
a. Higher layers learn task-specific features, whereas lower layers learn general features
b. Transfer learning is generally used when the task-specific dataset is very small
c. The weights of the lower layers of a pretrained CNN (trained on a dataset such as ImageNet) are copied, the higher layers are randomly (e.g., Gaussian) initialized, and the entire network is fine-tuned on the smaller dataset
d. All of the above
Correct Answer: d
Detailed Solution:
Lower layers learn more general features (e.g., edge detectors) and thus transfer well to other tasks; higher layers, on the other hand, are task-specific. Transfer learning is used in data-scarce situations.
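The recipe in option (c) can be sketched in plain Python (hypothetical layer names and toy weight lists, purely for illustration; a real implementation would use a framework such as PyTorch):

```python
import random

# Hypothetical pretrained weights, keyed by layer name (toy values).
pretrained = {"conv1": [0.10, 0.20], "conv2": [0.30, 0.40], "fc": [0.50, 0.60]}

finetune_init = {}
for name, weights in pretrained.items():
    if name.startswith("conv"):
        # Lower (convolutional) layers: copy the pretrained weights.
        finetune_init[name] = list(weights)
    else:
        # Higher (task-specific) layers: random Gaussian initialization.
        finetune_init[name] = [random.gauss(0.0, 0.01) for _ in weights]
# The entire network would then be fine-tuned on the smaller dataset.
```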
____________________________________________________________________________
QUESTION 7:
What is the advantage of ReLU over sigmoid and tanh?
a. Low computational requirements
b. Alleviates vanishing gradient to some extent
c. Backpropagation is simpler
d. All of the above
Correct Answer: d
Detailed Solution:
ReLU does not saturate for positive inputs, is cheaper to compute, and has a simpler gradient with respect to its input, which simplifies backpropagation and alleviates vanishing gradients.
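The contrast in gradients is easy to see numerically (a plain-Python sketch comparing the two derivatives at a large input):

```python
import math

def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # Gradient is exactly 1 for x > 0, so it never shrinks for active units.
    return 1.0 if x > 0 else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # s * (1 - s) vanishes as |x| grows: the saturation problem.
    s = sigmoid(x)
    return s * (1.0 - s)

print(relu_grad(10.0))     # 1.0
print(sigmoid_grad(10.0))  # tiny: sigmoid saturates at large |x|
```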
______________________________________________________________________________
QUESTION 8:
Statement 1: In transfer learning, the lower layers are more commonly transferred to another task.
Statement 2: In transfer learning, the last few layers are more commonly transferred to another task.
Which of the following option is correct?
a. Statement 1 is correct and Statement 2 is incorrect
b. Statement 1 is incorrect and Statement 2 is correct
c. Both Statement 1 and Statement 2 are correct
d. Both Statement 1 and Statement 2 are incorrect
Correct Answer: a
Detailed Solution:
Lower layers learn more general features (e.g., edge detectors) and thus transfer well to other tasks; higher layers, on the other hand, are task-specific.
______________________________________________________________________________
QUESTION 9:
Statement 1: Adding more hidden layers will solve the vanishing gradient problem for a 2-layer
neural network
Statement 2: Making the network deeper will increase the chance of vanishing gradients.
a. Statement 1 is correct
b. Statement 2 is correct
c. Neither Statement 1 nor Statement 2 is correct
d. Vanishing gradient problem is independent of number of hidden layers of the
neural network.
Correct Answer: b
Detailed Solution:
As more layers using certain activation functions are added to neural networks, the
gradients of the loss function approaches zero, making the network hard to train. Thus
statement 2 is correct.
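This can be illustrated by chaining sigmoid derivatives through several layers (a toy sketch: each layer contributes at most a factor of 0.25 to the backpropagated gradient, even at the sigmoid's best-case input x = 0):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # maximum value is 0.25, attained at x = 0

# Multiply the per-layer derivative through a 10-layer chain.
grad = 1.0
for _ in range(10):
    grad *= sigmoid_grad(0.0)  # 0.25 per layer, the best case

print(grad)  # 0.25**10, roughly 9.5e-7: the gradient has vanished
```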
____________________________________________________________________________
QUESTION 10:
How many convolution layers are there in a LeNet-5 architecture?
a. 2
b. 3
c. 4
d. 5
Correct Answer: a
Detailed Solution:
There are two convolutional layers and three fully connected layers in the LeNet-5 architecture.
______________________________________________________________________________
************END*******