NPTEL » Reinforcement Learning
Week 11 : Assignment 11
Your last recorded submission was on 2024-04-02, 23:56 IST. Due date: 2024-04-10, 23:59 IST.
1) Which of the following options is correct for sub-task terminations in the MAXQ framework? 1 point
The termination is stochastic
The termination is deterministic
2) In MAXQ learning, we have a collection of SMDPs. In a conventional value function, the only argument is the state. In the MAXQ value function decomposition, the value function has the form $V^{\pi}(i, s)$, where $\pi$ is the policy and $s$ is the current state. What is $i$ supposed to be in this notation? 1 point
The number of times we have visited state s
It means it is the ith iteration of updates
i is the identity of the sub-task/SMDP.
None of the above.
Comprehensive model for questions 3 to 6
Consider the following taxi-world problem. The grey cells are inaccessible and can be thought of as obstacles. The corner cells marked R, G, B, Y are the allowed pickup and drop-off points for passengers.
Say the following is the call graph for the above taxi-world problem.
3) From the below list of actions: 1 point
i. Left
ii. Drop off
iii. Navigate
iv. put-down
Which among them are the primitive actions?
i, ii, iii, iv
ii, iii
i, iv
None of the above
4) From the discussion in the class, it is said that Navigate is not a single sub-task. What is the parameter 't' in 'Navigate(t)' from the class discussions? 1 point
the number of times 'Pick up' or 'Drop off' have called sub-task Navigate
the maximum number of primitive actions permitted to finish sub-task
the destination (in this case, one of R, G, B, Y)
None of the above
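As background for the call-graph questions, a MAXQ call graph can be sketched as a mapping from each composite sub-task to the children it may invoke, with primitive actions as the leaves. The structure below is an assumption modeled on the commonly used taxi decomposition, not the course's figure: the composite names "Pick up" and "Drop off" follow the wording of this assignment, while the primitive names other than "left" and "put-down", the helper functions, and the exact edges are hypothetical.

```python
# Hypothetical call graph for a taxi-style task hierarchy.
# Assumption: the structure mirrors the commonly used taxi decomposition;
# the course's actual figure may differ.
TARGETS = ("R", "G", "B", "Y")
MOVES = ["up", "down", "left", "right"]  # assumed primitive moves


def navigate(t):
    """Name of the parameterized Navigate sub-task bound to target t."""
    return f"Navigate({t})"


# Composite sub-task -> children it is allowed to invoke.
CALL_GRAPH = {
    "Root": ["Pick up", "Drop off"],
    "Pick up": ["pick-up"] + [navigate(t) for t in TARGETS],
    "Drop off": ["put-down"] + [navigate(t) for t in TARGETS],
}
for t in TARGETS:
    CALL_GRAPH[navigate(t)] = list(MOVES)


def is_primitive(task):
    """A task is primitive if it has no children in the call graph."""
    return task not in CALL_GRAPH


def leaves(task):
    """All primitive actions reachable from a task."""
    if is_primitive(task):
        return {task}
    found = set()
    for child in CALL_GRAPH[task]:
        found |= leaves(child)
    return found
```

In this sketch a parameterized sub-task expands into one entry per parameter value, so `Navigate(R)` and `Navigate(G)` are distinct nodes that share the same children.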
5) State True/False. The ordering of the above call-graph is important and sub-tasks should be performed via these orderings. 1 point
True
False
6) Suppose the passenger is always either inside the taxi or at one of the four pickup/dropoff locations. That means there are 5 states for the passenger's location. Then for the given taxi-world, what is the number of states that suffices to define all information? 1 point
18
18 × 5
18 × 5 × 4
None of the above
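The size of a factored state space is the product of the sizes of its components, which can be checked by enumeration. The sketch below assumes a state is the triple (taxi cell, passenger location, destination) and that the grid has 18 accessible cells; the cell count is read off the answer options, since the grid figure is not reproduced here, and all variable names are hypothetical.

```python
from itertools import product

# Assumption: 18 accessible taxi cells, inferred from the answer options.
taxi_cells = range(18)
# Passenger is at one of the four corner points or inside the taxi.
passenger_locations = ["R", "G", "B", "Y", "in_taxi"]
# Assumption: the destination is one of the four corner points.
destinations = ["R", "G", "B", "Y"]

# Enumerate every (taxi cell, passenger location, destination) triple.
states = list(product(taxi_cells, passenger_locations, destinations))
num_states = len(states)
```

Dropping a component from the triple divides the count by that component's size, which is a quick way to compare candidate state definitions.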
7) State True/False. Bottlenecks are useful surrogate measures for option discovery. 1 point
True
False
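To make the notion of a bottleneck concrete, here is a minimal sketch that finds one in a hypothetical two-room grid by counting how often each cell lies in the interior of shortest paths between the rooms. The grid layout, the BFS-based counting heuristic, and all names are assumptions chosen for illustration; they are not taken from the course material.

```python
from collections import Counter, deque

# Hypothetical 3x7 grid: column 3 is a wall except for a single doorway.
ROWS, COLS = 3, 7
DOOR = (1, 3)
WALLS = {(r, 3) for r in range(ROWS)} - {DOOR}


def neighbors(cell):
    """Yield the free cells adjacent to cell (4-connected)."""
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nxt = (r + dr, c + dc)
        if 0 <= nxt[0] < ROWS and 0 <= nxt[1] < COLS and nxt not in WALLS:
            yield nxt


def shortest_path(start, goal):
    """Breadth-first search; returns the list of cells from start to goal."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        for nxt in neighbors(cell):
            if nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    path = []
    cell = goal
    while cell is not None:
        path.append(cell)
        cell = parent[cell]
    return path[::-1]


left_room = [(r, c) for r in range(ROWS) for c in range(3)]
right_room = [(r, c) for r in range(ROWS) for c in range(4, COLS)]

# Count interior cells only, so start and goal cells do not inflate the tally.
visits = Counter()
for start in left_room:
    for goal in right_room:
        visits.update(shortest_path(start, goal)[1:-1])

bottleneck, count = visits.most_common(1)[0]
```

Every path between the rooms must cross the doorway, so it collects the highest count; a state like this is a natural candidate sub-goal for an option.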
8) Which of the following can be considered as a good option in Hierarchical RL? 1 point
An option that can be reused often
An option that can cut down exploration
An option that helps in transfer learning
None of the above
9) We define the action value for MAXQ as $q^{\pi}(i, s, a) = v^{\pi}(a, s) + C^{\pi}(i, s, a)$, where $q^{\pi}(i, s, a)$ can be interpreted as the expected return when you are in sub-task $i$ and state $s$, and you decide to perform sub-task $a$. Assume that in taking $a$ you get reward $r_1$, and after completion of $a$ you get reward $r_2$ in completing sub-task $i$. Choose the correct value of $C^{\pi}(i, s, a)$ from the following. 1 point
$C^{\pi}(i, s, a) = r_2$
$C^{\pi}(i, s, a) = r_1 + r_2$
$C^{\pi}(i, s, a) = r_1$
None of the above
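The decomposition q(i, s, a) = v(a, s) + C(i, s, a) can be evaluated recursively: the value of a primitive action is its expected immediate reward, and the value of a composite sub-task comes from the q-values of its children. The sketch below assumes a greedy policy inside each sub-task (so v takes a max over children) and uses a tiny hypothetical hierarchy with made-up numbers; the table and function names are illustrative only.

```python
# Hypothetical two-level hierarchy with illustrative numbers.
CHILDREN = {"root": ["left", "right"]}  # composite sub-task -> children

# v(a, s) for primitive actions: expected immediate reward.
PRIMITIVE_VALUE = {("left", "s0"): -1.0, ("right", "s0"): -1.0}

# C(i, s, a): expected return for finishing sub-task i after child a completes.
COMPLETION = {("root", "s0", "left"): -3.0, ("root", "s0", "right"): -1.0}


def v(task, state):
    """Value of executing task from state (greedy policy assumed)."""
    if task not in CHILDREN:  # primitive action
        return PRIMITIVE_VALUE[(task, state)]
    return max(q(task, state, a) for a in CHILDREN[task])


def q(task, state, action):
    """q(i, s, a) = v(a, s) + C(i, s, a)."""
    return v(action, state) + COMPLETION[(task, state, action)]
```

With these numbers, `q("root", "s0", "left")` is -1.0 + -3.0 and `q("root", "s0", "right")` is -1.0 + -1.0, so `v("root", "s0")` picks the larger of the two.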
10) In the MAXQ approach to solving a problem, suppose that sub-task $M_i$ invokes sub-task $M_j$. Do the pseudo rewards of $M_j$ have any effect on sub-task $M_i$? 1 point
Yes
No
You may submit any number of times before the due date. The final submission will be considered for
grading.