Week 4: Assignment 4
Your last recorded submission was on 2025-08-19, 23:22 IST. Due date: 2025-08-20, 23:59 IST.
1) State True/False (1 point)
The state transition graph for any MDP is a directed acyclic graph.
True
False
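As a small illustration of what a state transition graph is, here is a minimal sketch with a made-up two-state MDP (the states and edges are purely illustrative, not from the course) and a check for the simplest kinds of cycles in its graph:

```python
# Minimal sketch: the state transition graph of a toy two-state MDP,
# represented as directed edges (s, s') with nonzero transition probability.
# The MDP itself is hypothetical, chosen only to illustrate the graph structure.
edges = {
    ("s0", "s1"),  # some action in s0 can lead to s1
    ("s1", "s0"),  # some action in s1 can lead back to s0
    ("s1", "s1"),  # a self-loop: s1 can transition to itself
}

def has_simple_cycle(edge_set):
    """Detect two-cycles or self-loops in a directed graph given as an edge set."""
    return any((b, a) in edge_set for (a, b) in edge_set) or any(a == b for (a, b) in edge_set)

print(has_simple_cycle(edges))
```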
2) Consider the following statements: (1 point)
(i) The optimal policy of an MDP is unique.
(ii) We can determine an optimal policy for an MDP using only the optimal value function ($v^*$), without accessing the MDP parameters.
(iii) We can determine an optimal policy for a given MDP using only the optimal q-value function ($q^*$), without accessing the MDP parameters.
Which of these statements are false?
Only (ii)
Only (iii)
Only (i), (ii)
Only (i), (iii)
Only (ii), (iii)
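Relating to Q2, here is a minimal sketch (with made-up numbers; `P`, `r`, and `q_star` are hypothetical toy arrays) of how a greedy policy is typically read off from $q^*$ versus from $v^*$, showing which quantities each extraction actually touches:

```python
import numpy as np

# Hypothetical toy MDP quantities, for illustration only.
n_states, n_actions, gamma = 2, 2, 0.9
P = np.array([[[0.8, 0.2], [0.1, 0.9]],      # P[s, a, s'] transition probabilities
              [[0.5, 0.5], [0.9, 0.1]]])
r = np.array([[1.0, 0.0], [0.0, 2.0]])       # r[s, a] expected immediate rewards
q_star = np.array([[5.0, 4.0], [3.0, 6.0]])  # assumed optimal q-values (made up)
v_star = q_star.max(axis=1)                  # a consistent v* for this sketch

# From q*: the greedy action needs only q* itself.
policy_from_q = q_star.argmax(axis=1)

# From v*: the one-step lookahead also needs the model (P and r).
policy_from_v = (r + gamma * P @ v_star).argmax(axis=1)

print(policy_from_q, policy_from_v)
```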
3) Which of the following statements are true for a finite MDP? (Select all that apply.) (1 point)
The Bellman equation of a value function of a finite MDP defines a contraction in Banach space (using the max norm).
If $0 \le \gamma < 1$, then the eigenvalues of $\gamma P_\pi$ are less than $1$.
We call a normed vector space 'complete' if Cauchy sequences exist in that vector space.
The sequence defined by $v_n = r_\pi + \gamma P_\pi v_{n-1}$ is a Cauchy sequence in Banach space (using the max norm).
($P_\pi$ is a stochastic matrix)
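Relating to Q3, here is a minimal numerical sketch (using a randomly generated row-stochastic matrix and reward vector as stand-ins for $P_\pi$ and $r_\pi$) of iterating $v_n = r_\pi + \gamma P_\pi v_{n-1}$ and watching successive max-norm gaps shrink, which is the behaviour a contraction in the max norm would produce:

```python
import numpy as np

rng = np.random.default_rng(0)
n, gamma = 4, 0.9

# Random row-stochastic P_pi and reward vector r_pi (hypothetical, for illustration).
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)
r = rng.random(n)

v = np.zeros(n)
prev_gap = None
for _ in range(10):
    v_next = r + gamma * P @ v            # one application of the Bellman operator for pi
    gap = np.max(np.abs(v_next - v))      # max-norm distance between successive iterates
    if prev_gap is not None:
        print(f"gap ratio ~ {gap / prev_gap:.3f} (compare with gamma = {gamma})")
    prev_gap, v = gap, v_next
```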
4) Which of the following is a benefit of using RL algorithms for solving MDPs? (1 point)
They do not require the state of the agent for solving an MDP.
They do not require the action taken by the agent for solving an MDP.
They do not require the state transition probability matrix for solving an MDP.
They do not require the reward signal for solving an MDP.
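Relating to Q4, here is a minimal sketch of a single tabular Q-learning-style update (the step size, discount factor, and the observed transition are made-up values), shown only to illustrate which quantities a typical model-free RL update actually uses: the observed state, action, reward, and next state, but not the transition probability matrix:

```python
import numpy as np

n_states, n_actions = 3, 2
Q = np.zeros((n_states, n_actions))   # tabular action-value estimates
alpha, gamma = 0.1, 0.9               # step size and discount factor (assumed values)

# One observed transition (s, a, r, s'); in a real run this comes from the environment.
s, a, r, s_next = 0, 1, 1.0, 2

# The update uses only the sampled experience, not p(s'|s, a).
Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
print(Q)
```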
5) Consider the following equations: (1 point)
(i) $v_\pi(s) = \mathbb{E}_\pi\left[\sum_{i=t}^{\infty} \gamma^{i-t} R_{i+1} \mid S_t = s\right]$
(ii) $q_\pi(s, a) = \sum_{s'} p(s' \mid s, a)\, v_\pi(s')$
(iii) $v_\pi(s) = \sum_a \pi(a \mid s)\, q_\pi(s, a)$
Which of the above are correct?
Only (i)
Only (i), (ii)
Only (ii), (iii)
Only (i), (iii)
(i), (ii), (iii)
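For reference when reading the equations in Q5, these are the standard textbook definitions of the state-value and action-value functions under a policy, stated in the usual notation (a reminder of known facts, not part of the question):

$$v_\pi(s) = \mathbb{E}_\pi\!\left[\sum_{i=t}^{\infty} \gamma^{i-t} R_{i+1} \,\middle|\, S_t = s\right], \qquad q_\pi(s, a) = r(s, a) + \gamma \sum_{s'} p(s' \mid s, a)\, v_\pi(s'), \qquad v_\pi(s) = \sum_a \pi(a \mid s)\, q_\pi(s, a)$$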
6) What is true about the $\gamma$ (discount factor) in reinforcement learning? (1 point)
Discount factor can be any real number
The value of $\gamma$ cannot affect the optimal policy
The lower the value of $\gamma$, the more myopic the agent gets, i.e., the agent maximises rewards that it receives over a shorter horizon
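Relating to Q6, here is a tiny numerical sketch (with a made-up reward sequence) of how the discount factor weights future rewards; with a small $\gamma$, almost all of the discounted return comes from the first few steps:

```python
# Made-up reward sequence: a small immediate reward, then a large delayed one.
rewards = [1, 0, 0, 0, 10]

def discounted_return(rewards, gamma):
    return sum((gamma ** i) * r for i, r in enumerate(rewards))

for gamma in (0.1, 0.5, 0.99):
    print(gamma, round(discounted_return(rewards, gamma), 3))
# Small gamma: the return is dominated by the immediate reward (myopic behaviour);
# gamma close to 1: the delayed reward of 10 dominates.
```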
7) Consider the following statements for a finite MDP ($I$ is an identity matrix with dimensions $|S| \times |S|$, where $S$ is the set of all states, and $P_\pi$ is a stochastic matrix): (1 point)
(i) An MDP with stochastic rewards may not have a deterministic optimal policy.
(ii) There can be multiple optimal stochastic policies.
(iii) If $0 \le \gamma < 1$, then the rank of the matrix $I - \gamma P_\pi$ is equal to $|S|$.
(iv) If $0 \le \gamma < 1$, then the rank of the matrix $I - \gamma P_\pi$ is less than $|S|$.
Which of the above statements are true?
Only (ii), (iii)
Only (ii), (iv)
Only (i), (iii)
Only (i), (ii), (iii)
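Relating to statements (iii) and (iv) of Q7, here is a minimal numerical sketch that checks the rank of $I - \gamma P_\pi$ for a randomly generated row-stochastic matrix with $0 \le \gamma < 1$ (a single random instance, illustrative only, not a proof):

```python
import numpy as np

rng = np.random.default_rng(1)
n, gamma = 5, 0.9

# Random row-stochastic matrix standing in for P_pi (hypothetical).
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)

M = np.eye(n) - gamma * P
print(np.linalg.matrix_rank(M), "of", n)   # numerical rank of I - gamma * P_pi
```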
8) Consider an MDP with 3 states $A, B, C$. At each state we can go to either of the other two states, i.e., if we are in state $A$ then we can perform 2 actions: going to state $B$ or $C$. The rewards for each transition are $r(A, B) = -3$ (reward if we go from $A$ to $B$), $r(B, A) = -1$, $r(B, C) = 8$, $r(C, B) = 4$, $r(A, C) = 0$, $r(C, A) = 5$; the discount factor is $0.9$. Find the fixed point of the value function for the policy $\pi(A) = B$ (if we are in state $A$ we choose the action to go to $B$), $\pi(B) = C$, $\pi(C) = A$. $v_\pi([A, B, C]) = ?$ (round to 1 decimal place) (1 point)
[20.6, 21.8, 17.6]
[30.4, 44.2, 32.4]
[30.4, 37.2, 32.4]
[21.6, 21.8, 17.6]
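Relating to Q8, here is a minimal sketch of the standard way to obtain the fixed point of policy evaluation: solve $v_\pi = r_\pi + \gamma P_\pi v_\pi$, i.e. $(I - \gamma P_\pi) v_\pi = r_\pi$, with the deterministic policy from the question encoded as a transition matrix:

```python
import numpy as np

gamma = 0.9
states = ["A", "B", "C"]

# Under the deterministic policy pi(A)=B, pi(B)=C, pi(C)=A, each state moves to
# exactly one next state, so P_pi is a permutation (stochastic) matrix.
P_pi = np.array([[0, 1, 0],    # A -> B
                 [0, 0, 1],    # B -> C
                 [1, 0, 0]])   # C -> A
r_pi = np.array([-3.0, 8.0, 5.0])   # r(A,B), r(B,C), r(C,A) under this policy

# Fixed point of v = r_pi + gamma * P_pi v, via the linear system (I - gamma P_pi) v = r_pi.
v = np.linalg.solve(np.eye(3) - gamma * P_pi, r_pi)
print(dict(zip(states, np.round(v, 1))))
```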
9) Which of the following is not a valid norm function? ($x$ is a $D$-dimensional vector) (1 point)
$\max_{d \in \{1, \ldots, D\}} |x_d|$
$\sqrt{\sum_{d=1}^{D} x_d^2}$
$\min_{d \in \{1, \ldots, D\}} |x_d|$
$\sum_{d=1}^{D} |x_d|$
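Relating to Q9, here is a minimal sketch that evaluates the four candidate functions on a made-up test vector and checks the positive-definiteness axiom ($\|x\| = 0$ only if $x = 0$); the other norm axioms could be checked the same way:

```python
import numpy as np

candidates = {
    "max |x_d|": lambda x: np.max(np.abs(x)),
    "sqrt(sum x_d^2)": lambda x: np.sqrt(np.sum(x ** 2)),
    "min |x_d|": lambda x: np.min(np.abs(x)),
    "sum |x_d|": lambda x: np.sum(np.abs(x)),
}

x = np.array([1.0, 0.0, 0.0])   # a nonzero test vector with a zero coordinate
for name, f in candidates.items():
    # A norm must be strictly positive on any nonzero vector.
    print(f"{name}: value on x = {f(x)}, positive on this nonzero x: {f(x) > 0}")
```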
10) For an operator $L$, which of the following properties must be satisfied by $x$ for it to be a fixed point of $L$? (Multi-Correct) (1 point)
$Lx = x$
$L^2 x = x$
$\forall \lambda > 0,\ Lx = \lambda x$
None of the above
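Relating to Q10, here is a tiny sketch with a concrete affine operator (an arbitrary illustrative choice, not from the course) showing that at a point $x$ with $Lx = x$, applying the operator twice also returns $x$:

```python
import numpy as np

# An arbitrary affine operator L(x) = A x + b with a unique fixed point (illustrative choice).
A = np.array([[0.5, 0.1],
              [0.0, 0.3]])
b = np.array([1.0, 2.0])

def L(x):
    return A @ x + b

# Its fixed point solves x = A x + b, i.e. (I - A) x = b.
x_star = np.linalg.solve(np.eye(2) - A, b)

print(np.allclose(L(x_star), x_star))        # L x = x
print(np.allclose(L(L(x_star)), x_star))     # hence L^2 x = x as well
```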
You may submit any number of times before the due date. The final submission will be considered for grading.