Course: ELEC-E8125 - Reinforcement learning D, Lecture, 4.9.2023-29.11.2023


General

The course provides an overview of the mathematical models and algorithms behind optimal decision making in time-series systems. The course focuses on optimal decision making and control, reinforcement learning, and decision making under uncertainty.

Practical matters

Lecturer: Joni Pajarinen.

Teaching assistants (TAs): Aidan Scannell, Vivienne Wang, Wenyan Yang, Mohammadreza Nakhaei, Yuying Zhang, Wenshuai Zhao, Nikita Kostin, Yi Zhao, Taha Heidari

The reinforcement learning lectures are organized as follows:
- Location: Maarintie 8, AS1
- Time: Tuesdays 14:15-16:00 (Periods I and II). Note: the first lecture is on Monday 4.9.2023 at 12:15-14:00 in room T1 (Computer Science building).
- In-person participation is encouraged for the full lecture experience, but lectures will also be recorded and can be watched afterwards.
Grading scale: 0-5
- 7 individual assignments (60%)
- 1 project, done in groups of at most 2 students (20%)
- Quizzes, due before the lecture (20%)
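The weights above combine into a single course total. The following is a minimal sketch of that arithmetic, not the official grading procedure: the function name `weighted_total`, the component keys, and the assumption that each component is expressed as the fraction of its available points earned are all hypothetical. How the total maps to the 0-5 scale is not stated here, so the sketch stops at a percentage.

```python
# Component weights taken from the grading list above.
WEIGHTS = {"assignments": 0.60, "project": 0.20, "quizzes": 0.20}


def weighted_total(fractions):
    """Return the weighted course total in percent.

    `fractions` maps a component name to the share of that component's
    points that was earned, between 0 and 1. This normalization is an
    assumption made for illustration. Missing components count as 0.
    """
    for name, value in fractions.items():
        if name not in WEIGHTS:
            raise KeyError(f"unknown component: {name}")
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return 100.0 * sum(
        weight * fractions.get(name, 0.0) for name, weight in WEIGHTS.items()
    )


# Hypothetical student: 80% of assignment points, 90% of project points,
# 50% of quiz points.
example = weighted_total({"assignments": 0.8, "project": 0.9, "quizzes": 0.5})
print(f"{example:.1f}%")
```

For the hypothetical student in the example, the contributions are 48, 18, and 10 percentage points, for a total of 76%.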
Exercise sessions are held twice a week. Attendance is optional.
- Mondays 12:15–14:00, 11.9.–20.11.2023, Maarintie 8, TU3
- Wednesdays 10:15–12:00, 6.9.–29.11.2023, Maarintie 8, AS3 Saab Space
Please join the course Zulip to receive the latest updates and ask questions about the exercises. Use your Aalto account to register. Note that the Zulip channel is the main place where questions about the exercises are answered.
Each student has 3 days in total for late submissions.

Schedule

Lecture Schedule


Week	Lecture	Date	Reading
W36	L1 Course Overview	Mon, 4.9	no readings
W37	L2 Markov decision processes	Tue, 12.9	Sutton & Barto, Ch. 2-2.3, 2.5-2.6, 3-3.8
W38	L3 RL in discrete domains	Tue, 19.9	Sutton & Barto, Ch. 5-5.4, 5.6, 6-6.5
W39	L4 Function approximation	Tue, 26.9	Sutton & Barto, Ch. 9-9.3, 10-10.1
W40	L5 Policy gradient	Tue, 3.10	Sutton & Barto, Ch. 13-13.3
W41	L6 Actor-critic	Tue, 10.10	Sutton & Barto, Ch. 13.5, 13.7
W42	No lecture	Tue, 17.10
W43	L7 Model-based RL	Tue, 24.10	Sutton & Barto, Ch. 8-8.2
W44	L8 Interleaved learning and planning	Tue, 31.10	Sutton & Barto, Ch. 8-8.2
W45	L9 Exploration and exploitation	Tue, 7.11	1) Sutton & Barto, Ch. 2.7, 8.9-8.11 and 2) Russo, D. J., Van Roy, B., Kazerouni, A., Osband, I., & Wen, Z. (2018). A tutorial on Thompson sampling. Foundations and Trends in Machine Learning, 11(1), 1-96. https://web.stanford.edu/~bvr/pubs/TS_Tutorial_FnT.pdf Sections 2, 3, 4
W46	L10 (Guest lecture by Mohammadreza Nakhaei) Offline RL: introduction, methods, and challenges	Tue, 14.11
W47	L11 Partially observable MDPs	Tue, 21.11	1) Anthony Cassandra, POMDP tutorial, http://www.pomdp.org/tutorial/, steps from "Brief Introduction to MDPs" until "Background on POMDPs" and 2) Partially Observable Markov Decision Processes in Robotics: A Survey. https://arxiv.org/pdf/2209.10342 Sections II.A, III.B, III.C
W48	L12 (Guest lecture by Atanu Mazumdar) Multi-objective reinforcement learning	Tue, 28.11

Quiz Schedule

Quiz	Release	Deadline (always before the lecture)
Quiz 1	Sep 5	Sep 12
Quiz 2	Sep 12	Sep 19
Quiz 3	Sep 19	Sep 26
Quiz 4	Sep 26	Oct 3
Quiz 5	Oct 3	Oct 10
Quiz 6	Oct 10	Oct 24
Quiz 7	Oct 24	Nov 7
Quiz 8	Nov 7	Nov 21

Exercise & Project Schedule

Exercises & Project	Release	Deadline
Exercise 1	Sep 5	Sep 18 @23:59
Exercise 2	Sep 13	Sep 25 @23:59
Exercise 3	Sep 20	Oct 2 @23:59
Exercise 4	Sep 27	Oct 9 @23:59
Exercise 5	Oct 4	Oct 23 @23:59
Exercise 6	Oct 11	Nov 6 @23:59
Exercise 7	Oct 25	Nov 20 @23:59
Project	Oct 30	Dec 4 @23:59

Who to contact

If you need help with the exercises or the project work, post your questions in the corresponding Zulip channel or attend an exercise session. If you need to contact the TAs personally, here is the list:

Ex/Proj	TAs
Ex1	Wenyan, Mohammadreza
Ex2	Wenshuai, Yi
Ex3	Nikita, Vivienne
Ex4	Wenyan, Yuying
Ex5	Nikita, Vivienne
Ex6	Wenshuai, Yuying
Ex7	Mohammadreza, Nikita
Proj	Taha, Wenshuai, Wenyan

If you have other questions (for example, about military service), you can contact Prof. Joni Pajarinen directly.
