ELEC-E8125 - Reinforcement learning D, Lecture, 4.9.2023-29.11.2023
This course space end date is set to 29.11.2023.
The course provides an overview of the mathematical models and algorithms behind optimal decision making in time-series systems. It focuses on optimal decision making and control, reinforcement learning, and decision making under uncertainty.
Practical matters
Lecturer: Joni Pajarinen.
Teaching assistants (TAs): Aidan Scannell, Vivienne Wang, Wenyan Yang, Mohammadreza Nakhaei, Yuying Zhang, Wenshuai Zhao, Nikita Kostin, Yi Zhao, Taha Heidari
- The reinforcement learning lecture will be organized as follows
- Location: Maarintie 8, AS1
- Time: Tuesdays 14:15-16:00 (Periods I and II). Note: the first lecture is on Monday 4.9.2023 at 12:15-14:00 in room T1 (Computer Science building).
- Although in-person participation is encouraged for the full lecture experience, lectures will also be recorded and can be watched afterwards.
- Grading Scale: 0-5
- 7 individual assignments (60%)
- 1 project work, in groups (max. 2 students) (20%)
- Quizzes (due before lecture) (20%)
- Exercise sessions will be given twice a week. Attendance is optional.
- Mondays 12:15–14:00, 11.9.–20.11.2023, Maarintie 8, TU3
- Wednesdays 10:15–12:00, 6.9.–29.11.2023, Maarintie 8, AS3 Saab Space
- Please join via the Zulip link to receive the latest updates and ask questions about the exercises. Please use your Aalto account to register on Zulip. Note that the Zulip channel will be the main place where we answer questions about the exercises.
- Each student has 3 days in total for late submissions.
Schedule
Lecture Schedule
| Week | Lecture | Lecture date | Reading |
|---|---|---|---|
| W36 | L1 Course Overview | Mon, 4.9 | no readings |
| W37 | L2 Markov decision processes | Tue, 12.9 | Sutton & Barto, Ch. 2-2.3, 2.5-2.6, 3-3.8 |
| W38 | L3 RL in discrete domains | Tue, 19.9 | Sutton & Barto, Ch. 5-5.4, 5.6, 6-6.5 |
| W39 | L4 Function approximation | Tue, 26.9 | Sutton & Barto, Ch. 9-9.3, 10-10.1 |
| W40 | L5 Policy gradient | Tue, 3.10 | Sutton & Barto, Ch. 13-13.3 |
| W41 | L6 Actor-critic | Tue, 10.10 | Sutton & Barto, Ch. 13.5, 13.7 |
| W42 | No Lecture | Tue, 17.10 | |
| W43 | L7 Model-based RL | Tue, 24.10 | Sutton & Barto, Ch. 8-8.2 |
| W44 | L8 Interleaved learning and planning | Tue, 31.10 | Sutton & Barto, Ch. 8-8.2 |
| W45 | L9 Exploration and exploitation | Tue, 7.11 | 1) Sutton & Barto, Ch. 2.7, 8.9-8.11 and 2) Russo, D. J., Van Roy, B., Kazerouni, A., Osband, I., & Wen, Z. (2018). A tutorial on Thompson sampling. Foundations and Trends in Machine Learning, 11(1), 1-96. https://web.stanford.edu/~bvr/pubs/TS_Tutorial_FnT.pdf Sections 2, 3, 4 |
| W46 | L10 (Guest lecture by Mohammadreza Nakhaei) Offline RL, introduction, methods, and challenges | Tue, 14.11 | |
| W47 | L11 Partially observable MDPs | Tue, 21.11 | 1) Anthony Cassandra, POMDP tutorial, http://www.pomdp.org/tutorial/, steps from "Brief Introduction to MDPs" until "Background on POMDPs" and 2) Partially Observable Markov Decision Processes in Robotics: A Survey. https://arxiv.org/pdf/2209.10342 Sections II.A, III.B, III.C |
| W48 | L12 (Guest lecture by Atanu Mazumdar) Multi-objective Reinforcement Learning | Tue, 28.11 | |

Quiz Schedule
| Quiz | Release | Deadline (always before the lecture) |
|---|---|---|
| Quiz 1 | Sep 5 | Sep 12 |
| Quiz 2 | Sep 12 | Sep 19 |
| Quiz 3 | Sep 19 | Sep 26 |
| Quiz 4 | Sep 26 | Oct 3 |
| Quiz 5 | Oct 3 | Oct 10 |
| Quiz 6 | Oct 10 | Oct 24 |
| Quiz 7 | Oct 24 | Nov 7 |
| Quiz 8 | Nov 7 | Nov 21 |

Exercise & Project Schedule
| Exercises & Project | Release | Deadline |
|---|---|---|
| Exercise 1 | Sep 5 | Sep 18 @23:59 |
| Exercise 2 | Sep 13 | Sep 25 @23:59 |
| Exercise 3 | Sep 20 | Oct 2 @23:59 |
| Exercise 4 | Sep 27 | Oct 9 @23:59 |
| Exercise 5 | Oct 4 | Oct 23 @23:59 |
| Exercise 6 | Oct 11 | Nov 6 @23:59 |
| Exercise 7 | Oct 25 | Nov 20 @23:59 |
| Project | Oct 30 | Dec 4 @23:59 |

Who to contact
Usually, if you need help with the exercises or project work, you can post your questions in the corresponding Zulip channel or attend an exercise session. If you need to contact the TAs directly, here is the list:
| Ex/Proj | TAs |
|---|---|
| Ex1 | Wenyan, Mohammadreza |
| Ex2 | Wenshuai, Yi |
| Ex3 | Nikita, Vivienne |
| Ex4 | Wenyan, Yuying |
| Ex5 | Nikita, Vivienne |
| Ex6 | Wenshuai, Yuying |
| Ex7 | Mohammadreza, Nikita |
| Proj | Taha, Wenshuai, Wenyan |

If you have other questions (such as military service, etc.), you can directly contact Prof. Joni Pajarinen.