Topic outline

  • The course provides an overview of the mathematical models and algorithms behind optimal decision making in time-series systems. It focuses on optimal decision making and control, reinforcement learning, and decision making under uncertainty.

    Practical matters

    Lecturer: Joni Pajarinen.

    Teaching assistants (TAs): Aidan Scannell, Vivienne Wang, Wenyan Yang, Mohammadreza Nakhaei, Yuying Zhang, Wenshuai Zhao, Nikita Kostin, Yi Zhao, Taha Heidari.

    • The reinforcement learning lectures are organized as follows:
      • Location: Maarintie 8, AS1
      • Time: Tuesdays 14:15–16:00 (Periods I and II). Note: the first lecture is on Monday 4.9.2023 at 12:15–14:00 in room T1 (Computer Science building).
      • In-person participation is encouraged for the full lecture experience, but lectures will also be recorded and can be watched afterwards.
    • Grading scale: 0-5
      • Individual assignments (60%)
      • One project, done in groups of at most 2 students (20%)
      • Quizzes, due before the lecture (20%)
    • Exercise sessions are held twice a week. Attendance is optional.
      • Mondays 12:15–14:00, 11.9.–20.11.2023, Maarintie 8, TU3
      • Wednesdays 10:15–12:00, 6.9.–29.11.2023, Maarintie 8, AS3 Saab Space
    • Please join the course Zulip to receive the latest updates and ask questions about the exercises. Use your Aalto account to register. Note that the Zulip channel is the main place where questions about the exercises are answered.
    • Each student has a total of 3 days for late submissions.


    Schedule

    Lecture Schedule

    Week | Lecture | Date | Reading
    W36 | L1 Course overview | Mon 4.9 | No readings
    W37 | L2 Markov decision processes | Tue 12.9 | Sutton & Barto, Ch. 2-2.3, 2.5-2.6, 3-3.8
    W38 | L3 RL in discrete domains | Tue 19.9 | Sutton & Barto, Ch. 5-5.4, 5.6, 6-6.5
    W39 | L4 Function approximation | Tue 26.9 | Sutton & Barto, Ch. 9-9.3, 10-10.1
    W40 | L5 Policy gradient | Tue 3.10 | Sutton & Barto, Ch. 13-13.3
    W41 | L6 Actor-critic | Tue 10.10 | Sutton & Barto, Ch. 13.5, 13.7
    W42 | No lecture | Tue 17.10 |
    W43 | L7 Model-based RL | Tue 24.10 | Sutton & Barto, Ch. 8-8.2
    W44 | L8 Interleaved learning and planning | Tue 31.10 | Sutton & Barto, Ch. 8-8.2
    W45 | L9 Exploration and exploitation | Tue 7.11 | 1) Sutton & Barto, Ch. 2.7, 8.9-8.11; 2) Russo, D. J., Van Roy, B., Kazerouni, A., Osband, I., & Wen, Z. (2018). A tutorial on Thompson sampling. Foundations and Trends in Machine Learning, 11(1), 1-96. https://web.stanford.edu/~bvr/pubs/TS_Tutorial_FnT.pdf, Sections 2, 3, 4
    W46 | L10 Offline RL: introduction, methods, and challenges (guest lecture by Mohammadreza Nakhaei) | Tue 14.11 |
    W47 | L11 Partially observable MDPs | Tue 21.11 | 1) Anthony Cassandra, POMDP tutorial, http://www.pomdp.org/tutorial/, steps from "Brief Introduction to MDPs" until "Background on POMDPs"; 2) Partially Observable Markov Decision Processes in Robotics: A Survey, https://arxiv.org/pdf/2209.10342, Sections II.A, III.B, III.C
    W48 | L12 Multi-objective reinforcement learning (guest lecture by Atanu Mazumdar) | Tue 28.11 |

    Quiz Schedule

    Quiz | Release | Deadline (always before the lecture)
    Quiz 1 | Sep 5 | Sep 12
    Quiz 2 | Sep 12 | Sep 19
    Quiz 3 | Sep 19 | Sep 26
    Quiz 4 | Sep 26 | Oct 3
    Quiz 5 | Oct 3 | Oct 10
    Quiz 6 | Oct 10 | Oct 24
    Quiz 7 | Oct 24 | Nov 7
    Quiz 8 | Nov 7 | Nov 21


    Exercise & Project Schedule

    Exercises & Project | Release | Deadline
    Exercise 1 | Sep 5 | Sep 18 @23:59
    Exercise 2 | Sep 13 | Sep 25 @23:59
    Exercise 3 | Sep 20 | Oct 2 @23:59
    Exercise 4 | Sep 27 | Oct 9 @23:59
    Exercise 5 | Oct 4 | Oct 23 @23:59
    Exercise 6 | Oct 11 | Nov 6 @23:59
    Exercise 7 | Oct 25 | Nov 20 @23:59
    Project | Oct 30 | Dec 4 @23:59

    Who to contact

    If you need help with the exercises or project work, post your questions in the corresponding Zulip channel or attend an exercise session. If you need to contact the TAs directly, the TAs responsible for each exercise and the project are listed below:

    Ex/Proj | TAs
    Ex1 | Wenyan, Mohammadreza
    Ex2 | Wenshuai, Yi
    Ex3 | Nikita, Vivienne
    Ex4 | Wenyan, Yuying
    Ex5 | Nikita, Vivienne
    Ex6 | Wenshuai, Yuying
    Ex7 | Mohammadreza, Nikita
    Proj | Taha, Wenshuai, Wenyan

    For other questions (for example, about military service), contact Prof. Joni Pajarinen directly.