Topic outline

  • Learning outcomes

    The seminar introduces students to selected deep reinforcement learning algorithms. At the end of the seminar, students will be able to:

    • Understand advanced reinforcement learning algorithms.
    • Understand the application areas.
    • Select a preferred algorithm for solving reinforcement learning problems.


    • Basic knowledge of reinforcement learning
    • Familiarity with supervised learning methods
    • Familiarity with basic matrix algebra and optimization algorithms
    • Familiarity with deep neural networks
    • Familiarity with the fundamentals of control theory
    • Familiarity with Python and its libraries


    Lectures and presentations are held via Zoom every Wednesday from 10:00 to 12:00. The link is the same for each seminar:


    • The grading scale is pass/fail. 
    • Participation in every seminar is compulsory. 
    • A one-page summary (250-300 words) of each session is required.
    • Each student gives one presentation, based either on one of the listed papers or on a related topic of their choice.
    • Each student acts as an opponent in at least one session.
    • Assessment is based on technical correctness, writing quality, and language.


    • Teaching: 12 hours
    • Independent study: 15 hours
    • Presentation preparation, including reading 15 pages: 30-40 hours
    • Written work: 15 hours (5 sessions x ~3 hours)

    Listed papers

    1. DQN [Mnih et al., 2013] 
    2. Double DQN [Van Hasselt et al., 2016]
    3. PER [Schaul et al., 2015]
    4. QT-Opt [Kalashnikov et al., 2018]
    5. AlphaGo [Silver et al., 2016]
    6. TRPO [Schulman et al., 2015]
    7. PPO [Schulman et al., 2017]
    8. Deep Dyna-Q [Peng et al., 2018]
    9. SAC [Haarnoja et al., 2018]
    10. DDPG [Lillicrap et al., 2015]
    11. I2A [Racanière et al., 2017]
    12. Inverse RL [Choi and Kim, 2012]
    13. MBPO [Janner et al., 2019]
    14. CQL [Kumar et al., 2020]

    Sign up for a topic and access the reading resources: Link