ELEC-E8125 - Reinforcement learning D, Lecture, 13.9.2021-8.12.2021
This course space end date is set to 08.12.2021.
Topic outline
Overview
The course has a final project in which you apply the knowledge gathered throughout the course to a specific problem.
Course Project
In the project work, we will implement and apply some more advanced RL algorithms to continuous control tasks. The project work consists of two parts.
First, two widely used reinforcement learning algorithms, TD3 (https://arxiv.org/pdf/1802.09477.pdf) and PPO (https://arxiv.org/abs/1707.06347), will be implemented. For this part, we will provide base code so you can get started easily. After finishing this part, you can train a policy to balance a cart pole and to control a HalfCheetah so that it runs forward.
In the second part, you need to read some research papers and implement their proposed algorithms based on the code finished in Part I. The candidate algorithms in Part II include:
- MBPO (https://arxiv.org/abs/1906.08253): How to use the learned dynamics model to improve sample efficiency?
- REDQ (https://arxiv.org/abs/2101.05982): How to significantly improve sample efficiency of model-free RL with ensembles?
- TD3_BC (https://arxiv.org/abs/2106.06860): An offline RL method that learns a policy without interacting with the environment.
- SAC (https://arxiv.org/abs/1801.01290, https://arxiv.org/abs/1805.00909): These two papers offer you a view of how to treat RL as probabilistic inference.
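Several of the algorithms above (TD3, and TD3_BC and REDQ which build on related ideas) compute their critic target from the minimum over more than one Q-value estimate. The following is a minimal sketch of that idea for the two-critic case, assuming NumPy; the function names and default hyperparameters are illustrative assumptions and are not taken from the course base code.

```python
import numpy as np


def clipped_double_q_target(reward, done, next_q1, next_q2, gamma=0.99):
    """TD target that uses the smaller of two critic estimates.

    All inputs are arrays of shape (batch,). `done` is 1.0 where the
    episode terminated, so no bootstrapping happens for those samples.
    """
    reward = np.asarray(reward, dtype=np.float64)
    done = np.asarray(done, dtype=np.float64)
    # Taking the element-wise minimum counteracts overestimation bias.
    min_next_q = np.minimum(next_q1, next_q2)
    return reward + gamma * (1.0 - done) * min_next_q


def smoothed_target_action(action, noise_std=0.2, noise_clip=0.5,
                           max_action=1.0, rng=None):
    """Add clipped Gaussian noise to a target action (target smoothing).

    The default values here are assumptions chosen for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, noise_std, size=np.shape(action))
    noise = np.clip(noise, -noise_clip, noise_clip)
    # Keep the perturbed action inside the valid action range.
    return np.clip(np.asarray(action) + noise, -max_action, max_action)
```

In a full implementation, `next_q1` and `next_q2` would come from two target critic networks evaluated at the smoothed target action; here they are plain arrays so the arithmetic can be checked in isolation.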
You can choose one of them according to your preference, study the paper, and implement the algorithm. For the listed algorithms, we will provide a reference training curve. If you are interested in other algorithms, you can also choose them in Part II, but we cannot offer much help in implementing those algorithms.
This project work is supposed to be done in groups of 2 students. If you need to find a partner for the project, please join the project channel on Slack and advertise yourself.
The deadline for both the course project and the alternative project is 05.12.2021 at 23:55.
Alternative Project
Alternatively, students can propose their own project topic. This option is mainly aimed at PhD students who want to apply reinforcement learning to their own field, but Master's students are also encouraged to use it.
Alternative project topics are individual. The project proposal needs to be submitted and will be evaluated by the course staff; work on the project can start once the proposal is approved.
The deadline for the alternative course project proposal is 29.10.2021 at 23:55.
Grading
The course project grade accounts for 30% of the final grade of the course.
This assignment contains the instructions for students interested in the alternative course project. Please read the instructions carefully and submit your project proposals here.