Please note! The course description is confirmed for two academic years (1.8.2018-31.7.2020), which means that, in general, the learning outcomes, assessment methods and key content stay unchanged. However, the course syllabus may specify or change how each realization of the course is executed, such as how the contact sessions are organized, how the assessment methods are weighted, or which materials are used.
LEARNING OUTCOMES
After completing the course, a student can: (i) explain the main concepts and approaches related to decision making and learning in stochastic time series systems; (ii) read scientific literature to follow the developing field; (iii) implement algorithms such as value iteration and policy gradient.
Credits: 5
Schedule: 07.09.2020 - 02.12.2020
Teacher in charge (valid 01.08.2020-31.07.2022): Ville Kyrki
Teacher in charge (applies in this implementation): Ville Kyrki
Contact information for the course (valid 24.08.2020-21.12.2112):
Lecturer (for course registration, lectures, etc.): Ville Kyrki (ville.kyrki@aalto.fi), by email or in person after lectures.
TAs (for assignments, project): Karol Arndt, David Blanco Mulero, Oliver Struckmeier. Contact preferably via Slack.
Language of instruction and studies (valid 01.08.2020-31.07.2022):
Teaching language: English
Languages of study attainment: English
CONTENT, ASSESSMENT AND WORKLOAD
Content
Valid 01.08.2020-31.07.2022:
Modeling uncertainty. Markov decision processes. Model-based reinforcement learning. Model-free reinforcement learning. Function approximation. Policy gradient. Partially observable Markov decision processes.
Assessment Methods and Criteria
Valid 01.08.2020-31.07.2022:
Assignments and project work.
Applies in this implementation:
Grading 0-5. Quizzes 20 %, Assignments 50 %, Project 30 %. No exam.
To pass: Completed assignments. Completed project.
Workload
Valid 01.08.2020-31.07.2022:
Contact teaching, independent study, assignments, project
Contact teaching 56 h
Independent study 74 h
DETAILS
Study Material
Valid 01.08.2020-31.07.2022:
Lecture notes. On-line material.
Applies in this implementation:
Lecture slides.
Sutton & Barto, "Reinforcement Learning: An Introduction" (parts).
LaValle, "Planning Algorithms" (parts).
All available on-line.
Prerequisites
Valid 01.08.2020-31.07.2022:
Required: Basic programming skills, basic calculus (gradient), basic vector and matrix algebra, basic probability (random variables, expectation)
Recommended: Artificial Intelligence
Useful: Machine learning - basic principles, Digital and optimal control, Stochastics and estimation