Credits: 5

Schedule: 30.10.2019 - 18.12.2019

Teaching Period (valid 01.08.2018-31.07.2020): 

II 2018 – 2019, 2019 – 2020 (autumn)

Learning Outcomes (valid 01.08.2018-31.07.2020): 

To become familiar with speech recognition methods and applications, to understand the structure of a typical speech recognition system, and to know how to construct one in practice.

Content (valid 01.08.2018-31.07.2020): 

Preprocessing and feature extraction for speech, phoneme models,
decoding, lexicon and language models, recognition and retrieval of
continuous speech.

Details on the course content (applies in this implementation): 

The focus is on both understanding the basic principles of speech recognition and practical implementation, so that you understand how a speech recognition system works, what its limitations are, and how to construct one in practice.

The following courses, or corresponding ones, provide a broader understanding of some parts of this course. The speech recognition course can be taken without them, but they are useful prerequisite knowledge. They can also be taken after the course, if one becomes more interested in the topics:

  • Data Science, 5 cr. This course gives an introduction to machine learning. https://mycourses.aalto.fi/course/view.php?id=24330
  • Deep Learning, 5 cr. This course goes deeper into deep learning.
  • Digital Signal Processing, 5 cr. This course gives an introduction to signal processing.
  • Speech Processing, 5 cr. This course gives a wider and deeper view of speech modeling.
  • Statistical Natural Language Processing, 5 cr. This course gives a wider and deeper view of language modeling.
  • Speech and Language Processing Methods, 5 cr. This course includes more detailed practical work on speech and language modeling for those who want to go deeper than just applying the standard toolkits.

Assessment Methods and Criteria (valid 01.08.2018-31.07.2020): 

Exercises and project work.

Elaboration of the evaluation criteria and methods, and acquainting students with the evaluation (applies in this implementation): 

There is no exam. The grade is a combination of three sources:

20% from the lecture activity. Missed lectures can be compensated by returning the extra exercises, one exercise per lecture. Participation in each lecture and exercise session gives 1 activity point. New in 2019: Additional points can be obtained by participating in the lecture quizzes, 0.1 points for every correct answer. There will be a quiz in lectures 2, 3, 4 and 5, with 5 questions in each. The maximum number of activity points is 15. The sum is converted to the "activity grade (AG)" by the formula: AG = (activity points - 6) / 9 * 5. The course cannot be passed if AG <= 0. In that case, contact the lecturer for compensation options.

40% from home exercises. The four home exercises give a total of 64 points. The sum is converted to the "home exercise grade (HG)" by the formula: HG = (points from exercises - 28) / 36 * 5. The course cannot be passed if HG <= 0. In that case, contact the lecturer for compensation options.

40% from the project work. The project grade depends on the literature review, experiments, talk, final report and self-grading. Groups with fewer participants are compensated in the grade.

The final grade will be the weighted average of the activity (20%), home exercise (40%) and project grades (40%).
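
As an illustration, the grade arithmetic described above can be sketched in Python. The function names are hypothetical, the project grade is assumed to be given on the 0...5 scale, and no rounding to the final integer scale is applied because the rounding rule is not stated here. This is a sketch under those assumptions, not the official grading script.

```python
from typing import Optional


def activity_grade(activity_points: float) -> float:
    """AG = (activity points - 6) / 9 * 5, as given in the text."""
    return (activity_points - 6) / 9 * 5


def home_exercise_grade(exercise_points: float) -> float:
    """HG = (points from exercises - 28) / 36 * 5, as given in the text."""
    return (exercise_points - 28) / 36 * 5


def final_grade(activity_points: float,
                exercise_points: float,
                project_grade: float) -> Optional[float]:
    """Weighted average: 20% activity, 40% home exercises, 40% project.

    Returns None when AG <= 0 or HG <= 0, because the course cannot be
    passed in that case (the student should contact the lecturer).
    Assumption: project_grade is already on the 0...5 scale.
    """
    ag = activity_grade(activity_points)
    hg = home_exercise_grade(exercise_points)
    if ag <= 0 or hg <= 0:
        return None
    return 0.2 * ag + 0.4 * hg + 0.4 * project_grade


if __name__ == "__main__":
    # Example: 12 activity points, 46 exercise points, project grade 4
    print(final_grade(12, 46, 4.0))
```

With the maximum 15 activity points and 64 exercise points, both AG and HG evaluate to 5, so the formulas map the full point ranges onto the top of the 0...5 scale.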

Workload (valid 01.08.2018-31.07.2020): 

Lectures: 24 h
Home exercises, group projects, and other individual work: 109 h

Attendance in some contact teaching may be compulsory.

Details on calculating the workload (applies in this implementation): Active attendance at the Wednesday lectures and reading the material corresponds to 1 cr. Missed lectures can be compensated by the given extra exercises related to each lecture topic.

Participation in the computer exercise sessions and submitting the home exercises corresponds to 2 cr. Note that the content of the Thursday and Friday exercise sessions is identical, so you can choose the one that fits your schedule better. Participation in the sessions is not mandatory, but highly recommended. Assistance with the home exercises, e.g. how to use the toolkits, is only available during these sessions.

Participation in the project work is worth 2 cr. The project work is performed in groups of three students, and one researcher will tutor each group. Finishing the group work during the teaching period requires active participation every week. The time reserved for the group meetings is Wednesdays at 11 - 12 (right after each lecture), when the tutors will also be available, but a group may decide on other meeting times. Note that because the groups will start working on Oct 30, the course assistants will form them on Oct 29. The participants must therefore register separately for the groups via MyCourses by Oct 28. Information on how this is done will be sent to the participants via MyCourses during the previous week. Your own topic and team suggestions can be attached to the group registration.

Study Material (valid 01.08.2018-31.07.2020): 

To be specified in the beginning of the course.

Details on the course materials (applies in this implementation): The recommended basic textbook is:

  • Huang, Acero: Spoken Language Processing. Prentice Hall, 2001. ISBN: 0-13-022616-5

The following book is at a very advanced level, but worth studying to understand the latest trends:

  • Yu, Deng: Automatic Speech Recognition: A Deep Learning Approach. Springer, 2015. ISBN: 978-1-4471-5779-3

Other useful textbooks:

  • Rabiner, Juang: Fundamentals of Speech Recognition. Prentice Hall, 1993.
  • Jurafsky, Martin: Speech and Language Processing. A draft of the 3rd edition online, 2018.

Substitutes for Courses (valid 01.08.2018-31.07.2020): 

T-61.5150, S-89.5150

Course Homepage (valid 01.08.2018-31.07.2020): 

https://mycourses.aalto.fi/course/search.php?search=ELEC-E5510

Prerequisites (valid 01.08.2018-31.07.2020): 

Basic mathematics and probability courses.

Grading Scale (valid 01.08.2018-31.07.2020): 

0...5

Registration for Courses (valid 01.08.2018-31.07.2020): 

In WebOodi

Further Information (valid 01.08.2018-31.07.2020): 

Language class 3: English

Details on the schedule (applies in this implementation): First 4 weeks: lectures on Wednesdays, computer exercises on Thursdays/Fridays, homework every week. The topics cover the basics of automatic speech recognition.

Next 3 weeks: Lectures on Wednesdays and Fridays. No exercises. Topics will be more advanced and the groups will present their findings.
