Topic outline

  • In this list, each set of 2021 slides will be replaced by its 2022 version at the latest after the corresponding lecture has been given. The titles may be identical, but the contents are improved each year based on feedback. The project work and its schedule change each year.

    For practicalities, e.g. regarding the Lecture Quizzes and Exercises, check MyCourses > Course Practicalities.

    • Lecture 1 - Feature extraction

      • course organization
      • what is ASR
      • features of speech
      • MFCC
      • GMM
      • DNN
    • Assignment

      Please upload your answer here, e.g. as a photo, a text file or a PDF file.

    • Lecture 2 - Phoneme modeling

      • Phonemes
      • HMMs
    • Assignment
    • Lecture 3 - Language Modeling

      • lexicon
      • language modeling
      • n-grams, smoothing

      • Intro to NNLM
      • Recurrent neural network language models
      • Long Short-Term Memory language models
      • Attention


    • Assignment
    • Lecture 4 - Continuous speech and decoding

      • recognition in continuous speech
      • token passing decoder
      • improving the recognition performance and speed
      • measuring the recognition performance
    • Lecture exercise 4: Token passing decoder (assignment)

      Fill in the last column with the final probabilities of the tokens, select the best token and output the corresponding state sequence.

      The goal is to verify that you have learned the idea of the token passing decoder. The extremely simplified HMM system is almost the same as in the 2B Viterbi algorithm exercise. The observed "sounds" are simply quantized to either "A" or "B", with given probabilities in states S0 and S1. The task is now to find the most likely state sequence that can produce the sequence of sounds A, A, B using a simple language model (LM). The toy LM used here is a look-up table that gives probabilities for different state sequences, such as (0,1) and (0,0,1), up to 3-grams.

      Hint: You can upload an edited source document, a PDF file, a photo of your notes or a text file with numbers, whichever is easiest for you. To get the activity point, the answer does not have to be correct.
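
      The idea of the exercise can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the exercise solution: all probabilities (`EMIT`, `TRANS`, `LM`) are hypothetical placeholders rather than the values on the exercise sheet, the decoder is assumed to start in S0, and combining the LM score with the HMM transition probability as a plain product is also an assumption. Tokens are merged only when they share the same last two states, because a 3-gram LM cannot distinguish them any further.

```python
import math

# All numbers are hypothetical; the exercise sheet defines its own values.
STATES = (0, 1)
EMIT = {0: {"A": 0.8, "B": 0.2}, 1: {"A": 0.3, "B": 0.7}}  # P(sound | state)
TRANS = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.0, 1: 1.0}}          # P(next | current)
LM = {  # toy look-up table: P(last state | preceding states), up to 3-grams
    (0, 0): 0.5, (0, 1): 0.5, (1, 1): 1.0,
    (0, 0, 0): 0.3, (0, 0, 1): 0.7, (0, 1, 1): 1.0, (1, 1, 1): 1.0,
}


def log(p):
    """Logarithm that maps probability 0 to minus infinity."""
    return math.log(p) if p > 0 else float("-inf")


def token_passing(sounds, start_state=0, lm_order=3):
    """Return the surviving tokens as (probability, state path), best first."""
    # tokens maps a recombination key (the last lm_order-1 states) to the
    # best token found so far for that key: (log probability, state path).
    start = (start_state,)
    tokens = {start: (log(EMIT[start_state][sounds[0]]), start)}
    for sound in sounds[1:]:
        new_tokens = {}
        for log_prob, path in tokens.values():
            for nxt in STATES:  # pass a copy of the token to every state
                new_path = path + (nxt,)
                ngram = new_path[-lm_order:]
                score = (log_prob
                         + log(TRANS[path[-1]][nxt])   # HMM transition
                         + log(LM.get(ngram, 0.0))     # LM look-up
                         + log(EMIT[nxt][sound]))      # emission
                if score == float("-inf"):
                    continue  # impossible path, drop the token
                key = new_path[-(lm_order - 1):]
                if key not in new_tokens or score > new_tokens[key][0]:
                    new_tokens[key] = (score, new_path)
        tokens = new_tokens
    ranked = sorted(tokens.values(), reverse=True)
    return [(math.exp(lp), path) for lp, path in ranked]


best_prob, best_path = token_passing(["A", "A", "B"])[0]
print(best_path, round(best_prob, 6))  # (0, 0, 1) 0.037632
```

      With these placeholder numbers, three tokens survive to the last column, and the best one corresponds to the state sequence S0, S0, S1. Working in log probabilities is a common choice to avoid numerical underflow on longer sequences; on paper, plain products work just as well.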


    • Lecture 5 - End-to-end ASR with deep neural networks


    • Assignment

      Preparation for the seminar sessions during the last two weeks

      Updated presentation schedule for the groups

    • Program for the last two weeks

    • Wed 30 November: Presentations 1

    • Fri 2 December: Presentations 2 - 5

    • Wed 7 December: Presentations 6 - 9

    • Fri 9 December: Presentations 10 - 13


    • Presenters
      • Two days before (or earlier if possible): Select one article for others to read and post the link for everybody in the MyCourses discussion forum
      • One day before: Upload your slides to MyCourses. The latest version of the slides will be published for others in MyCourses. You can also share a draft of the slides or a link in the discussion forum.
      • Practise so that you do not exceed the 20-minute limit
      • Remember your “audience” duties for the other talks of your day
    • Audience
      For each talk, do the following:

      • One day before: Read the provided articles and prepare one question to ask for each talk
      • Follow the talk and the slides, and ask your question in the chat
      • After the talk (within 1 day at most): Submit feedback (one for each talk) in MyCourses (all fields are required) to get activity points. The anonymous feedback (pros and cons) will later be shown to the presenters