Please note! The course description is confirmed for two academic years (1.8.2018-31.7.2020), which means that, in general, the learning outcomes, assessment methods and key content stay unchanged. However, the course syllabus may specify or change how each realization of the course is run, such as how the contact sessions are organized, how the assessment methods are weighted, or which materials are used.


After attending the course, the student knows how statistical and adaptive methods are used in information retrieval, machine translation, text mining, speech processing and related areas to process natural language content. Furthermore, the student can apply the basic methods and techniques used for statistical natural language modeling, including, for instance, clustering, classification, hidden Markov models and Bayesian models.

Credits: 5

Schedule: 12.01.2021 - 14.04.2021

Teacher in charge (valid 01.08.2020-31.07.2022): Paavo Alku, Mikko Kurimo

Teacher in charge (applies in this implementation): Mikko Kurimo

Language of instruction and studies (valid 01.08.2020-31.07.2022):

Teaching language: English

Languages of study attainment: English


Content
  • Valid 01.08.2020-31.07.2022:

    Many core applications in the modern information society, such as search engines, social media, machine translation, speech processing and text mining for business intelligence, apply statistical and adaptive methods. This course introduces these methods and teaches basic skills in applying them to natural language data. Each topic is taught by a high-level expert in the area.

Assessment Methods and Criteria
  • Valid 01.08.2020-31.07.2022:

    Examination and exercise work.

Workload
  • Valid 01.08.2020-31.07.2022:

    Lectures and exercise sessions approximately 30 h

    Independent work approximately 103 h

    Total 133 h

    Attendance at some contact teaching sessions may be compulsory


Study Material
  • Valid 01.08.2020-31.07.2022:

    C. Manning, H. Schütze, 1999. Foundations of Statistical Natural Language Processing. The MIT Press; Lecture notes.

Substitutes for Courses
  • Valid 01.08.2020-31.07.2022:

    T-61.5020 Statistical Natural Language Processing P

Prerequisites
  • Valid 01.08.2020-31.07.2022:

    Basic mathematics and probability courses.

SDG: Sustainable Development Goals

    9 Industry, Innovation and Infrastructure


