Please note! The course description is confirmed for two academic years, which means that, in general, the learning outcomes, assessment methods and key content stay unchanged. However, the course syllabus may specify or change how the course is executed in each realization, such as how the contact sessions are organized, how the assessment methods are weighted, or which materials are used.

LEARNING OUTCOMES

Technical content

  • Understanding of and experience in the basics of speech modeling
  • Knowledge of the main areas of speech processing
  • Ability to find and use pre-trained models and datasets for speech processing
  • Ability to modify and train neural models for speech processing
  • Experience with objective evaluation of speech processing, including related statistical analysis

Human and societal aspects

  • Understanding of the basics of speech production, linguistics, and speech expression
  • Awareness of issues of diversity, discrimination, privacy, and sustainability in speech processing
  • Experience with subjective evaluation and knowledge of its design constraints

Credits: 5

Schedule: 02.09.2024 - 14.10.2024

Teacher in charge (applies in this implementation): Tom Bäckström

Language of instruction and studies (applies in this implementation):

Teaching language: English. Languages of study attainment: Finnish, Swedish, English

CONTENT, ASSESSMENT AND WORKLOAD

Content
  • valid for whole curriculum period:

    • System structures and application areas
    • Basic digital speech processing, including short-time spectral analysis and processing, envelope models, fundamental frequency, linear predictive coding, mel-frequency cepstral coefficients
    • Fundamental pre-processing methods, including voice activity detection and speech enhancement
    • Speech processing applications, including speech coding, speech enhancement, voice conversion, and speaker identification
    • Quality evaluation using objective and subjective methods, and basic statistical analysis of results
    • Machine learning for speech, including working with audio and datasets, finding and using pre-trained models, building your own models, and the overall workflow

Assessment Methods and Criteria
  • valid for whole curriculum period:

    Examination and assignments.

Workload
  • valid for whole curriculum period:

    • Lecture sessions
    • Exercise sessions
    • Independent group work
    • Own work
    • Exam

    Attendance at the exam is compulsory.

DETAILS

Study Material
  • valid for whole curriculum period:

    Introduction to speech processing, https://speechprocessingbook.aalto.fi

SDG: Sustainable Development Goals

    5 Gender Equality

    9 Industry, Innovation and Infrastructure

    10 Reduced Inequality

    12 Responsible Production and Consumption

FURTHER INFORMATION

Further Information
  • valid for whole curriculum period:

    Teaching Language: English

    Teaching Period: 2024-2025 Autumn I, 2025-2026 Autumn I