Please note! The course description is confirmed for two academic years, which means that, in general, the learning outcomes, assessment methods, and key content stay unchanged. However, the course syllabus may specify or change how each implementation of the course is carried out, such as how the contact sessions are organized, how the assessment methods are weighted, or which materials are used.

LEARNING OUTCOMES

After the course, the students have an overview of the main principles and methods of data mining and know how to apply them to real-world problems. They know the most fundamental pattern types and the methods for finding them, including association, graph, and sequence mining; the main approaches to clustering high-dimensional and heterogeneous data; and how to validate data mining results.

Credits: 5

Schedule: 14.09.2021 - 15.12.2021

Teacher in charge (valid for whole curriculum period):

Teacher in charge (applies in this implementation): Wilhelmiina Hämäläinen

Contact information for the course (applies in this implementation):

CEFR level (valid for whole curriculum period):

Language of instruction and studies (applies in this implementation):

Teaching language: English. Languages of study attainment: English

CONTENT, ASSESSMENT AND WORKLOAD

Content
  • valid for whole curriculum period:

    The course covers fundamental data mining problems, such as pattern discovery, graph mining, and clustering different types of data. The main emphasis is on learning the basic principles of data mining and their application in practice, including method selection, validation, and scalability issues.

  • applies in this implementation

    Syllabus
            Introduction to Data mining
            Data preprocessing
            Distance and similarity
            Clustering
            Association mining
            Graph mining
            Web mining and recommendation systems
            Social network analysis
            Text mining
            Optional topics (such as outlier detection, sequential patterns, applications)

Assessment Methods and Criteria
  • valid for whole curriculum period:

    Home assignments, project work, examination.

Workload
  • valid for whole curriculum period:

    Contact teaching: 24 h lectures + 12 h exercises; independent study: 90-100 h (home assignments, project work, exam preparation).

DETAILS

Study Material
  • valid for whole curriculum period:

    Lecture slides and external material. The course book will be announced later.

  • applies in this implementation

    • Textbook: Charu C. Aggarwal: Data Mining: The Textbook, Springer 2015.
      E-book available in Aalto Library.
    • Lecture slides and possible external material.

Substitutes for Courses
Prerequisites

FURTHER INFORMATION

Further Information
  • valid for whole curriculum period:

    Teaching Period:

    2020-2021 Autumn I-II

    2021-2022 Autumn I-II

    Course Homepage: https://mycourses.aalto.fi/course/search.php?search=CS-E4650

    Registration for Courses: In the academic year 2021-2022, registration for courses will take place on Sisu (sisu.aalto.fi) instead of WebOodi.

  • applies in this implementation

    Obligatory prerequisites: Programming skills (CS-A1110 or equivalent), data structures and algorithms (CS-A1140 or equivalent), basic concepts of probability and statistics (MS-A050* or equivalent). In addition, some knowledge of linear algebra is highly recommended.

Details on the schedule
  • applies in this implementation

    Lectures on Tuesdays at 16:15-18:00, 14.9.-19.10. and 2.11.-7.12.2021.

    Course exam on Wed 15.12.2021 (re-exam on 23.2.2022).

    Exercise sessions (not every week) begin in week 38. Participation is optional but highly recommended.

    No teaching during the exam week, 25.10.-31.10.2021.