  • General

    Welcome to the course!


    Teachers

    The responsible teacher of the course is Senior University Lecturer Jorma Laaksonen <jorma.laaksonen@aalto.fi>, room B326 in the CS building. The course assistants are Yinjia Zhang (head assistant), Paavo Reinikka (head assistant), Hau Phan, Maximilian Krahn, Zixuan Liu and Martino Ciaperoni. Additional lectures will be given by Senior University Lecturers Jaakko Hollmén and Wilhelmiina Hämäläinen and Professors Petter Holme and Heikki Mannila.

    Overview

    The course gives an overview of the main principles and methods of data mining and how to apply them to real-world problems. It introduces the most fundamental pattern types and their search methods, including associative and graph patterns, as well as the main approaches to clustering high-dimensional and/or heterogeneous data, web and text mining, social community detection, and validation of data mining results.

    Prerequisites

    Good programming skills (CS-A1110 or equivalent), data structures and algorithms (CS-A1140 or equivalent), basic knowledge of probability theory and statistics (MS-A050* or equivalent). Linear algebra is not an official requirement, but some basic knowledge of matrices is needed.

    Material

    The course is based on the textbook Charu C. Aggarwal: Data Mining: The Textbook. Springer, 2015. The e-book is available through the Aalto library (log in to aalto-primo). In addition, there will be some external material (linked to the course page). The learning material on each topic will be listed in the Lectures section, under each lecture.

    Lecture slides, exercise tasks and other material will be added here in MyCourses.

    Workload

    The expected average workload (about 135h) consists of 36h of contact sessions (lectures and exercises), 20h of preparation for exercises, 20h of homework, 40h of self-study and 20h of preparation for the exam. Everybody is advised to self-study for about 3h after each lecture: the exercise sessions are then most rewarding, the assignments go more easily and little work is left for exam preparation. If you skip lectures or exercises, you will need to self-study more to compensate.

    Grading

    Course performance consists of three elements:
    • group presentations in five exercise sessions (5 points)
    • five graded homework assignments done in groups (10 points)
    • final exam (20 points)

    The course grade is based on the sum of the points in all three categories above. To pass the course, you need at least 50% of the total points.
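
    As a rough illustration of the pass rule, the arithmetic can be sketched as below. This is only a sketch under stated assumptions: it takes "total points" to mean the 35-point maximum (5 + 10 + 20) and treats exactly 50% as a pass. The function and variable names are hypothetical, and the mapping from points to the final grade is not covered here.

```python
# Maximum points per category, taken from the list above.
MAX_POINTS = {"presentations": 5, "homeworks": 10, "exam": 20}

# Assumption: "total points" is the sum of all category maxima (35).
TOTAL_MAX = sum(MAX_POINTS.values())

# Assumption: exactly 50% of the total is enough to pass.
PASS_THRESHOLD = 0.5 * TOTAL_MAX


def total_points(presentations: float, homeworks: float, exam: float) -> float:
    """Sum the points from the three categories, checking each against its maximum."""
    earned = {"presentations": presentations, "homeworks": homeworks, "exam": exam}
    for name, value in earned.items():
        if not 0 <= value <= MAX_POINTS[name]:
            raise ValueError(f"{name} must be between 0 and {MAX_POINTS[name]}")
    return sum(earned.values())


def passes(presentations: float, homeworks: float, exam: float) -> bool:
    """Return True if the summed points reach the assumed pass threshold."""
    return total_points(presentations, homeworks, exam) >= PASS_THRESHOLD
```

    Under these assumptions the threshold is 17.5 points, so for example 3 + 7 + 8 = 18 points would pass, while 2 + 5 + 10 = 17 points would not.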

    Communication

    All important course-related announcements are published in MyCourses announcements (visible on this page and by default also emailed to course participants). For wider discussion, questions and advising, we have a Zulip chat at https://mdm-fall-2022.zulip.aalto.fi/ . Join it through the link https://mdm-fall-2022.zulip.aalto.fi/join/2vycalqqcifhdtbiammb35ci/ .

    You are encouraged to ask questions during the lectures and exercise sessions. Please use email only for personal matters that you cannot ask about elsewhere.

    More information on practical arrangements will be given in the first lecture!