
  • Note: the registration has ended. The first lecture is Monday 4th September.

    Teachers

    The responsible teacher of the course is senior university lecturer Wilhelmiina Hämäläinen (wilhelmiina.hamalainen@aalto.fi). The teaching assistants are Egor Eremin, Georgy Ananov, Hieu Nguen Khac, Lai Khoa, Paavo Reinikka, Vinh Nguyen Mai, and Yinjia Zhang.


    Overview


    The course gives an overview of the main principles and methods of data mining and how to apply them to real-world problems. It introduces the most fundamental pattern types and their search methods, including associative and graph patterns, the main approaches to clustering high-dimensional and/or heterogeneous data, web and text mining, social community detection, and validation of data mining results.

    Prerequisites


    Good programming skills (CS-A1110 or equivalent), data structures and algorithms (CS-A1140 or equivalent), basic concepts and techniques of probability and statistics (MS-A050* or equivalent), and linear algebra (MS-A00* or equivalent). Statistical inference (MS-C1620 or equivalent) is recommended. A prerequisite test is coming soon. It will help you evaluate whether you need to recap something, and it is also worth one point.

    Material


    The course is based on the textbook Charu C. Aggarwal: Data Mining: The Textbook. Springer 2015. The e-book is available in the Aalto library (log in to aalto-primo). In addition, there will be some external material (linked to the course page). The learning material on each topic will be listed in the section Lectures, under each lecture.

    Lecture slides, exercise tasks and other material will be added here in MyCourses.

    Workload


    The expected average workload (about 135h) consists of 34-36h of contact sessions (lectures and exercises), 20h of preparation for exercises, 20h of homework, 40h of self-study and 20h of preparation for the exam. It is suggested that everybody self-studies for about 3h after each lecture. Then the exercise sessions are most rewarding, the assignments go more easily and there is little work left to prepare for the exam. If you skip lectures or exercises, you will need to self-study more to compensate.

    Grading

    Course performance consists of four elements:

    1.     active participation in exercise groups (5 sessions, group work, max 5p)
    2.     submitting homework in groups of 2–3 students (5 tasks, max 10p)
    3.     final exam (Wed 13.12. 13:00–16:00, max 24p)
    4.     prerequisite test (max 1p)

    The course grade is based on the sum of the points in all four categories above (max 40p). To pass the course, one must get at least 50% of the total points and at least 50% of the exam points.
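
    As an illustration only, the pass rule can be sketched in Python. The maximum points come from the list above; the function name passes_course, the dictionary keys, and the reading of "50%" as "at least 50%" are assumptions made for this sketch, not an official grading tool.

```python
# Maximum points per component, taken from the grading list on this page.
MAX_POINTS = {
    "exercises": 5,
    "homework": 10,
    "exam": 24,
    "prerequisite_test": 1,
}


def passes_course(points):
    """Return True if the points meet both pass thresholds.

    `points` maps a component name to the points earned; missing
    components count as 0. Assumption: "50%" means "at least 50%".
    """
    total_max = sum(MAX_POINTS.values())  # 5 + 10 + 24 + 1 = 40
    total = sum(points.get(name, 0) for name in MAX_POINTS)
    exam = points.get("exam", 0)
    # Both conditions must hold: half of all points and half of the exam points.
    return total >= 0.5 * total_max and exam >= 0.5 * MAX_POINTS["exam"]
```

    Under these assumptions, a student needs at least 20 of the 40 total points and at least 12 of the 24 exam points, so a high total with only 11 exam points would not be enough.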

    Communication


    All important course-related announcements are published in MyCourses announcements (visible on this page and by default also emailed to course participants). For wider discussion, questions and advice, we have a Zulip chat: https://mdm2023.zulip.aalto.fi/ Here is a tutorial on using Zulip.


    You are encouraged to ask questions in Zulip and during the lectures and exercise sessions. To avoid email chaos, please use email only for personal matters that you cannot ask about elsewhere.
    More information on practical arrangements will be given in the first lecture!