Please note! The course description is confirmed for two academic years, which means that, in general, the learning outcomes, assessment methods and key content stay unchanged. However, the course syllabus may specify or change how each realization of the course is carried out, such as how the contact sessions are organized, how the assessment methods are weighted or which materials are used.

LEARNING OUTCOMES

After the course, students have an overview of the main principles and methods of data mining and know how to apply them to real-world problems. They know the most fundamental pattern types and their search methods, including associative and graph mining, the main approaches to clustering high-dimensional and heterogeneous data, basic concepts and techniques in social network analysis, web and text mining, and how to validate data mining results.

Credits: 5

Schedule: 04.09.2023 - 13.12.2023

Teacher in charge (valid for whole curriculum period):

Teacher in charge (applies in this implementation): Wilhelmiina Hämäläinen

Contact information for the course (applies in this implementation): The responsible teacher of the course is senior university lecturer Wilhelmiina Hämäläinen (wilhelmiina.hamalainen@aalto.fi). The course assistants are Egor Eremin, Georgy Ananov, Hieu Nguen Khac, Lai Khoa, Paavo Reinikka, Vinh Nguyen Mai, and Yinjia Zhang.


CEFR level (valid for whole curriculum period):

Language of instruction and studies (applies in this implementation):

Teaching language: English. Languages of study attainment: English

CONTENT, ASSESSMENT AND WORKLOAD

Content
  • valid for whole curriculum period:

    The course covers fundamental data mining problems, such as pattern discovery, graph mining, and clustering different types of data. The main emphasis is on learning the basic principles of data mining and their application in practice, including method selection, validation, and scalability issues.

Assessment Methods and Criteria
  • valid for whole curriculum period:

    Home assignments, project work, examination.

  • applies in this implementation

    Course performance consists of four elements:

    1. active participation in exercise groups (5 sessions, group work, max 5p)
    2. homework submitted in groups of 2–3 students (5 tasks, max 10p)
    3. final exam (Wed 13.12. 13:00–16:00, max 24p)
    4. prerequisite test (max 1p)
    The course grade is based on the sum of the points in all four categories above (max 40p). To pass the course, a student needs at least 50% of the total points and at least 50% of the exam points.
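
    As an illustration only, the pass rule above can be sketched in Python. The function name and category keys are hypothetical, and the sketch assumes that both 50% thresholds are inclusive and that no rounding is applied; the syllabus does not state grade boundaries, so only pass/fail is checked.

```python
# Maximum points per category, taken from the four elements listed above.
MAX_POINTS = {
    "exercises": 5,
    "homework": 10,
    "exam": 24,
    "prerequisite_test": 1,
}


def passes_course(exercises, homework, exam, prerequisite_test):
    """Return True if the points satisfy both pass conditions.

    Assumption (not stated in the syllabus): thresholds are inclusive,
    i.e. exactly 50% is enough.
    """
    earned = {
        "exercises": exercises,
        "homework": homework,
        "exam": exam,
        "prerequisite_test": prerequisite_test,
    }
    # Reject points outside the allowed range for any category.
    for name, points in earned.items():
        if not 0 <= points <= MAX_POINTS[name]:
            raise ValueError(f"{name}: {points} is outside 0..{MAX_POINTS[name]}")

    total_max = sum(MAX_POINTS.values())  # 5 + 10 + 24 + 1 = 40
    total = sum(earned.values())

    enough_total = total >= 0.5 * total_max          # at least 20 of 40
    enough_exam = exam >= 0.5 * MAX_POINTS["exam"]   # at least 12 of 24
    return enough_total and enough_exam
```

    Under these assumptions, full exercise and homework points do not compensate for an exam result below 12p, and a strong exam alone passes only if the total reaches 20p.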

Workload
  • valid for whole curriculum period:

    Study methods consist of lectures (24h), exercise sessions (8h), home assignments and project work (about 76h), self-study (24h), and an exam. Lectures and exercise sessions are voluntary and can be replaced by self-study.

  • applies in this implementation

    The expected average workload (about 135h) consists of 34-36h of contact sessions (lectures and exercises), 20h of preparation for exercises, 20h of homework, 40h of self-study and 20h of preparation for the exam. Everybody is advised to self-study for about 3h after each lecture; the exercise sessions are then more rewarding, the assignments go more easily and little work is left for exam preparation. If you skip lectures or exercises, you will need to self-study more to compensate.

DETAILS

Study Material
  • applies in this implementation

    The course is based on the textbook Charu C. Aggarwal: Data Mining: The Textbook. Springer 2015. The e-book is available in the Aalto library (log in to aalto-primo). In addition, there will be some external material (linked to the course page). The learning material on each topic will be listed in section Lectures, under each lecture.

    Lecture slides, exercise tasks and other material will be added here in MyCourses.


Substitutes for Courses
Prerequisites

FURTHER INFORMATION

Further Information
  • valid for whole curriculum period:

    Teaching Language: English

    Teaching Period:

    2022-2023 Autumn I - II
    2023-2024 Autumn I - II

    Maximum number of students on the course: 150

    Students are given priority as follows:

    1) Students studying in Computer, Communication and Information Sciences and majoring in Machine Learning, Data Science and Artificial Intelligence; or majoring in Computer Science, track Big Data and Large Scale Computing; or students studying in ICT Innovation programme and majoring in Data Science.

    2) Students studying in Computer, Communication and Information Sciences and majoring in Computer Science or Security and Cloud Computing; or students studying a minor in Machine Learning, Data Science and Artificial Intelligence.

    3) Students studying in Aalto Bachelor's programme in Science and Technology majoring in Data Science.

    4) Students studying in the Bachelor's Programme in Science and Technology and majoring in Computer Science.

    5) Other students studying in Computer, Communication and Information Sciences or ICT Innovation programme.

    6) Other students.

Details on the schedule
  • applies in this implementation

    Note that registration ends 31st August 2023.