
  • Databases for Data Science

    Course Description: This is an introductory course on relational databases designed for Data Science BSc students, so no prior knowledge of databases is assumed. The course covers the fundamentals of relational algebra, relational schema design including the Unified Modeling Language (UML), functional dependencies and normal forms, the concept of transactions, creating SQL tables (including indexes), and using SQL to query the database.

    After the course, students will know how to design and implement relational databases that satisfy the normalization rules. They should also be able to write and run various types of SQL queries to extract the desired data from a database, an essential step in data analysis. In particular, the course will draw on relevant examples to prepare students to apply the principles of relational databases to projects in data science.

    Note: This course is designed for Data Science BSc students; for other students we recommend taking the course CS-A1150 Tietokannat / Databases offered by Kerttu Pollari-Malmi.

    If you are not a Data Science major and have registered for the course, kindly send the instructors a brief email with your reasons (e.g. you are following an English-language major, are graduating soon, or are an exchange student) for taking this Databases course in particular (and not CS-A1150). We will take all reasonable requests into consideration.

    For all the email addresses below, the domain is


    • Prof. Nitin Sawhney (email: nitin.sawhney@domain)
    • Dr. Barbara Keller (email: barbara.keller@domain)
    • Sami El-Mahgary (email: sami.mahgary@domain)

    Teaching Assistants:

    • Etna Lindy (email: etna.lindy@domain)
    • Long Nguyen (email: long.l.nguyen@domain)
    • Trang Nguyen (email: trang.m.nguyen@domain)
    • Pham Binh (email: binh.pham@domain)
    • Sophie Truong (email: lac.truong@domain)
    • Ville Vuorenmaa (email: ville.vuorenmaa@domain)

    Online Learning Sessions: Tuesdays 16:15 - 18:00 via Zoom and Slack 

    Exercise Sessions: Wed / Thu / Fri 10:15 - 12:00. Exception: the session on Thursday, 13 May is moved to Friday, 14 May, 14:15 - 16:00.

    Tentative Weekly Course Schedule:

    • Lectures: 2-part sessions (30-40 mins + Q&A) with a short break
    • Exercise Sessions: Hands-on sessions with applied examples and group projects

    Exam: There is no exam planned for this course.

    Grading: The course grade is based on the homework exercises (50%, up to 150 points) and the group project (50%, up to 150 points). To pass, you need at least 75/150 points from the homework exercises and at least 75/150 points from the project. Active participation can earn up to 30 bonus points. Students are expected to attend at least 5 of the 6 exercise sessions; attending fewer costs 30 points. Answering the course feedback form at the end of the course earns 10 bonus points.

    • Project: 50% (+150 Points)
    • Exercises: 5 x 10% = 50% (+150 Points)
    • Bonus Participation (peer support and interactions): 10% (+30 Points)
    • Attending fewer than 5 out of 6 exercise sessions (-30 Points)
    • Course feedback bonus (+10 Points)

    Grading scale
    • <150 in total OR <75 for either the exercises or the project = fail
    • 150 - 179 = 1
    • 180 - 209 = 2
    • 210 - 239 = 3
    • 240 - 269 = 4
    • ≥270 = 5
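
    The point rules and the grading scale can be combined into a small calculation. The Python sketch below is only an illustration, not an official grade calculator: the function name `course_grade` and its parameters are hypothetical, a fail is represented as 0, points are assumed to be whole numbers, the 75-point minimums are assumed to be checked before any bonus or penalty is applied, and a total of exactly 270 is assumed to give grade 5.

```python
def course_grade(exercise_points, project_points, participation_bonus=0,
                 sessions_attended=6, gave_feedback=False):
    """Return the course grade (1-5), or 0 for a fail (0 is an assumption)."""
    # Both components need at least 75 of their 150 points.
    if exercise_points < 75 or project_points < 75:
        return 0

    # Participation bonus is capped at 30 points.
    total = exercise_points + project_points + min(participation_bonus, 30)

    # Attending fewer than 5 of the 6 exercise sessions costs 30 points.
    if sessions_attended < 5:
        total -= 30

    # Answering the course feedback form adds 10 points.
    if gave_feedback:
        total += 10

    if total < 150:
        return 0

    # Grade bands are 30 points wide starting at 150; grade 5 is the cap.
    return min(5, (total - 150) // 30 + 1)


if __name__ == "__main__":
    # 100 + 100 = 200 points, which falls in the 180 - 209 band.
    print(course_grade(100, 100))
```

    For example, 120 exercise points and 120 project points with only 4 sessions attended give 240 - 30 = 210 points, which falls in the band for grade 3.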