Credits: 5

Schedule: 28.10.2019 - 16.12.2019

Teaching Period (valid 01.08.2018-31.07.2020): 

II (Autumn)

Learning Outcomes (valid 01.08.2018-31.07.2020): 

After the course, you can describe how natural data such as images, natural language, speech and time series measurements can be represented in digital form. You can apply elementary statistical and algorithmic methods to process the digital data and gain insights into the data-generating phenomenon. You will understand which processes make up the data science pipeline, starting from natural data and ending with actionable results.

Content (valid 01.08.2018-31.07.2020): 

The course serves as an introduction to data science and related topics such as machine learning. You will be introduced to data science methods and tools for finding interesting information in data. Specific topics in the course include processing of digital signals such as speech and images, statistical estimation of parametric distributions, classification, prediction, clustering, pattern mining, and network analysis for developing search engines for hypertext collections such as the Web.

Assessment Methods and Criteria (valid 01.08.2018-31.07.2020): 

The overall grade is determined by the exam grade. Attendance at the exercise sessions earns the student extra exam points.

Elaboration of the evaluation criteria and methods, and acquainting students with the evaluation (applies in this implementation): 

Update: Following many suggestions from last year's students, the grading criteria of the course have been changed this year. This year, grading is as follows:

- Weekly assignments (20%)

- Final project (40%)

- Final exam (40%)

Attendance at the computer sessions and demo sessions WILL NOT earn students any bonus points.

Workload (valid 01.08.2018-31.07.2020): 

Lectures 20h, exercise sessions 20h, independent work 90h, examination 3h.

Details on calculating the workload (applies in this implementation): 

UPDATE: Lectures 16h, weekly assignments 20h, final project and final reports 40h, exercise sessions (demo and computer) 20h, independent work 35h, examination 3h.

Study Material (valid 01.08.2018-31.07.2020): 

Material will be announced on the course pages.

Substitutes for Courses (valid 01.08.2018-31.07.2020): 

CS-C3110 Datasta tietoon (From Data to Knowledge).

Prerequisites (valid 01.08.2018-31.07.2020): 

The skills needed in the course are taught in introductory courses in mathematics, statistics and programming. Specifically, matrix algebra, derivatives of functions, and statistical distributions will be needed.

Grading Scale (valid 01.08.2018-31.07.2020): 


Details on the schedule (applies in this implementation): 

The first lecture of the course will be on the 31st of October at 14:00. The computer exercises, however, start earlier, on Monday the 28th of October. The first week's computer exercise will cover introductory material on working with Python and Jupyter.


Registration and further information