Credits: 5

Schedule: 30.10.2017 - 18.12.2017

Teaching Period (valid 01.08.2018-31.07.2020): 

II (Autumn)

Learning Outcomes (valid 01.08.2018-31.07.2020): 

After the course, you can describe how natural data such as images, natural language, speech, and time series measurements can be represented in digital form. You can apply elementary statistical and algorithmic methods to process digital data and gain insight into the data-generating phenomenon. You will understand which processes make up the data science pipeline, starting from natural data and ending with actionable results.

Content (valid 01.08.2018-31.07.2020): 

The course serves as an introduction to data science and related fields such as machine learning. You will be introduced to data science methods and tools for finding interesting information in data. Specific topics include the processing of digital signals such as speech and images, statistical estimation of parametric distributions, classification, prediction, clustering, pattern mining, and network analysis for developing search engines for hypertext collections such as the Web.

Assessment Methods and Criteria (valid 01.08.2018-31.07.2020): 

The overall grade is determined by the exam grade. Attendance at the exercise sessions earns the student extra exam points.

Workload (valid 01.08.2018-31.07.2020): 

Lectures 20h, exercise sessions 20h, independent work 90h, examination 3h.

Study Material (valid 01.08.2018-31.07.2020): 

Material will be announced on the course pages.

Substitutes for Courses (valid 01.08.2018-31.07.2020): 

CS-C3110 Datasta tietoon (From Data to Knowledge).

Prerequisites (valid 01.08.2018-31.07.2020): 

The skills needed for the course are taught in introductory courses in mathematics, statistics, and programming. Specifically, matrix algebra, derivatives of functions, and statistical distributions will be needed.

Grading Scale (valid 01.08.2018-31.07.2020):