General
Content
The course introduces data science and related topics such as machine learning and data mining. You will be introduced to data science methods and tools for extracting interesting information from data. Specific topics include processing of digital signals such as speech and images, statistical estimation of parametric distributions, classification, regression prediction, clustering, pattern mining, and network analysis for developing search engines for hypertext collections such as the Web. The lecture sessions also include guest lectures that introduce different aspects of data science in today's society and careers.
Learning outcomes:
After the course, you can describe how natural data such as images, natural language, speech, and time-series measurements can be represented as data, and how to explore and preprocess them. You can apply elementary statistical and algorithmic methods to process the data, gain insights from it, and make predictions based on it. You will understand which processes make up the data science pipeline, starting from natural data and ending with actionable results.
Assessment Methods and Criteria:
The overall grade is determined by the exam grade (40%), the final project and peer review (40%), and the weekly assignments (20%). Attendance in lectures, demonstration exercises, and computer exercises is voluntary. We will provide some help during the computer exercises (the ones marked with A in the calendar). This is a great way to get support for your weekly assignment or final project. The demonstration sessions (the ones marked with H in the calendar) cover the course material in more detail in a pen-and-paper manner. The exam and final project are mandatory and determine the final grade.
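As a rough illustration of the weighting above, the following Python sketch combines the three components into one weighted score. Only the weights (40%, 40%, 20%) come from the course description; the function name `overall_score`, the assumption that all three components are scored on the same scale, and the absence of any rounding or pass threshold are assumptions made for this example, not the official grading procedure.

```python
# Weights as stated in the assessment description.
WEIGHTS = {"exam": 0.40, "project": 0.40, "assignments": 0.20}


def overall_score(exam, project, assignments):
    """Return the weighted average of the three graded components.

    Assumption: all three scores use the same scale (for example 0-5).
    Rounding and pass thresholds are not specified in the course
    description, so none are applied here.
    """
    scores = {"exam": exam, "project": project, "assignments": assignments}
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical component scores on a 0-5 scale.
    print(overall_score(exam=4, project=3, assignments=5))
```

Because the exam and the final project each carry twice the weight of the weekly assignments, a one-point change in either of them moves the combined score twice as much as a one-point change in the assignment score.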
Teaching Period:
II (Autumn)
The first lecture of the course will be on the 31st of October at 14:00. The computer exercises, however, start earlier, on Monday the 28th of October. The first week's computer exercise covers introductory material on working with Python and Jupyter. The demonstration sessions start on the 4th of November.
Workload:
- Lectures 16h
- Weekly assignments 20h
- Final project + final reports 40h
- Exercise sessions (demo + computer) 20h
- Independent work 35h
- Examination 3h
Study Material:
The lecture materials are distributed as slide decks.
Demonstration exercises and computer exercises are Jupyter notebooks that include descriptions and Python code for solving a data science-related problem. We advise you to attend at least one computer exercise and one demonstration session per week (the different sessions within a week cover the same material).
Weekly exercises are a combination of Jupyter notebooks and MyCourses quizzes. They include Python code and datasets, and require you to solve a data science problem. Select the correct answer on the MyCourses assignment page.
The basics of using Python and Jupyter notebooks are reviewed during the first week in the computer exercise sessions. Please try to attend at least one of the computer sessions (the ones marked with A) if you are not familiar or comfortable with Python or the Jupyter environment.
Please attend the first lecture on the 31st of October for more information.