Credits: 5

Schedule: 11.09.2019 - 19.12.2019

Teaching Period (valid 01.08.2018-31.07.2020): 

I-II (Autumn)

Learning Outcomes (valid 01.08.2018-31.07.2020): 

This course focuses on advanced scalable cloud computing technologies and on the key algorithmic ideas and methods used to implement them. After completing this course, you will be able to list many of the key technologies used in big data processing and to select suitable methods for solving challenging big data processing tasks using cloud computing technologies. You will also be able to compare the scalability and fault tolerance implications of the selected methods.

Content (valid 01.08.2018-31.07.2020): 

Advanced topics in cloud computing, with emphasis on the scalable distributed computing technologies it employs. Key cloud technologies and their algorithmic background. The main topics are distributed file systems, distributed batch processing with the MapReduce and Apache Spark computing frameworks, and distributed cloud-based databases.

Details on the course content (applies in this implementation): 

The course will include lectures and tutorials covering various aspects of big data platforms.

  • Big data platforms from a design viewpoint
  • Big data platforms and cloud computing
  • Service and integration models in big data platforms
  • Core services in big data platforms
  • Data ingestion in big data platforms
  • Big data processing with Apache Spark, stream processing and workflows of data pipelines
  • Governance issues for big data platforms
  • Quality of analytics

See the lecture and tutorial sessions for further information.

Assessment Methods and Criteria (valid 01.08.2018-31.07.2020): 

Exam and home assignments. Course feedback.

Elaboration of the evaluation criteria and methods, and acquainting students with the evaluation (applies in this implementation): 

  • Assessment will be based on individual home assignments covering design, programming and analysis aspects of big data platforms.
  • Assignments will be evaluated on the quality of the submissions. The teacher may require students to explain their submissions in person when needed.
  • There will be no final exam.

See the assignment descriptions for further details.

Workload (valid 01.08.2018-31.07.2020): 

Lectures: 24 (2), Teaching in small groups: 12 (1), Independent work: 96

Study Material (valid 01.08.2018-31.07.2020): 

Lecture slides, tutorial assignments and their answers.

Details on the course materials (applies in this implementation): 

  • Lecture slides will be available
  • Tutorial information
  • External materials

Substitutes for Courses (valid 01.08.2018-31.07.2020): 

Replaces former courses CSE-E5430 / T-79.5308 Scalable Cloud Computing and T-79.5307 Distributed Computing.

Prerequisites (valid 01.08.2018-31.07.2020): 

Basic programming skills (CS(E)-A1110 Programming 1). Familiarity with basic data structures (CS(E)-A1140 Data Structures and Algorithms or CS-E3190 / T-79.4202 Principles of Algorithmic Techniques) is an asset.

Grading Scale (valid 01.08.2018-31.07.2020): 


Additional information for the course (applies in this implementation): 

  • Every student will receive a 50 USD credit for Google Cloud Platform, which can be used to acquire resources and services for the exercises.


Registration and further information