LEARNING OUTCOMES
After completing the course, students will be able to
- Define the requirements, building blocks, and challenges of architecting, building, and managing large-scale data-center infrastructure for distributed systems and databases.
- Know the design principles for engineering scalable data-intensive systems and their applications, and implement code for them; analyze the full system stack for managing and scheduling data-center resources in relation to distributed storage, coordination, and computation.
- Critically assess the trade-offs between different requirements when designing scalable distributed systems, and understand the trade-offs in converting between data models and database tools.
- Understand new data models and new storage technologies, as well as their impact on query execution, database systems, cloud platforms, data processing pipelines, and modern machine learning systems.
- Discuss, compare, and critique the state-of-the-art approaches presented in research papers on distributed systems, databases, and machine learning systems.
Credits: 5
Schedule: 06.09.2024 - 29.11.2024
Teacher in charge (valid for whole curriculum period):
Teacher in charge (applies in this implementation): Zhao
Contact information for the course (applies in this implementation):
CEFR level (valid for whole curriculum period):
Language of instruction and studies (applies in this implementation):
Teaching language: English. Languages of study attainment: English
CONTENT, ASSESSMENT AND WORKLOAD
Content
valid for whole curriculum period:
This course covers the design and implementation of scalable data management systems. Topics include data-center technologies, data models (relational, document, key/value), storage models, query languages, storage architectures, indexing, query processing and optimization, in-memory databases, distributed storage, distributed coordination (consensus protocols and use cases), transaction processing and concurrency control, new storage media, and parallel architectures (multicore, multi-socket, chiplet). The course also includes case studies of open-source and commercial distributed database systems that illustrate these techniques and trade-offs.
Assessment Methods and Criteria
valid for whole curriculum period:
Exercises and assignments.
DETAILS
Study Material
valid for whole curriculum period:
Lecture slides, tutorials, open-source software, scientific papers, and assignments
Substitutes for Courses
valid for whole curriculum period:
Prerequisites
valid for whole curriculum period:
SDG: Sustainable Development Goals
4 Quality Education
5 Gender Equality
8 Decent Work and Economic Growth
9 Industry, Innovation and Infrastructure
11 Sustainable Cities and Communities
13 Climate Action
FURTHER INFORMATION
Further Information
valid for whole curriculum period:
Teaching Language: English
Teaching Period: 2024-2025 Autumn I - II
2025-2026 Autumn I - II
Registration:
Participation is subject to a maximum quota of 50. Enrollments will be prioritized according to the following criteria: master's students for whom the course is mandatory; students with strong systems programming skills (e.g., C/C++/Rust); and the remaining students who fulfill the prerequisites.