
  • Advanced Topics in Software Systems

    Note: The course is taught on campus; there is no online version and no remote teaching.



    Why Should We Focus on Systems for Big Data and Machine Learning?

    The reliability and robustness of complex Big Data and Machine Learning (ML) applications and services depend strongly on the underlying systems that power them. On the one hand, techniques for supporting performance engineering, configuration management, testing, and debugging of Big Data and ML are extremely important. On the other hand, large-scale distributed systems and new computing models have evolved alongside new hardware and infrastructure architectures, such as edge systems, tensor processing units, and quantum computing systems. Such systems and computing models are being exploited for advanced Big Data and ML applications and services. Developing and optimizing Big Data and ML applications and services on such systems and models requires an in-depth understanding of the systems and of the roles they play for Big Data and ML. In the research community and in industry, some of the aspects mentioned above are also described in "The New Frontier of Machine Learning Systems". This course studies advanced topics in systems for Big Data and ML/AI.

    Here is the link to the Sisu of the course.


    Topics in Systems for Big Data and ML

    Overview

    The focus of this course is on selected topics at the intersection of Big Data/ML and systems.

    First, key system requirements arising from the complexity, reliability, and robustness of Big Data and ML applications and services will be analyzed and presented. Based on that, we will learn techniques for supporting performance engineering, configuration management, testing, and debugging of Big Data and ML. Such techniques are extremely important; they are cross-cutting topics for the course, regardless of the underlying systems that power Big Data and ML applications and services.

    Second, selected areas in systems for Big Data and ML will be presented. For each selected area, we will examine the state of the art and the strengths and weaknesses of its concepts and techniques. We will focus on engineering frameworks that can be used to develop Big Data and ML applications, in line with the cross-cutting topics mentioned above.

    Focus Areas in 2020-2022

    Advanced Topics in Software Systems will focus on the following areas:

    • Design and evaluation of system robustness, reliability, resilience, and elasticity for Big Data/ML (also with engineering work)
    • Testing, debugging, monitoring, and configuration management (also with engineering work)
    • Dataflows and orchestration frameworks for Big Data/ML (also with engineering work)
    • Edge systems and the edge-cloud continuum for Big Data/ML (also with engineering work)
    • New hardware architectures and quantum systems for Big Data/ML (more on the concepts and the state of the art)

    Registration

    A maximum of 16 participants will be accepted. Registration must be approved by the responsible teacher. The system allows you to register in MyCourses, but all registrations remain pending until they are confirmed by the head teacher.


    Manage Your Work in the Course

    As you join the course, you should:
    • Manage all your output in a Git repository. The repository will be made public at the end of the course, unless a strong intellectual property (IP) reason prevents this (this has to be discussed). Therefore, we suggest that you use version.aalto.fi or another public Git service such as GitHub, GitLab, or Bitbucket. Note that the repository does not have to be public before the demonstration session.
    • The artefacts in your repository will be: study logs, your topic presentation, and the code of your individual project. For grading and commenting purposes, you will have to copy the study logs and the topic presentation and submit them to mycourses.aalto.fi, but this will not take much time.
    This approach ensures that everyone can present their results as a coherent story in a public Git space.

    CS-E4660 Course GIT

    The GIT of CS-E4660 contains various materials for the course: https://version.aalto.fi/gitlab/sys4bigml/cs-e4660