Topic outline

  • A preliminary list of topics is given below, but students may also propose their own. A supercomputing environment hosted at CSC will be set up for the course participants, so the number of participants is limited to 10-12 students. Depending on the selection of topics, students can work either individually or in teams. After the introductory lecture(s) on how the environment works, the instructors of each topic will set up weekly meetings with the student or team and provide materials and instructions for the project (introduction to the tool, documentation, papers, demonstration of usage). After the introductory session(s), the weekly meetings will serve as project update sessions. At the end of the course, each student or team is expected to demonstrate their project to the other students and teams. Passing the course requires participation in at least 80% of the sessions and a successful demonstration session. A missed session can be compensated for by writing a learning diary in the form of a short email (or equivalent) to the instructor.

    Preliminary topics

    • Evaluating function-as-a-service models in HPC for large-scale data analysis;

      Evaluating funcX for large-scale data analysis; Supervisor: Linh Truong

    • Human-in-the-loop for connecting ML pipelines and workflow-based data analysis in large-scale computing;

      Evaluating, designing and developing connectors to human-based analytics/decisions in ML/data analysis workflows; Supervisor: Linh Truong

    • Deployment and management of GPU-based containers for ML training in HPC;

      Explore how to manage a virtual cluster of GPU-based containers atop an HPC system for ML tasks (e.g., practical work with Singularity containers for MLPerf HPC Training on CSC Puhti); Supervisor: Linh Truong

    • Evaluating patterns for optimizing data locality in HPC nodes for large-scale data analysis/ML tasks;

      Tasks executed in an HPC node need to access data, but accessing data from the shared file system (e.g., Lustre) might be slow. Patterns for data coupling and movement between HPC storage or an outside cloud and the local nodes will be evaluated and suggested (e.g., with different HPC systems at Aalto/CSC); Supervisor: Linh Truong

    • Building, using, and optimizing a stencil-based application using a CUDA-MPI library with a domain-specific language on graphics processing units (GPUs);

      See https://bitbucket.org/jpekkila/astaroth/src/master/ ; Supervisor: Maarit J. Käpylä

    • Visualising and analysing large-scale real-world data in supercomputing environments;

      Supervisor: Maarit J. Käpylä; Bring your own data or use astronomical data from either observations or simulations!
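
To make the data-locality topic listed above more concrete, the following is a minimal sketch of one possible pattern: staging a file from the shared file system into node-local storage before a task reads it, and reusing the local copy on later calls. It is only an illustration, not the method the project will necessarily adopt. The function names `stage_to_local` and `default_local_dir` are hypothetical, and the use of a `LOCAL_SCRATCH` environment variable to locate node-local storage is an assumption that depends on the HPC system; the sketch falls back to the system temporary directory when the variable is not set.

```python
import os
import shutil
import tempfile
from pathlib import Path


def default_local_dir() -> Path:
    """Pick a node-local staging directory.

    Assumption: the cluster exposes node-local storage through the
    LOCAL_SCRATCH environment variable. If it is not set, fall back to
    the system temporary directory.
    """
    base = os.environ.get("LOCAL_SCRATCH", tempfile.gettempdir())
    return Path(base) / "staged_data"


def stage_to_local(shared_file: Path, local_dir: Path) -> Path:
    """Copy shared_file into local_dir unless an up-to-date copy exists.

    A local copy is considered up to date when it has the same size as
    the source and is not older than the source.
    """
    local_dir.mkdir(parents=True, exist_ok=True)
    local_file = local_dir / shared_file.name
    src = shared_file.stat()
    if local_file.exists():
        dst = local_file.stat()
        if dst.st_size == src.st_size and dst.st_mtime >= src.st_mtime:
            return local_file  # reuse the cached local copy
    # copy2 also preserves the modification time of the source file
    shutil.copy2(shared_file, local_file)
    return local_file
```

A task would call `stage_to_local(Path(shared_path), default_local_dir())` once and then read from the returned path. Whether such staging pays off depends on file sizes, access patterns, and the storage available on the node, which is the kind of question the topic sets out to evaluate.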