Credits: 5

Schedule: 15.04.2019 - 24.05.2019

Contact information for the course (applies in this implementation): 

Please see

Our primary communication channel is Slack. If you have any questions, please ask the course staff there. If you have difficulty joining Slack, or it does not work for you for some other reason, please email Jukka Suomela.

Teaching Period (valid 01.08.2018-31.07.2020): 

V (Spring)

Learning Outcomes (valid 01.08.2018-31.07.2020): 

After this course, you will know how to write computationally intensive C or C++ code that makes efficient use of dozens of CPU cores. You will learn how to partition large-scale computations between multiple processor cores, and how to choose the best memory layout for your data structures. You will also get hands-on experience of offloading computations from CPUs to GPUs. You will learn new kinds of algorithm design techniques that are relevant in the context of parallel computers, and you will also learn which of these techniques actually work in practice on modern multicore CPUs and GPUs.

Content (valid 01.08.2018-31.07.2020): 

This is a practical hands-on course on algorithm engineering for modern parallel computers. The students will learn how to design programs that make the best possible use of the computing power of multicore CPUs and GPUs. The course projects will cover both numerical and combinatorial problems; the sole objective is to solve the task at hand in the shortest possible time. We will learn a whole range of techniques for speeding up computations, from bit manipulation hacks and special CPU instructions to high-level techniques such as choosing the right memory layout that makes the best possible use of the cache hierarchy. The main tools that we will use are C or C++, OpenMP or Intel TBB, and OpenCL or CUDA.

Details on the course content (applies in this implementation): 

Students will learn:

  • How to do multicore CPU programming (multithreading, OpenMP)
  • How to exploit instruction-level parallelism
  • How to use vector instructions (SIMD, AVX)
  • How to program GPUs (CUDA)
  • How to benefit from data reuse in registers (CPU and GPU), caches (CPU), and shared memory (GPU)
  • How to choose the right memory access pattern for CPU and GPU code
  • How to benchmark and identify performance bottlenecks

We will also discuss some more advanced material:

  • How to read assembly code produced by the compiler
  • How to use hardware and software prefetching

We will use a Linux environment, GCC, and Git.
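
To give a flavour of two of the topics listed above (multicore programming with OpenMP and instruction-level parallelism), here is a minimal C++ sketch. It is not taken from the course material: the function name parallel_sum and the choice of four accumulators are assumptions made purely for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the course material): sums a vector of doubles.
double parallel_sum(const std::vector<double>& data) {
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(data.size());
    const std::ptrdiff_t blocks = n / 4;

    // Four independent accumulators: the additions within one iteration do
    // not depend on each other, so the CPU can overlap them
    // (instruction-level parallelism).
    double acc0 = 0.0, acc1 = 0.0, acc2 = 0.0, acc3 = 0.0;

    // With OpenMP enabled (for example, g++ -fopenmp), the iterations are
    // divided among threads and the per-thread accumulators are combined at
    // the end. Without OpenMP the pragma is ignored and the loop runs
    // sequentially with the same result for this input.
    #pragma omp parallel for reduction(+ : acc0, acc1, acc2, acc3)
    for (std::ptrdiff_t b = 0; b < blocks; ++b) {
        const std::ptrdiff_t i = 4 * b;
        acc0 += data[i + 0];
        acc1 += data[i + 1];
        acc2 += data[i + 2];
        acc3 += data[i + 3];
    }

    // Combine the accumulators and handle the last n % 4 elements.
    double total = (acc0 + acc1) + (acc2 + acc3);
    for (std::ptrdiff_t i = 4 * blocks; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

Note that this changes the order of the floating-point additions, so for general inputs the result may differ slightly from a plain left-to-right sum.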

Assessment Methods and Criteria (valid 01.08.2018-31.07.2020): 

Programming exercises.

Elaboration of the evaluation criteria and methods, and acquainting students with the evaluation (applies in this implementation): 

Solve programming exercises, correctly and efficiently, and return your solutions on time via GitHub. There are both “recommended exercises” and “challenging exercises”. If you solve all recommended exercises correctly and sufficiently efficiently, you can get up to 77 points. The grade thresholds are:

  • 38 points: grade 1/5
  • 45 points: grade 2/5
  • 51 points: grade 3/5
  • 58 points: grade 4/5
  • 64 points: grade 5/5

For the full list of exercises, see
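
The grade thresholds listed above can be read as a simple lookup. The following C++ sketch only restates that list; the function name grade_for_points is hypothetical, and returning 0 for fewer than 38 points is an assumption, since this page does not say how a failed course is recorded.

```cpp
// Hypothetical helper: maps a point total to a grade using the
// thresholds listed on this page.
int grade_for_points(int points) {
    if (points >= 64) return 5;
    if (points >= 58) return 4;
    if (points >= 51) return 3;
    if (points >= 45) return 2;
    if (points >= 38) return 1;
    return 0;  // assumption: below the lowest threshold, no passing grade
}
```

For example, 60 points falls between the thresholds for grades 4 and 5, so it maps to grade 4.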

Details on calculating the workload (applies in this implementation): 

5 credits / 6 weeks ≈ 22 hours / week:

  • lecture: 2 hours/week
  • exercise sessions: 0–4 hours/week
  • solving exercises and self-study: 16–20 hours/week

Study Material (valid 01.08.2018-31.07.2020): 

Available online.

Details on the course materials (applies in this implementation): 

Available at

Substitutes for Courses (valid 01.08.2018-31.07.2020): 

ICS-E4020 Programming Parallel Computers

Prerequisites (valid 01.08.2018-31.07.2020): 

No prior knowledge of parallel programming is needed. Students should have a good understanding of computer programming, algorithms, and data structures, and a working knowledge of either the C or the C++ programming language. While this course is primarily targeted at Master's students, advanced Bachelor's students are welcome to join if they have sufficient background knowledge and programming skills. At a minimum, students should have completed all first-year and second-year courses of their Bachelor's degree.

Grading Scale (valid 01.08.2018-31.07.2020): 


Details on the schedule (applies in this implementation): 

The important deadlines are:

  • Exercises: every Sunday at 23:59 on GitHub.
  • Prerequisite test: Friday, 19 April 2019, at noon in A+.

Please see for the list of all exercises and their deadlines, and for the recommended path.

