Embedded Machine Learning

Course Overview

Modern GPUs are powerful processors with high core counts. They are no longer used solely for graphics, but also to accelerate computationally intensive general-purpose tasks. In this course, we will look in detail at the GPU’s internal architecture, how it differs from general-purpose processors such as CPUs, and how to program GPUs. Powerful GPUs are available for exercises and experiments. If time permits, we may also give a short introduction to IPUs (Intelligence Processing Units).


  • GPU architecture
  • CUDA programming
  • Parallel programming
  • Scheduling/code/shared memory optimizations
  • Introduction to multi-GPU programming
  • Advanced GPU architecture
  • Introduction to OpenCL & OpenACC


Solid knowledge of C/C++ and the basics of computer architecture is recommended.


  • Course starts Oct 17 14:00 c.t.
  • The start of the exercises will be arranged with the teaching assistant.
  • The room is U014 in the basement of OMZ/INF350. Enter the building from the east. If you don’t see a ZITI sign, you are probably in the wrong place.