Embedded Machine Learning

Course Overview

Modern GPUs are powerful processors with high core counts. They are no longer used solely for graphics, but also to accelerate computationally intensive general-purpose tasks. In this course, we will look in detail at the GPU’s internal architecture, how it differs from general-purpose processors such as CPUs, and how to program GPUs. Powerful GPUs are available for exercises and experiments. Time permitting, we may also give a short introduction to IPUs (Intelligence Processing Units).

Contents

  • GPU architecture
  • CUDA programming
  • Parallel programming
  • Scheduling, code, and shared-memory optimizations
  • Introduction to multi-GPU programming
  • Advanced GPU architecture
  • Introduction to OpenCL & OpenACC

Requirements

Solid knowledge of C/C++ and the basics of computer architecture is recommended.

Notes

  • Frequency: winter term
  • The next edition of this course is scheduled for winter term 2024/25