Episode 6

Just enough CUDA to be dangerous

Ever wanted to learn about CUDA but not sure where to start? In this sixteen minute episode I try to jam in as much CUDA knowledge as could be reasonably expected in a podcast. You won't know how to write a kernel after this episode, but you'll know about what a GPU is, what the general CUDA programming model is, why asynchronous execution makes everything complicated, and some general principles PyTorch abides by when designing CUDA kernels.

PyTorch Developer Podcast

May 11, 202116m 32s

Audio is streamed directly from the publisher (cdn.simplecast.com) as published in their RSS feed. Play Podcasts does not host this file. Rights-holders can request removal through the copyright & takedown page.

Original episode page

Show Notes

Further reading:

PyTorch docs on CUDA semantics https://pytorch.org/docs/stable/notes/cuda.html
The book I was recommended for learning CUDA when I first showed up at PyToch: Programming Massively Parallel Processors https://www.amazon.com/Programming-Massively-Parallel-Processors-Hands/dp/0128119861
The environment variable that makes CUDA synchronous is CUDA_LAUNCH_BLOCKING=1. cuda-memcheck is also useful for debugging CUDA problems https://docs.nvidia.com/cuda/cuda-memcheck/index.html

← All episodes of PyTorch Developer Podcast