Episode 15

Shared memory

What is shared memory? How is it used in your operating system? How is it used in PyTorch? What's shared memory good for in deep learning? Why use multiple processes rather than one process on a single node? What's the point of PyTorch's shared memory manager? How are allocators for shared memory implemented? How does CUDA shared memory work? What is the difference between CUDA shared memory and CPU shared memory? How did we implement safer CUDA shared memory?

PyTorch Developer Podcast

May 24, 2021 · 10m 45s

Show Notes

Further reading.

Implementations of the vanilla shared memory allocator https://github.com/pytorch/pytorch/blob/master/aten/src/TH/THAllocator.cpp and of the fancy managed allocator https://github.com/pytorch/pytorch/blob/master/torch/lib/libshm/libshm.h
Multiprocessing best practices describes some things one should be careful about when working with shared memory https://pytorch.org/docs/stable/notes/multiprocessing.html
More details on how CUDA shared memory works https://pytorch.org/docs/stable/multiprocessing.html#multiprocessing-cuda-sharing-details
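
As a rough illustration of what operating-system shared memory means, here is a minimal sketch using only Python's standard `multiprocessing.shared_memory` module. It is not PyTorch's allocator (the implementations linked above are in C++), and the helper name `write_then_read` is hypothetical. To stay short, the sketch opens a second handle on the same named block from within one process. A separate process would attach by the same name in the same way.

```python
from multiprocessing import shared_memory


def write_then_read(values):
    """Create a named shared block, modify it via a second handle, read it back."""
    # Create a new named block; the OS may round the size up to a page boundary.
    owner = shared_memory.SharedMemory(create=True, size=len(values))
    try:
        owner.buf[:len(values)] = bytes(values)
        # Attach to the same block by name, as another process would.
        peer = shared_memory.SharedMemory(name=owner.name)
        try:
            # Write through the second mapping; no data is copied between handles.
            peer.buf[0] = 99
            seen_by_owner = bytes(owner.buf[:len(values)])
        finally:
            peer.close()
    finally:
        owner.close()
        # The block outlives its handles until someone unlinks it explicitly.
        owner.unlink()
    return seen_by_owner
```

The explicit `unlink()` hints at why cleanup matters: a named block persists until it is removed, so a process that crashes before unlinking can leave the block behind.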
