Episode 22

Why is autograd so complicated

Why is autograd so complicated? What are the constraints and features that go into making it complicated? What's up with it being written in C++? What's with derivatives.yaml and code generation? What's going on with views and mutation? What's up with hooks and anomaly mode? What's reentrant execution? Why is it relevant to checkpointing? What's the distributed autograd engine?

PyTorch Developer Podcast

June 3, 2021 · 15m 45s

Show Notes

Further reading.

Autograd notes in the docs https://pytorch.org/docs/stable/notes/autograd.html
derivatives.yaml https://github.com/pytorch/pytorch/blob/master/tools/autograd/derivatives.yaml
Paper on autograd engine in PyTorch https://openreview.net/pdf/25b8eee6c373d48b84e5e9c6e10e7cbbbce4ac73.pdf