
Functionalization
Functionalization is the process by which we remove mutation from autograd graphs in PyTorch, leaving us with a purely functional graph that we can execute in the normal way. Why do we need to do functionalization? What makes it not so easy to do? How do we do it? And how does it compare to mutation removal that you might see in a compiler?
Show Notes
Further reading:
- Section 3.1 of this paper on PyTorch AD https://openreview.net/pdf/25b8eee6c373d48b84e5e9c6e10e7cbbbce4ac73.pdf predates our implementation of in-place autograd, but it accurately reports the subtleties and correctly predicts the implementation strategy we ended up taking
- RFC to generalize the functionalization mechanism to be available to arbitrary backends https://github.com/pytorch/rfcs/pull/19
- Code that handles lazily updating views when the base is updated https://github.com/pytorch/pytorch/blob/e5e095cbe4dbc5a601f98e6134dcbd59c6342d7d/torch/csrc/autograd/variable.cpp#L556-L603
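
To make the idea of lazily updating views a little more concrete, here is a toy sketch in plain Python. It is not PyTorch's implementation. The names `Base`, `SliceView`, `replace`, `read` and `mul_` are all made up for illustration, and tuples stand in for tensors. The assumption is that an in-place op on a view can be modeled as a pure computation whose result is written back into the base as a brand new value, with a version counter telling the other views that their cached contents are stale.

```python
class Base:
    """Owns the data. Every 'mutation' swaps in a new immutable tuple."""

    def __init__(self, values):
        self.values = tuple(values)
        self.version = 0  # bumped each time the base is replaced

    def replace(self, new_values):
        self.values = tuple(new_values)
        self.version += 1


class SliceView:
    """Hypothetical view of base[start:stop], refreshed lazily on read."""

    def __init__(self, base, start, stop):
        self.base = base
        self.start = start
        self.stop = stop
        self._cached = base.values[start:stop]
        self._seen_version = base.version
        self.refreshes = 0  # counts how often the cache was rebuilt

    def read(self):
        # Rebuild the cached slice only if the base changed since last read.
        if self._seen_version != self.base.version:
            self._cached = self.base.values[self.start:self.stop]
            self._seen_version = self.base.version
            self.refreshes += 1
        return self._cached

    def mul_(self, k):
        # "In-place" multiply with the mutation removed: compute a fresh
        # slice, then build a fresh base value that contains it.
        new_slice = tuple(x * k for x in self.read())
        old = self.base.values
        self.base.replace(old[:self.start] + new_slice + old[self.stop:])
        return self


base = Base([1.0, 2.0, 3.0, 4.0])
left = SliceView(base, 0, 2)      # aliases elements 0 and 1
overlap = SliceView(base, 1, 3)   # aliases elements 1 and 2

left.mul_(10.0)
print(base.values)  # (10.0, 20.0, 3.0, 4.0)
```

Nothing touches `overlap` when `left.mul_` runs. The first `overlap.read()` afterwards notices the version mismatch and rebuilds its slice from the new base, which is the lazy part.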