Episode 23

Code generation

Why does PyTorch use code generation as part of its build process? Why doesn't it use C++ templates? What things is code generation used for? What are the pros/cons of using code generation? What are some other ways to do the same things we currently do with code generation?

PyTorch Developer Podcast

June 4, 2021 · 16m 51s


Show Notes

Further reading.

Top-level file for the new code generation pipeline: https://github.com/pytorch/pytorch/blob/master/tools/codegen/gen.py
Out of tree external backend code generation from Brian Hirsh: https://github.com/pytorch/xla/issues/2871
Documentation for native_functions.yaml: https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/README.md (have you seen this README before? Yes you've seen this README before. Imma post it again.)

Outline:

High level: reduce the amount of code in PyTorch, easier to develop
Strongly typed Python
Stuff we're using codegen for
- Meta point: stuff C++ metaprogramming can't do
- C++ APIs (functions, methods on classes)
  - Especially for forwarding (operator dot doko, i.e. where is operator dot?)
  - Prototypes for C++ to implement
- YAML files used by external frameworks for binding (accidental)
- Python arg parsing
- pyi generation
- Autograd classes for saving saved data
- Otherwise complicated constexpr computation (e.g., parsing JIT schema)
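
To make the list above concrete, here is a minimal sketch of the idea in Python: parse a schema-like string into strongly typed dataclasses, then emit both a C++ prototype and a .pyi stub from the same data. Everything here is an assumption for illustration: the toy grammar, the names `Argument`, `FunctionSchema`, `parse_schema`, `cpp_declaration`, `pyi_stub`, and the type tables are hypothetical and are not the actual API of tools/codegen.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Argument:
    type: str  # schema type name, e.g. "Tensor" or "int"
    name: str


@dataclass(frozen=True)
class FunctionSchema:
    name: str
    arguments: Tuple[Argument, ...]
    returns: str


# Hypothetical type translation tables; a real mapping would be far richer.
CPP_ARG_TYPES = {"Tensor": "const Tensor &", "int": "int64_t", "float": "double"}
CPP_RETURN_TYPES = {"Tensor": "Tensor", "int": "int64_t", "float": "double"}
PY_TYPES = {"Tensor": "Tensor", "int": "int", "float": "float"}


def parse_schema(text: str) -> FunctionSchema:
    """Parse a toy schema such as 'add(Tensor self, Tensor other) -> Tensor'."""
    signature, returns = text.split("->")
    name, rest = signature.split("(", 1)
    arg_text = rest.strip().rstrip(")")
    arguments = tuple(
        Argument(*piece.split())  # "Tensor self" -> Argument("Tensor", "self")
        for piece in arg_text.split(",")
        if piece.strip()
    )
    return FunctionSchema(name.strip(), arguments, returns.strip())


def cpp_declaration(f: FunctionSchema) -> str:
    """Emit a C++ prototype for someone to implement."""
    args = ", ".join(f"{CPP_ARG_TYPES[a.type]} {a.name}" for a in f.arguments)
    return f"{CPP_RETURN_TYPES[f.returns]} {f.name}({args});"


def pyi_stub(f: FunctionSchema) -> str:
    """Emit a Python type stub line from the same parsed schema."""
    args = ", ".join(f"{a.name}: {PY_TYPES[a.type]}" for a in f.arguments)
    return f"def {f.name}({args}) -> {PY_TYPES[f.returns]}: ..."


schema = parse_schema("add(Tensor self, Tensor other) -> Tensor")
print(cpp_declaration(schema))
print(pyi_stub(schema))
```

The point of the sketch is that one parsed input drives several outputs in different languages, which is hard to express with C++ templates alone.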
Pros
- Better surface syntax (native_functions.yaml, JIT schema, derivatives.yaml)
- Better error messages (template messages famously bad)
- Easier to organize complicated code; esp. nontrivial input data structure
- Easier to debug by looking at generated code
Cons
- Not as portable (template can be used by anyone)
- Less good modeling for C++ type based metaprogramming (we've replicated a crappy version of C++ type system in our codegen)
Counterpoints in the design space
- C++ templates: just as efficient
- Boxed fallback: simpler, less efficient
Open question: can you have best of both worlds, e.g., with partially evaluated interpreters?
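
The boxed fallback versus codegen tradeoff in the outline can be sketched in Python as an analogy. A boxed fallback is a single function that handles every operator by reading its arguments from a uniform stack; the codegen alternative emits one wrapper per operator, specialized to its signature. This is only an illustration under stated assumptions: the `KERNELS` table, `boxed_logging_fallback`, and `generate_logging_wrapper` are hypothetical names, and PyTorch's actual boxed fallbacks are written in C++.

```python
from typing import Any, Callable, Dict, List

# Toy operator table standing in for real kernels (hypothetical).
KERNELS: Dict[str, Callable[..., Any]] = {
    "add": lambda a, b: a + b,
    "neg": lambda a: -a,
}

LOG: List[str] = []


def boxed_logging_fallback(op_name: str, stack: List[Any]) -> None:
    """One function for every operator: arguments arrive in a uniform list,
    and the result is pushed back onto the same list."""
    LOG.append(f"boxed:{op_name}")
    args = list(stack)
    stack.clear()
    stack.append(KERNELS[op_name](*args))


def generate_logging_wrapper(op_name: str, arg_names: List[str]) -> str:
    """Emit Python source for a wrapper specialized to one operator."""
    params = ", ".join(arg_names)
    return (
        f"def {op_name}_logged({params}):\n"
        f"    LOG.append('generated:{op_name}')\n"
        f"    return KERNELS['{op_name}']({params})\n"
    )


# Simpler path: no per-operator code, but every call boxes its arguments.
stack = [2, 3]
boxed_logging_fallback("add", stack)

# Codegen path: source is produced per operator, then compiled and called directly.
namespace: Dict[str, Any] = {"LOG": LOG, "KERNELS": KERNELS}
exec(generate_logging_wrapper("add", ["a", "b"]), namespace)
add_logged = namespace["add_logged"]
result = add_logged(2, 3)
```

The boxed version is written once and covers `neg` for free; the generated version needs a codegen step per operator but has a readable, debuggable wrapper with a fixed signature.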
