Episode 24

torch.nn

What goes into the implementation of torch.nn? Why do NN modules exist in the first place? What's the function of Parameter? How do modules actually track all the parameters in question? What is all of the goop in the top level NN module class? What are some new developments in torch.nn modules? What are some open problems with our modules?

PyTorch Developer Podcast

June 7, 2021 · 14m 18s

Show Notes

Liner notes:

  • Python for hackability (the C++ API is a separate reimplementation)
  • parameters
    • parameter collection (for optimization)
    • buffers: not considered optimizable
  • modules
    • functorial operation (_apply)
    • jit script: staged computation (__init__ runs as ordinary Python and is not scripted)
    • __call__ dispatches to forward (with extra instrumentation, such as hooks)
    • serialization / state_dict
  • new stuff: device kwarg (Joel Schlosser)
  • new stuff: lazy modules (emcastillo)
  • open problems: parameter initialization
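
The module mechanics in the notes above (parameter collection, buffers, _apply, __call__ to forward, state_dict) can be illustrated with a toy sketch in plain Python. This is not the torch.nn source: the method names mirror torch.nn.Module, but the bodies are simplified assumptions, values are plain floats rather than tensors, and the Scale and Net classes are hypothetical examples.

```python
class Parameter:
    # Toy stand-in for torch.nn.Parameter: wraps a value marked as optimizable.
    def __init__(self, data):
        self.data = data


class Module:
    # Toy module. Method names mirror torch.nn.Module; the bodies are a sketch.
    def __init__(self):
        # object.__setattr__ bypasses the interception defined below.
        for store in ("_parameters", "_buffers", "_modules"):
            object.__setattr__(self, store, {})
        object.__setattr__(self, "_forward_hooks", [])

    def __setattr__(self, name, value):
        # Parameter collection: attribute assignment is sorted by type.
        if isinstance(value, Parameter):
            self._parameters[name] = value
        elif isinstance(value, Module):
            self._modules[name] = value
        else:
            object.__setattr__(self, name, value)

    def __getattr__(self, name):
        # Runs only when normal lookup fails, i.e. for registered entries.
        for store in ("_parameters", "_buffers", "_modules"):
            if name in self.__dict__.get(store, ()):
                return self.__dict__[store][name]
        raise AttributeError(name)

    def register_buffer(self, name, value):
        # Buffers are saved and transformed with the module but not optimized.
        self._buffers[name] = value

    def parameters(self):
        # Recursively yield every Parameter, e.g. to hand to an optimizer.
        yield from self._parameters.values()
        for child in self._modules.values():
            yield from child.parameters()

    def _apply(self, fn):
        # Functorial operation: map fn over all state, keeping the structure.
        for child in self._modules.values():
            child._apply(fn)
        for param in self._parameters.values():
            param.data = fn(param.data)
        for name in self._buffers:
            self._buffers[name] = fn(self._buffers[name])
        return self

    def state_dict(self, prefix=""):
        # Flatten parameters and buffers into dotted names for serialization.
        out = {prefix + n: p.data for n, p in self._parameters.items()}
        out.update({prefix + n: b for n, b in self._buffers.items()})
        for name, child in self._modules.items():
            out.update(child.state_dict(prefix + name + "."))
        return out

    def __call__(self, *args):
        # __call__ wraps forward so instrumentation (hooks) can run around it.
        result = self.forward(*args)
        for hook in self._forward_hooks:
            hook(self, args, result)
        return result


class Scale(Module):  # hypothetical example module
    def __init__(self):
        super().__init__()
        self.weight = Parameter(2.0)
        self.register_buffer("running_mean", 0.5)

    def forward(self, x):
        return self.weight.data * x


class Net(Module):  # hypothetical example module with one child
    def __init__(self):
        super().__init__()
        self.inner = Scale()
        self.bias = Parameter(1.0)

    def forward(self, x):
        return self.inner(x) + self.bias.data


net = Net()
seen = []
net._forward_hooks.append(lambda module, args, out: seen.append(out))
output = net(3.0)                          # goes through __call__, then forward
num_params = len(list(net.parameters()))   # buffers are not counted
net._apply(lambda value: value * 10)       # stands in for a dtype or device move
state = net.state_dict()
```

In this sketch the optimizer-facing view (parameters) skips the buffer, while _apply and state_dict both include it, which matches the distinction the notes draw between parameters and buffers. The real Module does considerably more around each of these steps.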