
Serialization
What is serialization? Why do I care about it? How is serialization done in general in Python? How does pickling work? How does PyTorch implement pickling for its objects? What are some pitfalls of implementing pickling? What do backwards compatibility and forwards compatibility mean in the context of serialization? What's the difference between directly pickling and using torch.save/load? So what the heck is up with JIT/TorchScript serialization? Why did we use zip files? What were some design principles for the serialization format? Why are there two implementations of serialization in PyTorch? Does the fact that PyTorch uses pickling for serialization mean that our serialization format is insecure?
Show Notes
Further reading.
- TorchScript serialization design doc https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/docs/serialization.md
- Evolution of serialization formats over time https://github.com/pytorch/pytorch/issues/31877
- Code pointers:
  - Tensor __reduce_ex__ https://github.com/pytorch/pytorch/blob/de845020a0da39e621db984515bc1cce03f526ea/torch/_tensor.py#L97-L178
  - Python side serialization https://github.com/pytorch/pytorch/blob/de845020a0da39e621db984515bc1cce03f526ea/torch/serialization.py#L384-L499
  - C++ side serialization https://github.com/pytorch/pytorch/tree/master/torch/csrc/jit/serialization
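For listeners who have not seen the pickle hook that the Tensor __reduce_ex__ pointer refers to, here is a minimal sketch of how the hook works in plain Python. It uses only the standard library; the Point class and the _rebuild_point helper are hypothetical names made up for illustration, and this is not how PyTorch's own Tensor implementation is written.

```python
import pickle


class Point:
    """Hypothetical class used only for illustration; not a PyTorch type."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __reduce_ex__(self, protocol):
        # Tell pickle how to rebuild this object on load:
        # "call _rebuild_point with these arguments".
        return (_rebuild_point, (self.x, self.y))


def _rebuild_point(x, y):
    # Module-level function, so pickle can store a reference to it by name.
    return Point(x, y)


data = pickle.dumps(Point(1, 2))   # bytes that name _rebuild_point and its args
restored = pickle.loads(data)      # looks up _rebuild_point and calls it
print(restored.x, restored.y)
```

Note that loading a pickle means calling whatever callable the byte stream names. That is the reason the Python documentation warns against unpickling data from untrusted sources, and it is the background to the security question in the episode.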