Hide Code, Minimize Dependencies, Boost Performance - The PyTorch JIT
Tilman Krokotsch
Over the last few years, PyTorch has emerged as one of the most popular frameworks for deep learning research. With its "define-by-run" paradigm and Pythonic design, it encourages clean, object-oriented code and enables easy debugging with the standard Python debugger.
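As a minimal sketch of the define-by-run idea (the module, layer sizes, and threshold below are made up for illustration), the forward pass is plain Python, so data-dependent control flow and debugger breakpoints just work:

```python
import torch
import torch.nn as nn

class GatedNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(4, 4)

    def forward(self, x):
        # The computation graph is built as this Python code runs,
        # so ordinary control flow is allowed, and a pdb breakpoint
        # placed here would stop mid-forward-pass.
        h = self.layer(x)
        if h.norm() > 1.0:
            h = torch.relu(h)
        return h

net = GatedNet()
print(net(torch.randn(2, 4)))
```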
While these design choices are helpful in research, they come with significant disadvantages once training is done and the model is put to use. Loading your model into an application, which might even be written in another programming language, typically entails code refactoring and additional testing.
To alleviate these problems, PyTorch received a just-in-time compiler (JIT) last year that can convert your network into a graph-based representation, similar to TensorFlow. Models exported via the JIT hide your code, minimize code dependencies, and can be loaded into PyTorch's C++ API. Additionally, the JIT optimizes the model code for performance, boosting inference speed.
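As a minimal sketch of the export workflow (the TinyNet module and file name here are illustrative, not from the talk), a model can be compiled with torch.jit.script and saved as a self-contained archive:

```python
import torch
import torch.nn as nn

# A small illustrative network; any nn.Module works the same way.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = TinyNet().eval()

# Compile the model to TorchScript. torch.jit.script parses the Python
# source itself; torch.jit.trace(model, example_input) instead records
# the operations of a single example run.
scripted = torch.jit.script(model)

# The resulting archive bundles code, parameters, and attributes, so no
# Python source or class definition is needed to load it again.
scripted.save("tiny_net.pt")
```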
In this talk, we will take a look at the PyTorch JIT and how to use it. Through a hands-on example, we will investigate its capabilities and limitations, covering good design choices for your network and how to load your trained model into a C++ application.
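As a taste of the loading side (shown here from Python for brevity; the C++ frontend provides the analogous torch::jit::load), the saved archive can be restored and run without the original class definition, reusing the hypothetical file name from the sketch above:

```python
import torch

# Restore the compiled model from the archive; the original Python
# class definition is no longer needed.
loaded = torch.jit.load("tiny_net.pt")

# Run inference exactly like with a regular nn.Module.
with torch.no_grad():
    output = loaded(torch.randn(1, 4))
print(output)
```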