At the PyTorch Conference on December 2, 2022, the PyTorch team announced a major update to the PyTorch ecosystem: PyTorch 2.0. This update prioritizes the flexibility that PyTorch users care deeply about, while giving PyTorch developers infrastructure improvements aimed at optimizing performance with compiler technologies. As developers of Apache TVM and members of the Apache TVM open source community, we at OctoML are especially excited about PyTorch 2.0 and the opportunities it brings for Apache TVM and PyTorch to provide a seamlessly integrated product and user experience. In this blog post, we will share some new user workflows in PyTorch 2.0, discuss the compatibility of existing PyTorch 1.X workflows, and explain how all of these workflows interoperate with Apache TVM.
PyTorch 2.0 + TVM User Workflows
OctoML is investing in a PyTorch 2.0 + Apache TVM integration because we believe that a PyTorch 2.0 integration will provide users with a low-code approach for utilizing Apache TVM in their machine learning R&D, and that Apache TVM’s ecosystem of optimizations across a variety of workloads and hardware targets will be beneficial to PyTorch users. Apache TVM utilizes a new technology introduced in PyTorch 2.0, TorchDynamo, which lets PyTorch users connect PyTorch graphs of any size to compilers like Apache TVM with just a single decorator in Python. Below, we show an example of how simple it will be to use PyTorch 2.0 + TVM with an existing ML model:
```python
# PyTorch + TVM full-model example.
import torch
import torch._dynamo as dynamo

@dynamo.optimize(tvm_compiler)
class MyModel(torch.nn.Module):
    """
    Put an existing ML model definition here. No code changes are
    required within the model itself to enable TorchDynamo
    compilation via Apache TVM.
    """
```
The `@dynamo.optimize(tvm_compiler)` decorator tells PyTorch and TorchDynamo to use Apache TVM to compile the ML model.
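To make the decorator concrete: the `tvm_compiler` object above is a TorchDynamo backend, i.e. a callable that receives a captured `torch.fx.GraphModule` along with example inputs and returns a compiled callable. The sketch below is our own illustration, not OctoML's actual integration code; it uses a hypothetical pass-through backend named `passthrough_backend` just to show the shape of that interface:

```python
import torch
import torch._dynamo as dynamo

def passthrough_backend(gm: torch.fx.GraphModule, example_inputs):
    # A real backend (e.g. a TVM-based one) would compile `gm` here;
    # this illustrative stand-in simply returns the captured graph's
    # forward function unchanged.
    return gm.forward

@dynamo.optimize(passthrough_backend)
def double_relu(x):
    return torch.relu(x) * 2

out = double_relu(torch.tensor([-1.0, 2.0]))
```

TorchDynamo traces `double_relu` into an FX graph, hands it to the backend, and caches the returned callable for subsequent calls.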
PyTorch 1.X + TVM Existing User Workflows
PyTorch community members who are already using PyTorch 1.X may be wondering whether their existing workflows will be impacted by the recent announcement. While PyTorch has stated that backwards compatibility is a priority, users of the existing PyTorch 1.X workflows may choose to migrate towards a PyTorch 2.0-based system due to the performance and user experience improvements gained from the transition. In this section, we’ll take a look at two common workflows and how they’re evolving: eager execution and model export.
- Eager Execution: PyTorch’s eager execution has been a central part of many users’ workflows since the creation of the project. PyTorch Eager acts like a Python interpreter, immediately evaluating a user’s PyTorch code as it runs. While this is a flexible interface for users who are actively developing and changing their models, it is quite slow and inefficient for production use cases.
- Model Export: PyTorch currently offers a model export capability known as TorchScript, which makes the model more predictable for production deployment scenarios. The TorchScript export process freezes the model at a single time point and serializes it, enabling a user to reload the state of the model in the future. While this is a useful functionality for models in production, PyTorch’s research has shown that TorchScript fails to express many different types of valid graphs within the PyTorch ecosystem. Furthermore, when users need to update their models, the model export process could contribute to significant and repeated development overhead.
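To illustrate the existing export workflow described above, here is a minimal sketch using a hypothetical toy model of our own (not taken from the PyTorch docs): TorchScript freezes the model, serializes it, and later restores it without needing the original Python class in scope.

```python
import io
import torch

class TinyModel(torch.nn.Module):  # hypothetical toy model
    def forward(self, x):
        return torch.relu(x).sum()

# Freeze the model at a single point in time and serialize it.
scripted = torch.jit.script(TinyModel())
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)

# Later (or in another process), reload the model's state.
buffer.seek(0)
reloaded = torch.jit.load(buffer)
result = reloaded(torch.ones(4))
```

In production, the serialized model would typically go to a file on disk rather than an in-memory buffer.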
When PyTorch 2.0 is released, we anticipate that some PyTorch eager execution users will try the new TorchDynamo-based workflows — especially if they only need to add a single decorator to their existing code to enable all the benefits that TorchDynamo provides. Additionally, PyTorch has preliminarily defined a model export interface through TorchDynamo to be released in PyTorch 2.0, which means that users who utilize the PyTorch 2.0 + Apache TVM integration via TorchDynamo will still have the ability to export their ML workloads for production deployment use cases.
Apache TVM is already available as a PyTorch 2.0 compilation option through TorchDynamo today. As the PyTorch community works hard on bringing the stable PyTorch 2.0 release to users in March 2023, OctoML and the Apache TVM community will continue to build out additional support and a tighter integration with the PyTorch 2.0 features listed below. We’re also working on an interactive demo to showcase the integration between PyTorch 2.0 and Apache TVM!
OctoML is specifically interested in developing future PyTorch integrations centered around:
- TorchInductor: Another recently announced PyTorch 2.0 technology is TorchInductor, a compiler that operates at the loop level (as opposed to TorchDynamo, which operates at the graph and subgraph level). In applications with tighter performance requirements, this finer-grained integration between PyTorch 2.0 and TVM will enable even more control over TVM within the PyTorch ecosystem. Additionally, an integration with TorchInductor will enable TVM to interoperate with other TorchInductor backends, such as OpenAI's Triton.
- Custom Operators: While custom operators have always been an integral part of the PyTorch ecosystem, the innovations in PyTorch 2.0 (specifically TorchDynamo and TorchInductor) enable Apache TVM to create tighter integrations with PyTorch users’ existing custom operators.
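For reference, one common way to define a custom operator in PyTorch today is via `torch.autograd.Function`. The toy example below is our own sketch (not TVM integration code) with a hypothetical operator named `MulTwo`; it defines both the forward computation and its gradient:

```python
import torch

class MulTwo(torch.autograd.Function):
    """A toy custom operator: y = 2 * x."""

    @staticmethod
    def forward(ctx, x):
        return x * 2

    @staticmethod
    def backward(ctx, grad_output):
        # dy/dx = 2, so the incoming gradient is scaled by 2.
        return grad_output * 2

x = torch.ones(3, requires_grad=True)
y = MulTwo.apply(x).sum()
y.backward()
```

Operators defined this way are opaque to graph-level tooling unless the compiler stack is taught about them, which is exactly where tighter TorchDynamo/TorchInductor integration helps.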