How to deploy the PyTorch Lightning model to production
This article explains how to deploy a PyTorch Lightning model to production. The content is straightforward and easy to follow; read along to learn each approach step by step.
Every way to deploy a PyTorch Lightning model for inference
There are three ways to export a PyTorch Lightning model for serving:
Save the model as a PyTorch checkpoint
Convert the model to ONNX
Export the model to TorchScript
We can serve all three with Cortex.
1. Package and deploy PyTorch Lightning modules directly
Starting with the simplest approach, let's deploy a PyTorch Lightning model without any conversion steps.
The PyTorch Lightning Trainer, a class that abstracts away boilerplate training code (think training and validation steps), has a built-in save_checkpoint() function that saves your model as a .ckpt file. To save the model as a checkpoint, simply add the following code to your training script:
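A minimal sketch of what that looks like; the model, dataloader, and filename here are illustrative:

import pytorch_lightning as pl

# assumes `model` is your LightningModule and `train_loader` your DataLoader
trainer = pl.Trainer(max_epochs=10)
trainer.fit(model, train_loader)

# save_checkpoint() writes the model's weights and hyperparameters to a .ckpt file;
# "model.ckpt" is an arbitrary filename chosen for illustration
trainer.save_checkpoint("model.ckpt")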
Now, before we start serving this checkpoint, it's important to note that while I keep saying "PyTorch Lightning model," PyTorch Lightning is a wrapper around PyTorch; the project's README literally says "PyTorch Lightning is just organized PyTorch." The exported model is therefore a standard PyTorch model and can be used accordingly.
With our checkpoint saved, we can easily serve the model in Cortex. If you're not familiar with Cortex, you can get up to speed quickly here, but the short version of the Cortex deployment process is:
We write a prediction API for our model in Python
We define our API's infrastructure and behavior in YAML
We deploy the API with a CLI command
Our prediction API will use Cortex's Python Predictor class, defining an __init__() function to initialize the API and load the model, and a predict() function to serve predictions when queried:
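A minimal sketch of such a predictor; the LightningModule subclass MyModel, the model.py module, the local checkpoint path, and the JSON payload format are all illustrative assumptions:

# predictor.py
import torch
from model import MyModel  # hypothetical module containing our LightningModule

class PythonPredictor:
    def __init__(self, config):
        # load the checkpoint saved by trainer.save_checkpoint() and switch to eval mode
        self.model = MyModel.load_from_checkpoint("model.ckpt")
        self.model.eval()

    def predict(self, payload):
        # payload is the parsed JSON body of the request
        with torch.no_grad():
            inputs = torch.tensor(payload["input"], dtype=torch.float32)
            output = self.model(inputs)
        return output.tolist()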
It's that simple. We repurpose some code from our training script, add a bit of inference logic, and that's it. One thing to note: if you upload your model to S3 (recommended), you'll need to add some logic for accessing it.
Next, we configure the infrastructure in YAML:
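A sketch of the config; the exact schema varies across Cortex versions, so treat the field names as assumptions:

# cortex.yaml
- name: pytorch-api
  predictor:
    type: python
    path: predictor.py
  compute:
    cpu: 1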
Once again, simple. We give our API a name, point Cortex to our prediction API, and allocate some CPU.
Next, we deploy it:
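Assuming the config above is saved as cortex.yaml, deployment is a single command:

$ cortex deploy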
Note that we can also deploy to a cluster, spun up and managed by Cortex:
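A sketch of what that might look like with the 0.x CLI; the exact subcommands and flags are assumptions that may differ by version:

$ cortex cluster up        # provisions and manages a cluster on AWS (one-time setup)
$ cortex deploy --env aws  # deploys the API to that cluster instead of locally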
For all deployments, Cortex containerizes our API and exposes it as a web service. For cloud deployments, Cortex configures load balancing, autoscaling, monitoring, updating, and many other infrastructure features.
And that's it! We now have a live web API that serves model predictions on request.
2. Export to ONNX and serve via ONNX Runtime
Now that we've deployed a plain PyTorch checkpoint, let's make things a little more complicated.
PyTorch Lightning recently added a convenient abstraction for exporting models to ONNX (previously you could use PyTorch's built-in conversion functions, though they required more boilerplate). To export your model to ONNX, simply add the following code to your training script:
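A minimal sketch using LightningModule's to_onnx() method; the filename and input shape are illustrative:

import torch

# assumes `model` is a trained LightningModule; the input sample must match
# the shape the model's forward() expects (here, a flattened 28x28 image)
input_sample = torch.randn((1, 28 * 28))
model.to_onnx("model.onnx", input_sample, export_params=True)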
Note that your input sample should mimic the shape of the actual model input.
Once you've exported the ONNX model, you can serve it with Cortex's ONNX Predictor. The code looks basically the same, and the process is identical. For example, here is an ONNX prediction API:
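A sketch of an ONNX Predictor, again assuming a JSON payload with an "input" field:

# predictor.py
import numpy as np

class ONNXPredictor:
    def __init__(self, onnx_client, config):
        # onnx_client is Cortex's handle to the ONNX Runtime session serving our model
        self.client = onnx_client

    def predict(self, payload):
        # the payload format is an assumption for illustration
        model_input = np.asarray(payload["input"], dtype=np.float32)
        return self.client.predict(model_input)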
It's basically the same. The only difference is that instead of initializing the model directly, we access it through onnx_client, an ONNX Runtime session that Cortex spins up to serve our model.
Our YAML also looks very similar:
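A sketch of the ONNX config; the monitoring and model fields shown are assumptions based on the 0.x schema and may differ by version:

# cortex.yaml
- name: onnx-api
  predictor:
    type: onnx
    path: predictor.py
    model: s3://my-bucket/model.onnx  # hypothetical bucket path
  monitoring:
    model_type: classification
  compute:
    cpu: 1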
I've included a monitoring field here just to show how easy it is to configure, and there are a few ONNX-specific fields, but otherwise it's the same YAML.
Finally, we deploy with the same $ cortex deploy command as before, and our ONNX API is live.
3. Serialize with TorchScript's JIT compiler
For our final deployment, we'll export our PyTorch Lightning model to TorchScript and serve it using PyTorch's JIT compiler. To export the model, simply add this to your training script:
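A minimal sketch using LightningModule's to_torchscript() method; the filename is illustrative:

import torch

# assumes `model` is a trained LightningModule
script = model.to_torchscript()
torch.jit.save(script, "model_ts.pt")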
The prediction API for this is almost identical to the plain PyTorch example:
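A sketch that loads the TorchScript archive instead of the checkpoint; the payload format is again assumed:

# predictor.py
import torch

class PythonPredictor:
    def __init__(self, config):
        # load the TorchScript archive saved by torch.jit.save()
        self.model = torch.jit.load("model_ts.pt")
        self.model.eval()

    def predict(self, payload):
        with torch.no_grad():
            inputs = torch.tensor(payload["input"], dtype=torch.float32)
            return self.model(inputs).tolist()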
The YAML stays the same as before, and the CLI command is of course consistent. If we want, we can even update our previous PyTorch API to use the new model: we simply replace our old predictor.py script with the new one and run $ cortex deploy again.
Cortex automatically performs a rolling update here, in which the new API is spun up and then swapped in for the old one, avoiding any downtime between model updates.
And that's all there is to it. You now have a fully operational prediction API for realtime inference, serving predictions from a TorchScript model.
Thank you for reading. That concludes "how to deploy the PyTorch Lightning model to production." After studying this article, I believe you have a deeper understanding of the topic, though the specifics still need to be verified in practice. More articles on related topics will follow; you're welcome to keep reading!