How to deploy deep learning and traditional machine learning models using ONNX

2025-07-04 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains in detail how to use ONNX to deploy deep learning and traditional machine learning models. It walks through the relevant analysis and a working approach, in the hope of helping readers who face this problem find a simpler and more practical method.

Introduction to ONNX

ONNX (Open Neural Network Exchange) is an open format for representing deep neural network models. It was introduced by Microsoft and Facebook in 2017 and was quickly adopted by major vendors and frameworks. After a few years of development it has become the de facto standard for representing deep learning models, and through ONNX-ML it also supports traditional non-neural-network machine learning models, with the aim of unifying the exchange standard for AI models as a whole.

ONNX defines a standard format that is independent of environment and platform. This provides the basis for AI model interoperability, so that a model can be exchanged between different frameworks and environments. Hardware and software vendors can optimize model performance against the ONNX standard, which benefits every ONNX-compatible framework. At present, ONNX focuses mainly on model prediction (inference). Models trained in different frameworks can be deployed in any ONNX-compatible runtime once they have been converted to ONNX format.

Introduction of ONNX Standard

The ONNX specification consists of the following parts:

An extensible computational graph model: it defines a general intermediate representation (IR) of the computational graph.

Built-in operator sets: ai.onnx and ai.onnx.ml. ai.onnx is the default operator set, mainly for neural network models, while ai.onnx.ml mainly targets traditional non-neural-network machine learning models.

Standard data types, including tensors, sequences and maps.

Currently, there are two official variants of the ONNX specification. They differ mainly in the supported types and the default operator set. The ONNX neural network variant uses only tensors as inputs and outputs, while ONNX-ML, which supports traditional machine learning models, also accepts sequences and maps. ONNX-ML extends the ONNX operator set to support non-neural-network algorithms.

ONNX uses protobuf to serialize AI models. The top level is a model (Model) structure, made up of associated metadata and a graph (Graph). The graph consists of metadata, model parameters, inputs, outputs, and a sequence of computing nodes (Node). These nodes form a directed acyclic computational graph. Each node represents one operator call and consists mainly of a node name, an operator, an input list, an output list and an attribute list. The attribute list mainly records constants needed at run time, such as the coefficient values produced during model training.

To get a more concrete feel for the ONNX format, let's train a simple LogisticRegression model and export it to ONNX. We again use the familiar iris classification dataset:
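The structure described above can be sketched with plain Python dataclasses. This is only an illustration of the shape of the data, not the API of the onnx package: the class names, the is_topologically_ordered helper and the two-node toy graph are all hypothetical. The check relies on the ONNX requirement that the nodes of a graph are listed in topological order, and for simplicity it treats model parameters as ordinary graph inputs.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One operator call: name, operator type, inputs, outputs, attributes."""
    name: str
    op_type: str
    inputs: List[str]
    outputs: List[str]
    attributes: Dict[str, object] = field(default_factory=dict)


@dataclass
class Graph:
    """Graph inputs and outputs plus an ordered sequence of nodes."""
    inputs: List[str]
    outputs: List[str]
    nodes: List[Node]


def is_topologically_ordered(graph: Graph) -> bool:
    """True if every node input is a graph input or an earlier node's output."""
    available = set(graph.inputs)
    for node in graph.nodes:
        if any(name not in available for name in node.inputs):
            return False
        available.update(node.outputs)
    # Every declared graph output must have been produced somewhere.
    return all(name in available for name in graph.outputs)


# Hypothetical two-node graph computing y = x * w + b.
toy_graph = Graph(
    inputs=["x", "w", "b"],
    outputs=["y"],
    nodes=[
        Node("mul", "Mul", ["x", "w"], ["xw"]),
        Node("add", "Add", ["xw", "b"], ["y"]),
    ],
)

print(is_topologically_ordered(toy_graph))
```

Because each node only consumes values that already exist, a graph that passes this check cannot contain a cycle.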

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y)
clr = LogisticRegression()
clr.fit(X_train, y_train)

Use skl2onnx to serialize the Scikit-learn model into ONNX format:

from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

# One input named float_input: a float tensor of shape [1, 4]
initial_type = [('float_input', FloatTensorType([1, 4]))]
onx = convert_sklearn(clr, initial_types=initial_type)
with open("logreg_iris.onnx", "wb") as f:
    f.write(onx.SerializeToString())

Use ONNX Python API to view and validate the model:

import onnx

model = onnx.load('logreg_iris.onnx')
print(model)

Output model information is as follows:

ir_version: 5
producer_name: "skl2onnx"
producer_version: "1.5.1"
domain: "ai.onnx"
model_version: 0
doc_string: ""
graph {
  node {
    input: "float_input"
    output: "label"
    output: "probability_tensor"
    name: "LinearClassifier"
    op_type: "LinearClassifier"
    attribute {
      name: "classlabels_ints"
      ints: 0
      ints: 1
      ints: 2
      type: INTS
    }
    attribute {
      name: "coefficients"
      floats: 0.375753253698349
      floats: 1.3907358646392822
      floats: -2.127762794494629
      floats: -0.9207873344421387
      floats: 0.47902926802635193
      floats: -1.55242502636157
      floats: 0.46959221363067627
      floats: -1.2708676769695747
      floats: -1.5656673908233643
      floats: -1.256540060043335
      floats: 2.18996000289917
      floats: 2.2694246768951416
      type: FLOATS
    }
    attribute {
      name: "intercepts"
      floats: 0.24828049540519714
      floats: 0.841576278206863
      floats: -1.0461325645446777
      type: FLOATS
    }
    attribute {
      name: "multi_class"
      i: 1
      type: INT
    }
    attribute {
      name: "post_transform"
      s: "LOGISTIC"
      type: STRING
    }
    domain: "ai.onnx.ml"
  }
  node {
    input: "probability_tensor"
    output: "probabilities"
    name: "Normalizer"
    op_type: "Normalizer"
    attribute {
      name: "norm"
      s: "L1"
      type: STRING
    }
    domain: "ai.onnx.ml"
  }
  node {
    input: "label"
    output: "output_label"
    name: "Cast"
    op_type: "Cast"
    attribute {
      name: "to"
      i: 7
      type: INT
    }
    domain: ""
  }
  node {
    input: "probabilities"
    output: "output_probability"
    name: "ZipMap"
    op_type: "ZipMap"
    attribute {
      name: "classlabels_int64s"
      ints: 0
      ints: 1
      ints: 2
      type: INTS
    }
    domain: "ai.onnx.ml"
  }
  name: "deedadd605a34d41ac95746c4feeec1f"
  input {
    name: "float_input"
    type {
      tensor_type {
        elem_type: 1
        shape {
          dim {
            dim_value: 1
          }
          dim {
            dim_value: 4
          }
        }
      }
    }
  }
  output {
    name: "output_label"
    type {
      tensor_type {
        elem_type: 7
        shape {
          dim {
            dim_value: 1
          }
        }
      }
    }
  }
  output {
    name: "output_probability"
    type {
      sequence_type {
        elem_type {
          map_type {
            key_type: 7
            value_type {
              tensor_type {
                elem_type: 1
              }
            }
          }
        }
      }
    }
  }
}
opset_import {
  domain: ""
  version: 9
}
opset_import {
  domain: "ai.onnx.ml"
  version: 1
}

The top-level fields record metadata about the model, and their meaning is fairly self-explanatory. For a detailed explanation of each field, refer to the document Open Neural Network Exchange-ONNX. opset_import records the operator sets the model imports. An empty domain string denotes the default ONNX operator set, ai.onnx. ai.onnx.ml is the operator set for traditional non-neural-network models, and it provides the LinearClassifier, Normalizer and ZipMap operators used in the model above. The graph defines the following elements:

Four compute nodes (node).

One input variable, float_input, a tensor of shape 1x4. elem_type is a DataType enumeration value, and 1 represents FLOAT.

Two output variables, output_label and output_probability. output_label is an INT64 (elem_type: 7) tensor of dimension 1 that holds the predicted class. output_probability is a sequence of maps whose keys are INT64 (key_type: 7) and whose values are FLOAT, giving the probability of each class.
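To make the elem_type numbers easier to read, here is a minimal sketch that maps a few values of the ONNX TensorProto.DataType enumeration to their names and describes the tensors listed above. The describe_tensor helper is hypothetical, and the table is only a subset of the enumeration.

```python
# Subset of the ONNX TensorProto.DataType enumeration.
ELEM_TYPE_NAMES = {
    1: "FLOAT",
    2: "UINT8",
    3: "INT8",
    6: "INT32",
    7: "INT64",
    8: "STRING",
    9: "BOOL",
    11: "DOUBLE",
}


def describe_tensor(name, elem_type, dims):
    """Render a tensor declaration as a short human-readable string."""
    type_name = ELEM_TYPE_NAMES.get(elem_type, f"UNKNOWN({elem_type})")
    shape = "x".join(str(d) for d in dims)
    return f"{name}: {type_name} tensor, shape {shape}"


print(describe_tensor("float_input", 1, [1, 4]))
print(describe_tensor("output_label", 7, [1]))
```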

You can use Netron to visualize the computational graph of the ONNX model, as shown in the following figure:

Next, let's use the ONNX Runtime Python API to run predictions with the ONNX model. For now we use only the first sample in the test dataset:

import onnxruntime as rt
import numpy

sess = rt.InferenceSession("logreg_iris.onnx")
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name
probability_name = sess.get_outputs()[1].name

# Predict on the first test sample; X_test[:1] keeps the (1, 4) shape declared for float_input
pred_onx = sess.run([label_name, probability_name],
                    {input_name: X_test[:1].astype(numpy.float32)})

# print info
print('input_name: ' + input_name)
print('label_name: ' + label_name)
print('probability_name: ' + probability_name)
print(X_test[0])
print(pred_onx)

The printed model information and predictions are as follows:

input_name: float_input
label_name: output_label
probability_name: output_probability
[5.5 2.6 4.4 1.2]
[array([1], dtype=int64), [{0: 0.012208569794893265, 1: 0.5704444646835327, 2: 0.4173469841480255}]]

For the complete program, refer to the following Notebook: onnx.ipynb
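The printed probabilities can be reproduced by hand from the attributes in the model dump, which helps show what the LinearClassifier, Normalizer and ZipMap nodes do. The sketch below uses only the Python standard library. It assumes the twelve coefficients are stored class by class (three rows of four features), and the predict helper is a hypothetical stand-in, not how ONNX Runtime is implemented.

```python
import math

# Values copied from the "coefficients" and "intercepts" attributes of the
# LinearClassifier node, assumed to be laid out one row per class.
COEFFICIENTS = [
    [0.375753253698349, 1.3907358646392822, -2.127762794494629, -0.9207873344421387],
    [0.47902926802635193, -1.55242502636157, 0.46959221363067627, -1.2708676769695747],
    [-1.5656673908233643, -1.256540060043335, 2.18996000289917, 2.2694246768951416],
]
INTERCEPTS = [0.24828049540519714, 0.841576278206863, -1.0461325645446777]
CLASS_LABELS = [0, 1, 2]


def sigmoid(z):
    """Logistic function, matching post_transform LOGISTIC."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(features):
    # LinearClassifier: one linear score per class, then the logistic transform.
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(COEFFICIENTS, INTERCEPTS)
    ]
    logistic = [sigmoid(s) for s in scores]
    # Normalizer with norm L1: divide by the sum of the (positive) values.
    total = sum(logistic)
    # ZipMap: pair each class label with its probability.
    probabilities = {label: p / total for label, p in zip(CLASS_LABELS, logistic)}
    label = max(probabilities, key=probabilities.get)
    return label, probabilities


label, probabilities = predict([5.5, 2.6, 4.4, 1.2])
print(label, probabilities)
```

Under that layout assumption, the result agrees with the ONNX Runtime output above to about three decimal places: class 1 is predicted with a probability of roughly 0.57.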

ONNX and PMML

ONNX and PMML are both model representation standards that are independent of platform and environment. They decouple model deployment from the training environment, simplify the deployment process, and shorten the time it takes to get a model into production. Both standards are supported by major vendors and frameworks and are widely used.

PMML is the more mature standard. Before ONNX appeared it was arguably the de facto standard for model representation, with rich support for traditional data mining models. The latest version, PMML 4.4, supports up to 19 model types. However, PMML currently lacks support for deep learning models, and the next version, 5.0, may add support for deep neural networks. Because PMML is based on the older XML format, storing the structure and parameters of a deep neural network as text leads to problems with model size and performance, and there is not yet a complete solution to this. For a detailed introduction to PMML, refer to the article "deploying a machine learning model using PMML".

ONNX is the newer standard. It started out supporting mainly deep neural network models, to solve the problem of model interoperability and exchange between frameworks. Traditional non-neural-network machine learning models can now be supported through ONNX-ML, but the range of supported model types is not yet rich. ONNX serializes models in the protobuf binary format, which gives better transport performance.

Both formats are supported by mature open source libraries and frameworks. PMML has JPMML, PMML4S, PyPMML and others. ONNX has Microsoft's ONNX Runtime, NVIDIA TensorRT and others. Users can choose the cross-platform format that best fits their situation to deploy AI models.
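The size difference between a text encoding and a binary encoding of model weights can be illustrated with a rough sketch. This is not real PMML or real ONNX: the `<w>` element is a hypothetical stand-in for an XML parameter entry, and struct-packed float32 values stand in for protobuf's binary storage, which adds a small amount of framing on top.

```python
import random
import struct

# 1000 made-up weights, seeded so the run is repeatable.
random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(1000)]

# Text encoding: each weight written as a decimal string in an XML-like element.
text_payload = "".join(f"<w>{w!r}</w>" for w in weights).encode("utf-8")

# Binary encoding: each weight packed as a 4-byte little-endian float32.
binary_payload = struct.pack(f"<{len(weights)}f", *weights)

print("text bytes:  ", len(text_payload))
print("binary bytes:", len(binary_payload))
```

The binary payload is exactly four bytes per weight, while the text form spends several times that on digits and markup. For networks with millions of parameters that gap is where the size and parsing cost come from.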

Introduction to DaaS

DaaS (Deployment-as-a-Service) is an automatic AI model deployment system released by AutoDeployAI that supports online deployment of many model types. Here we show how to deploy traditional machine learning models and deep neural network models in ONNX format in DaaS. DaaS uses ONNX Runtime as the execution engine for ONNX models. ONNX Runtime is Microsoft's open source ONNX inference library, which provides high-performance prediction. To begin, log in to the DaaS system and create a new project named ONNX. All the operations below are carried out in this project. For more information about DaaS, refer to the article "automatically deploying the PMML model to generate REST API."

Deploy traditional machine learning models using ONNX

Import the model. Select the Logistic Regression model logreg_iris.onnx trained above:

After the import succeeds, the page switches to the model's main page. You can see that the model has one input field, float_input, of type tensor (float) and dimension (1, 4), and two output fields: output_label and output_probability.

Test the model. Click the Test tab, enter the prediction data [[5.5, 2.6, 4.4, 1.2]], then click the Submit command. The output page displays the prediction result:

Create a default real-time prediction Web service. Click the Deploy tab, then click the Add Service command, enter the service name, and keep the default values for everything else:

Test the Web service. After the service is created, the page switches to the service deployment home page. When the service replica status is running, the Web service has started successfully and can accept external requests. There are two ways to test the service:

Through the test page in the DaaS system. Click the Test tab, enter the request body in JSON format, and click the Submit command:

Through any REST client, using the standard REST API. Here we use the curl command-line program to call the Web service. Click the Generate Code command, and a dialog box pops up showing how to call the REST API with curl:

Copy the curl command, open the shell page, and execute the command:

Deploy deep neural network models using ONNX

We try deploying a pretrained model from the ONNX Model Zoo. Here we choose the MNIST handwritten digit recognition CNN model and download the latest version, which is based on ONNX 1.3: mnist.tar.gz.

Import the model. Select the downloaded model mnist.tar.gz:

After the import succeeds, the page switches to the model's main page. You can see that the model has one input field, Input3, of type tensor (float) and dimension (1, 1, 28, 28), and one output field, Plus214_Output_0, also of type tensor (float), with dimension (1, 10).

Test the model. Click the Test tab, then click the JSON command. The DaaS system automatically creates random data that matches the input format, to make testing easier. Click the Submit command, and the output page displays the prediction result:

Create a custom real-time prediction script. To accept an image as input and return the predicted value directly, we need a custom prediction script. Click the Real-time Prediction tab, then click the command that generates a custom real-time prediction script.

After the script is generated, click the Test as API command to open the script test page, where we are free to add custom pre-processing and post-processing functions. Add the following functions to preprocess the image:
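For readers who want to build such test data themselves, here is a minimal sketch that generates random values in the shape of the Input3 field using only the standard library. The JSON layout, an object keyed by the input field name, is an assumption made for illustration. The request format DaaS actually expects is not shown in this article, and the random_tensor and shape_of helpers are hypothetical.

```python
import json
import random


def random_tensor(shape, rng):
    """Build a nested list of random floats in [0, 1) with the given shape."""
    if not shape:
        return rng.random()
    return [random_tensor(shape[1:], rng) for _ in range(shape[0])]


def shape_of(nested):
    """Recover the shape of a regular nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


rng = random.Random(0)
# Assumed layout: one entry per model input, keyed by the input field name.
payload = {"Input3": random_tensor((1, 1, 28, 28), rng)}
body = json.dumps(payload)

print(shape_of(payload["Input3"]), len(body), "characters of JSON")
```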

def rgb2gray(rgb):
    """Convert the input image into grayscale"""
    import numpy as np
    return np.dot(rgb[..., :3], [0.299, 0.587, 0.114])

def resize_img(img_to_resize):
    """Resize image to MNIST model input dimensions"""
    import cv2
    r_img = cv2.resize(img_to_resize, dsize=(28, 28), interpolation=cv2.INTER_AREA)
    r_img.resize((1, 1, 28, 28))
    return r_img

def preprocess_image(img_to_preprocess):
    """Resize input images and convert them to grayscale."""
    # An image that is already 28x28 grayscale only needs reshaping
    if img_to_preprocess.shape == (28, 28):
        img_to_preprocess.resize((1, 1, 28, 28))
        return img_to_preprocess
    grayscale = rgb2gray(img_to_preprocess)
    processed_img = resize_img(grayscale)
    return processed_img

Call preprocess_image in the existing preprocess_files function as follows:

import matplotlib.image as mpimg

for key, file in files.items():
    img = mpimg.imread(file)
    record[key] = preprocess_image(img)

Add the following code to the existing postprocess function so that it turns the raw prediction result into the final predicted value:

def postprocess(result):
    """postprocess the predicted results"""
    import numpy as np
    return [int(np.argmax(np.array(result).squeeze(), axis=0))]
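As a quick check of what the post-processing step does, the sketch below applies the same argmax logic to a made-up score vector. The ten values are hypothetical and not real model output. They are shaped like the (1, 10) Plus214_Output_0 tensor, with one score per digit.

```python
import numpy as np


def postprocess(result):
    """Return the index of the highest score as a one-element list."""
    return [int(np.argmax(np.array(result).squeeze(), axis=0))]


# Hypothetical raw output for one image: ten scores, one per digit 0-9.
sample_result = [[-1.2, 0.3, 7.9, 0.1, -3.4, -0.8, 0.0, 1.5, -2.2, 0.6]]
print(postprocess(sample_result))
```

squeeze() drops the leading batch dimension, so argmax runs over the ten scores and returns the digit with the largest one, here 2.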

Click the Save command. Then, on the request page, enter predict as the function name, select a form-based request body, enter the model's only input field name, Input3, as the form field name, select the type file, click Upload, and choose the test image 2.png. Finally, click the Submit command to check that the script works as expected:

Create a formal deployment Web service. Once the script test succeeds, click the Deploy tab, then click the Add Network Service command, enter the service name, and keep the default values for everything else:

Through the test page in the DaaS system. Click the Test tab, select a form-based request body, choose the test image 5.jpg as input, and click the Submit command:

Copy the curl command, open the shell page, change to the image directory, and execute the command:

That covers how to use ONNX to deploy deep learning and traditional machine learning models. I hope the content above is of some help to you.
