How OneFlow interacts with ONNX

2025-07-12 Update. From: SLTechnology News&Howtos (shulou)

0x0. Introduction

Before reading this article, if you are not familiar with ONNX, you may want to read these earlier articles of mine about ONNX:

A preliminary study of ONNX

Re-exploration of ONNX

Introduction to the new edition of onnx2pytorch and onnx-simplifier

And this one by the big teacher:

Onnx simplifier and optimizer

Rather than exploring ONNX itself, this article covers another interesting topic: how a deep learning framework interacts with ONNX. I recently worked with the big teacher on ONNX-related work based on the OneFlow deep learning framework, and became quite familiar with how OneFlow and ONNX interact. In this article I share the concrete implementation of that interaction and introduce some features of oneflow-onnx as an open source tool, so that readers understand how a OneFlow model is converted into an ONNX model (OneFlow2ONNX) and how an ONNX model is converted back into a OneFlow model (X2OneFlow). Personally, I think the current interaction between OneFlow and ONNX is fairly elegant and extensible, so we have open-sourced this work and are sharing the implementation ideas. GitHub address: https://github.com/Oneflow-Inc/oneflow_convert_tools. We will keep maintaining this tool as part of the OneFlow ecosystem. It is also published as a wheel package, so interested users can try it quickly by installing oneflow-onnx with pip. Detailed installation steps are given in section 0x2 below and in the project README.

The oneflow-onnx tool has two functions: exporting OneFlow models to ONNX, and converting ONNX models exported by other training frameworks into OneFlow models. The project has been adapted to pre-trained models from the TensorFlow/PyTorch/PaddlePaddle frameworks, which are first exported to ONNX and then converted to OneFlow (we call this function X2OneFlow). More usage examples, documentation, and source code are available in the open source project at https://github.com/Oneflow-Inc/oneflow_convert_tools.

0x1. Operator support and model support

OneFlow2ONNX

OP list supported by OneFlow2ONNX

OneFlow2ONNX currently supports 60+ ONNX OPs. All ONNX OPs currently supported by OneFlow are listed in the following table.

No.   OP                    No.   OP
1     GatherND              2     Transpose
3     Add                   4     Sub
5     Mul                   6     Div
7     Sum                   8     LeakyRelu
9     Softplus              10    Softplus
11    Abs                   12    Ceil
13    Elu                   14    Exp
15    Floor                 16    Log
17    Neg                   18    Sigmoid
19    Sqrt                  20    Tanh
21    Reciprocal            22    Relu
23    Acos                  24    Asin
25    Atan                  26    Cos
27    Sin                   28    Tan
29    Acosh                 30    Asinh
31    Atanh                 32    Cosh
33    Sinh                  34    Min
35    Max                   36    Clip
37    Softmax               38    Sign
39    MatMul                40    Erf
41    FloorMod              42    Round
43    Not                   44    And
45    Or                    46    Equal
47    NotEqual              48    Greater
49    Less                  50    Pad
51    AveragePool           52    MaxPool
53    Conv                  54    QuantizeLinear
56    ReduceMin             57    BatchNormalization
58    ReduceSum             59    ReduceProd
60    ArgMax                61    ArgMin
62    Reshape               63    Squeeze
64    Transpose             65    Concat
66    Cast                  67    Identity
68    Mul

OneFlow2ONNX model test library

OneFlow2ONNX currently supports 60+ ONNX OPs, and we have tested the OneFlow2ONNX conversion on the models listed below.

Model          Source                  Operator version
AlexNet        OneFlow-AlexNet         10
MobileNetV2    Oneflow-MobileNetV2     10
ResNet50       OneFlow-ResNet50        10

X2OneFlow

OP list supported by X2OneFlow

X2OneFlow currently supports 40+ ONNX OPs and 30+ TensorFlow/PyTorch/PaddlePaddle OPs, covering most common operations in CV classification models. The OP unit tests will gradually be moved to the project's examples directory, and more OPs will be supported.

ONNX

No.   OP                    No.   OP
1     Conv                  2     BatchNormalization
3     MaxPool               4     AveragePool
5     Concat                6     ReLU
7     AdaptiveMaxPool       8     Softmax
9     Unsqueeze             10    Transpose
11    Clip                  12    Gather
13    Slice                 14    Split
15    Flatten               16    Add
17    Sub                   18    Mul
19    Div                   20    Sqrt
21    Pow                   22    Tanh
23    Sigmoid               24    Cast
25    Pad                   26    ReduceMean
27    Reshape               28    AdaptiveAvgPool
29    Squeeze               30    Expand
31    Gather                32    Slice
33    Split                 34    Min
35    Max                   36    Constant
37    HardSigmoid           38    Gemm
39    MatMul                40    Erf
41    Cast                  42    GlobalMaxPool
43    GlobalAveragePool     44    ReduceMax
45    Identity

TensorFlow

No.   OP                    No.   OP
1     relu                  2     concatenate
3     expand_dims           4     transpose
5     batchnorm             6     slice
7     gather                8     clip_by_value
9     conv2d                10    depthwiseconv2d
11    flatten               12    add
13    sub                   14    mul
15    div                   16    pow
17    sqrt                  18    tanh
19    sigmoid               20    erf
21    cast                  22    pad
23    maxpool               24    avgpool
25    globalavgpool         26    globalmaxpool
27    reduce_mean           28    reshape
29    softmax               30    relu6

Grouped convolution currently has a problem; a PR has been submitted to the TensorFlow2ONNX team.

PyTorch

No.   OP                    No.   OP
1     relu                  2     cat
3     unsqueeze             4     transpose
5     batchnorm             6     slice
7     gather                8     clamp
9     conv2d                10    depthwiseconv2d
11    flatten               12    add
13    sub                   14    mul
15    div                   16    pow
17    sqrt                  18    tanh
19    sigmoid               20    erf
21    cast                  22    pad
23    maxpool               24    avgpool
25    globalavgpool         26    globalmaxpool
27    reduce_mean           28    reshape
29    softmax               30    relu6
31    CrossEntropyLoss

PaddlePaddle

No.   OP                    No.   OP
1     relu                  2     concatenate
3     expand_dims           4     transpose
5     batchnorm             6     slice
7     gather                8     clip_by_value
9     conv2d                10    depthwiseconv2d
11    flatten               12    add
13    sub                   14    mul
15    div                   16    pow
17    sqrt                  18    tanh
19    sigmoid               20    erf
21    cast                  22    pad
23    maxpool               24    avgpool
25    adaptiveavgpool       26    adaptivemaxpool
27    reduce_mean           28    reshape
29    softmax               30    relu6

Related issues:

https://github.com/PaddlePaddle/Paddle2ONNX/issues/221

https://github.com/PaddlePaddle/Paddle2ONNX/issues/220

X2OneFlow model test library

X2OneFlow currently supports 40+ ONNX OPs and 30+ TensorFlow/PyTorch/PaddlePaddle OPs, covering most common operations in CV classification models. We have tested the X2OneFlow conversion on the models listed below.

PyTorch model         Supported
AlexNet               Yes
VGGNet                Yes
GoogleNet             Yes
ResNet                Yes
ResNext               Yes
SENet                 Yes
MobileNetV1           Yes
MobileNetV2           Yes
MobileNetV3           Yes
RegNet                Yes
DenseNet              Yes
EfficientNet          Yes
InceptionNet          Yes
ShuffleNetV1          Yes
ShuffleNetV2          Yes
SqueezeNet            Yes

TensorFlow model      Supported
VGGNet                Yes
ResNet                Yes
ResNetV2              Yes
XceptionNet           Yes
MobileNetV1           Yes
MobileNetV2           Yes
MobileNetV3           Yes
DenseNet              Yes
EfficientNet          Yes
InceptionNet          Yes

PaddlePaddle model    Supported
AlexNet               Yes
VGGNet                Yes
GoogleNet             Yes
ResNet                Yes
ResNext               Yes
SE_ResNext            Yes
SENet                 Yes
MobileNetV1           Yes
MobileNetV2           Yes
MobileNetV3           Yes
RegNet                Yes
DenseNet              No (msg: "op_name: Concat_58 already exist in job: job_eval")
EfficientNet          Yes
InceptionNet          Yes
ShuffleNetV2          Yes
SqueezeNet            Yes
DPNNet                Yes
DarkNet               Yes
GhostNet              Yes
RepVGG                Yes
XceptionNet           Yes
Xception_DeepLab      Yes
Vision_Transformer    No ("op_name: Constant_20 already exist in job: job_eval")
Res2Net               No (split op bug, being worked on)
Unet                  No (the OneFlow upsampling OP and Paddle's are not aligned)

The test code for these models can be found in the project's examples directory.

0x2. Quick experience

User environment configuration

python >= 3.5
onnx >= 1.8.0
onnx-simplifier >= 0.3.3
onnxoptimizer >= 0.2.5
onnxruntime >= 1.6.0
oneflow >= 0.3.4
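The minimum versions listed above can be checked before installing the tool. The following is only a sketch: it assumes Python 3.8 or newer (for importlib.metadata, which is newer than the stated 3.5 minimum), and the helper names version_tuple and check_requirements are made up for this example and are not part of oneflow-onnx.

```python
import re
from importlib import metadata  # assumption: Python 3.8+

# Minimum versions copied from the environment list above.
REQUIREMENTS = {
    "onnx": "1.8.0",
    "onnx-simplifier": "0.3.3",
    "onnxoptimizer": "0.2.5",
    "onnxruntime": "1.6.0",
    "oneflow": "0.3.4",
}


def version_tuple(version, width=3):
    """Turn '1.8.0' or '0.3.4+cu102' into a comparable tuple of ints."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad short versions such as '1.8' so that they compare equal to '1.8.0'.
    parts += [0] * (width - len(parts))
    return tuple(parts)


def check_requirements(requirements, get_version=metadata.version):
    """Return {package: problem} for every missing or too-old package."""
    problems = {}
    for name, minimum in requirements.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if version_tuple(installed) < version_tuple(minimum):
            problems[name] = "found {}, need >= {}".format(installed, minimum)
    return problems


if __name__ == "__main__":
    for package, problem in check_requirements(REQUIREMENTS).items():
        print("{}: {}".format(package, problem))
```

The get_version parameter exists only so that the lookup can be replaced in a test; in normal use the default is enough.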

To use X2OneFlow (X stands for TensorFlow/PyTorch/PaddlePaddle), you also need to install the corresponding deep learning framework. The required versions are:

pytorch >= 1.7.0
paddlepaddle >= 2.0.0
tensorflow >= 2.0.0

Installation

Installation method 1

    pip install oneflow_onnx

Installation method 2

    git clone https://github.com/Oneflow-Inc/oneflow_convert_tools
    cd oneflow_onnx
    python3 setup.py install

For usage details, see the examples under samples in the project.

0x3. OneFlow-ONNX implementation ideas

In this section we share how a OneFlow model is converted into ONNX. We analyze the source code using the export of an AlexNet defined in OneFlow to an ONNX model as the example. Starting at https://github.com/Oneflow-Inc/oneflow_convert_tools/blob/main/examples/oneflow2onnx/test_alexnet.py#L133, we see the following calling code:

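    import oneflow as flow
    import oneflow.typing as tp
    from oneflow_onnx.oneflow2onnx.util import convert_to_onnx_and_check

    # alexnet() itself is defined in the test file linked above
    def test_alexnet():
        @flow.global_function()
        def alexnet_eval_job(x: tp.Numpy.Placeholder((1, 227, 227, 3))):
            return alexnet(x, None, False)

        convert_to_onnx_and_check(alexnet_eval_job, flow_weight_dir=None, onnx_model_path="/tmp")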

Here, flow.global_function() is used to define an AlexNet job for prediction (eval). The complete network definition is available at the link above. The AlexNet defined in OneFlow is converted into an ONNX model by the convert_to_onnx_and_check function. Following this function leads to https://github.com/Oneflow-Inc/oneflow_convert_tools/blob/main/oneflow_onnx/oneflow2onnx/util.py#L65-L73, where the code is:

    # wait until the weights have been fully written to disk
    while not os.path.exists(os.path.join(flow_weight_dir, "snapshot_done")):
        pass
    onnx_model_dir = onnx_model_path
    onnx_model_path = os.path.join(onnx_model_dir, "model.onnx")
    flow.onnx.export(
        job_func,
        flow_weight_dir,
        onnx_model_path,
        opset=opset,
        external_data=external_data,
    )

The core function that performs the conversion to an ONNX model is flow.onnx.export. Jumping to this function, at https://github.com/Oneflow-Inc/oneflow_convert_tools/blob/main/oneflow_onnx/oneflow2onnx/flow2onnx.py#L229-L281, the code is as follows:

    def Export(
        job_func: Callable,
        model_save_dir: Text,
        onnx_filename: Text,
        continue_on_error: bool = False,
        opset: Optional[int] = None,
        extra_opset: Optional[int] = None,
        shape_override: Optional[Dict[Text, List[int]]] = None,
        external_data: bool = False,
    ):
        r"""Export a oneflow model into ONNX format.

        Args:
            job_func: the OneFlow job function.
            model_save_dir: the folder containing the model weights defined by OneFlow.
                The weights are saved using oneflow's check_point.save interface.
            onnx_filename: output ONNX model file name, string type.
            continue_on_error: whether to continue if an OP cannot be handled (that is, has no mapping).
            opset: ONNX opset version number, default is 10.
            extra_opset: list of extra opsets, such as opsets used by custom ops.
            shape_override: dictionary with input information, overriding the input shapes given by OneFlow.
            external_data: save weights as ONNX external data, usually to bypass protobuf's 2GB file size limit.
        """
        assert os.getenv("ENABLE_USER_OP") != "False"
        # make sure that the path to the model exists
        assert os.path.isdir(model_save_dir)
        # get all current jobs through c_api_util.GetJobSet()
        job_set = c_api_util.GetJobSet()
        # the model to convert is defined in job_func, so first record its name
        job_name = job_func.__name__
        # iterate over job_set to find the job that defines the model
        for job in job_set.job:
            # TODO(OYY) Modify the interface before modifying it
            if job.job_conf.job_name == job_name:
                # job found; the following steps are explained in detail below
                onnx_graph = ProcessFlowGraph(
                    job,
                    model_save_dir,
                    continue_on_error=continue_on_error,
                    opset=opset,
                    extra_opset=extra_opset,
                    shape_override=shape_override,
                )
                onnx_graph = optimizer.OptimizeGraph(onnx_graph)
                model_proto = onnx_graph.MakeModel(
                    job_name, onnx_filename, external_data=external_data
                )
                with open(onnx_filename, "wb") as f:
                    try:
                        f.write(model_proto.SerializeToString())
                    except ValueError as e:
                        raise ValueError(
                            "Error occured when running model_proto.SerializeToString(). "
                            "If the model is larger than 2GB, please specify external_data=True "
                            "when calling flow.onnx.export. Original error message:\n{}".format(e)
                        )
                return
        raise ValueError('Cannot find job "{}" in jobset'.format(job_name))

This function first iterates over the job_set in OneFlow, finds the job in which we defined the AlexNet model, and then enters the ProcessFlowGraph function. ProcessFlowGraph mainly does three things and produces an initial version of a legal ONNX model (initial meaning that the model has not yet been optimized and the ONNX nodes have not yet been filled with weights). Following this function, the code is as follows.

    def ProcessFlowGraph(
        flow_graph,
        model_save_dir,
        continue_on_error=False,
        opset=None,
        extra_opset=None,
        shape_override=None,
    ):
        # get the opset version of the exported ONNX model; in OneFlow the maximum is 10
        opset = util.FindOpset(opset)
        logger.info("Using opset %s", opset)
        # check whether the installed ONNX version supports this opset version
        if opset > schemas.get_max_supported_opset_version():
            logger.warning(
                "Currently installed onnx package %s is too low to support opset %s, "
                "please upgrade onnx package to avoid potential conversion issue.",
                util.get_onnx_version(),
                opset,
            )

        if shape_override is None:
            shape_override = {}

        # convert each oneflow node to the onnx node format, keeping the op type,
        # inputs and outputs, and attribute values unchanged; the result of this
        # step is not yet a legal onnx model
        (onnx_nodes, op_cnt, attr_cnt, dtypes, output_shapes,) = FlowToOnnxNaive(
            flow_graph, shape_override
        )

        # construct a Graph object to make later modification of the onnx network structure convenient
        g = Graph(onnx_nodes, model_save_dir, output_shapes, dtypes, opset, extra_opset,)

        # create ops mapping for the desired opsets
        ops_mapping = handler.flow_op.CreateMapping(g.opset, g.extra_opset)

        # some nodes may already copied into inner Graph, so remove them from main Graph.
        TopologicalSort(g, continue_on_error)

        # FlowOnnxMapping calls each conversion function (registered through @flow_op)
        # to convert the ops one by one, producing a legal onnx model
        mapped_op, unmapped_op, exceptions = FlowOnnxMapping(g, ops_mapping)
        if unmapped_op:
            logger.error("Unsupported ops: %s", unmapped_op)
        if exceptions and not continue_on_error:
            raise exceptions[0]

        # onnx requires topological sorting
        TopologicalSort(g, continue_on_error)

        g.UpdateProto()

        logger.debug(
            "Summay Stats:\n"
            "\toneflow ops: {}\n"
            "\toneflow attr: {}\n"
            "\tonnx mapped: {}\n"
            "\tonnx unmapped: {}".format(op_cnt, attr_cnt, mapped_op, unmapped_op)
        )

        return g
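ProcessFlowGraph calls TopologicalSort twice, once before and once after the one-to-one OP mapping. The project's own implementation is not reproduced here. As a rough illustration of the idea only, the sketch below runs Kahn's algorithm on a toy graph stored as a plain dictionary; the function name, the graph representation, and the node names are all assumptions made for this example.

```python
from collections import deque


def topological_sort(nodes):
    """Kahn's algorithm over nodes given as {name: [input names]}.

    Returns node names so that every node appears after all of its inputs.
    Raises ValueError if the graph has a cycle (it is not a DAG).
    """
    indegree = {name: 0 for name in nodes}
    consumers = {name: [] for name in nodes}
    for name, inputs in nodes.items():
        for src in inputs:
            if src in nodes:  # graph inputs and weights are not nodes
                indegree[name] += 1
                consumers[src].append(name)

    # Start from the nodes whose inputs are all available.
    ready = deque(name for name, deg in indegree.items() if deg == 0)
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for dst in consumers[name]:
            indegree[dst] -= 1
            if indegree[dst] == 0:
                ready.append(dst)

    # Nodes left unsorted can only be part of a cycle.
    if len(order) < len(nodes):
        raise ValueError("graph contains a cycle")
    return order


# Toy graph, deliberately listed out of order.
toy_graph = {
    "relu": ["conv"],
    "conv": ["image", "weight"],
    "identity": ["relu"],
    "add": ["identity", "conv"],
}
print(topological_sort(toy_graph))  # ['conv', 'relu', 'identity', 'add']
```

Sorting again after the mapping step is cheap, and it also acts as a check that inserted nodes have not turned the graph into something other than a DAG.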

The FlowToOnnxNaive function converts each oneflow node into the onnx node format, keeping the op type, inputs and outputs, and attribute values unchanged, and returns all the converted nodes. At this point they are not yet legal ONNX nodes; they become legal only after the one-to-one conversion performed later. Next, these nodes are used to construct a Graph object, which makes later modifications to the ONNX model convenient. The Graph class is implemented in https://github.com/Oneflow-Inc/oneflow_convert_tools/blob/18e041d92654cfc8b03e16c906c451a405c99fd2/oneflow_onnx/onnx_wrapper.py, which mainly defines wrappers for the onnx graph and node and contains various APIs for modifying the structure of the onnx graph; the relevant code of the tensorflow-onnx project is reused here. Note that constructing the Graph object does not yet complete the ONNX model, because the OneFlow OPs have not yet been converted one-to-one into ONNX OPs.
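The two-stage idea described above (copy the nodes unchanged first, then convert them one by one through registered handlers) can be sketched in a few lines. This is a simplified illustration and not the oneflow-onnx code: register_op stands in for the @flow_op decorator, nodes are plain dictionaries, and the op and attribute names are invented for the example.

```python
import collections

# Hypothetical registry: {opset version: {source op type: (handler, target op type)}}
_HANDLERS = collections.defaultdict(dict)


def register_op(source_op, target_op=None, opset=1):
    """Decorator that registers a conversion handler (stand-in for @flow_op)."""
    def decorator(func):
        _HANDLERS[opset][source_op] = (func, target_op)
        return func
    return decorator


def create_mapping(max_opset):
    """Stack handlers from low to high opset so newer versions override older ones."""
    mapping = {}
    for opset in sorted(_HANDLERS):
        if opset <= max_opset:
            mapping.update(_HANDLERS[opset])
    return mapping


@register_op("relu", "Relu", opset=1)
def convert_direct(node):
    pass  # renaming the op type is enough


@register_op("leaky_relu", "LeakyRelu", opset=1)
def convert_leaky_relu(node):
    # In this toy example the attribute name differs between the two sides.
    node["attrs"] = {"alpha": node["attrs"].pop("negative_slope")}


def naive_copy(source_nodes):
    """Stage 1: copy nodes unchanged; op types are still the source framework's."""
    return [dict(n, attrs=dict(n.get("attrs", {}))) for n in source_nodes]


def map_ops(nodes, mapping):
    """Stage 2: apply the registered handler to each node, one by one."""
    mapped, unmapped = collections.Counter(), collections.Counter()
    for node in nodes:
        info = mapping.get(node["op_type"])
        if info is None:
            unmapped[node["op_type"]] += 1
            continue
        func, target_op = info
        mapped[node["op_type"]] += 1
        if target_op is not None:
            node["op_type"] = target_op
        func(node)
    return mapped, unmapped


nodes = naive_copy([
    {"name": "act0", "op_type": "relu"},
    {"name": "act1", "op_type": "leaky_relu", "attrs": {"negative_slope": 0.1}},
    {"name": "x", "op_type": "my_custom_op"},
])
mapped, unmapped = map_ops(nodes, create_mapping(max_opset=10))
print([n["op_type"] for n in nodes])  # ['Relu', 'LeakyRelu', 'my_custom_op']
print(dict(unmapped))                 # {'my_custom_op': 1}
```

Keeping the copy step separate from the mapping step means a graph with unsupported OPs can still be built and inspected, and the unmapped counter reports exactly which OPs are missing a handler.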

Next, we call the function handler.flow_op.CreateMapping(g.opset, g.extra_opset), which is implemented as follows:

    def CreateMapping(max_onnx_opset_version, extra_opsets):
        """Create the final mapping dictionary by stacking domains and opset versions.

        :param max_onnx_opset_version: The highest onnx opset the resulting graph may use.
        :param extra_opsets: Extra opsets the resulting graph may use.
        """
        mapping = {constants.ONNX_DOMAIN: max_onnx_opset_version}
        if extra_opsets:
            for extra_opset in extra_opsets:
                mapping[extra_opset.domain] = extra_opset.version
        ops_mapping = {}
        for domain, opsets in flow_op.get_opsets().items():
            for target_opset, op_map in enumerate(opsets):
                print("=" * 100)
                print(target_opset)
                print(op_map)
                m = mapping.get(domain)
                if m:
                    if target_opset [...]

[...]

    [...] nextarc) {
                v = p->adjvex;
                if (!(--indegree[v])) Push(S, v);
            }
        }
        if (count < G.vexnum) return false;
        else return true;
    }

The statement in bold above is the core of topological sorting. A typical deep learning model is also a DAG (directed acyclic graph), and here we likewise use the topological sorting algorithm so that the one-to-one OP conversion is fully consistent with the real network structure. In addition, newly inserted nodes such as Identity may break the topological order of the original Graph, and we need to check at all times whether the computation graph is a complete, legal DAG, so using topological sorting does no harm. Once topological sorting is done, we can run FlowOnnxMapping to perform the one-to-one conversion between OneFlow OPs and ONNX OPs. The code is as follows:

    def FlowOnnxMapping(g, ops_mapping):
        logger.debug("Mapping Oneflow node to ONNX node(s)")
        mapped_op = collections.Counter()
        unmapped_op = collections.Counter()
        exceptions = []

        ops = list(g.get_nodes())
        for node in ops:
            logger.debug("Process node: %s\n%s", node.name, node.summary)

            if node.skip_conversion:
                logger.debug("explicitly skip node " + node.name)
                continue

            op = node.op_type
            map_info = ops_mapping.get(op)
            if map_info is None:
                unmapped_op[op] += 1
                logger.error("oneflow op [%s: %s] is not supported", node.name, op)
                continue
            mapped_op[op] += 1

            func, onnx_op, kwargs = map_info
            if onnx_op is not None:
                node.op_type = onnx_op
            try:
                func(g, node, **kwargs)
                node.skip_conversion = True
            except Exception as ex:
                logger.error(
                    "Failed to convert node %s\n%s", node.name, node.summary, exc_info=1
                )
                exceptions.append(ex)

        return mapped_op, unmapped_op, exceptions

This function returns the container of OPs that were mapped and the container of OPs that were not. If an OP in the Graph is not mapped, that is, the conversion fails, an error message is reported to the user. After the conversion is complete, we call UpdateProto() on each Node in the Graph to update the earlier fake ONNX node information into real ONNX node information. Next, we call various optimizers to optimize the network structure, for example eliminating as far as possible the transpose ops introduced by nhwc->nchw (optimizer.OptimizeGraph within the Export function), that is, https://github.com/Oneflow-Inc/oneflow_convert_tools/blob/main/oneflow_onnx/oneflow2onnx/flow2onnx.py#L264. oneflow-onnx mainly has the following kinds of optimizers:

Optimizers in oneflow-onnx, used to get a better ONNX model

These optimizers are inherited from tensorflow-onnx, and we plan to replace some of them with ONNX's native optimizers.
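To give a feel for what eliminating the transpose ops introduced by nhwc->nchw means, here is a deliberately small sketch. It is not the tensorflow-onnx transpose optimizer: it only handles a linear chain of nodes stored as dictionaries, and it only merges Transpose nodes that are directly adjacent. The function names and the node layout are assumptions made for this example.

```python
def compose(first, second):
    """Permutation equivalent to applying Transpose(first) and then Transpose(second)."""
    return [first[axis] for axis in second]


def remove_inverse_transposes(nodes):
    """Merge adjacent Transpose nodes and drop pairs that cancel out.

    `nodes` is a linear chain: each node consumes the previous node's output.
    """
    result = []
    for node in nodes:
        prev = result[-1] if result else None
        if (
            prev is not None
            and node["op_type"] == "Transpose"
            and prev["op_type"] == "Transpose"
        ):
            perm = compose(prev["perm"], node["perm"])
            result.pop()
            if perm != list(range(len(perm))):
                # The pair does not cancel: keep one merged Transpose instead.
                result.append({"op_type": "Transpose", "perm": perm})
            continue
        result.append(node)
    return result


chain = [
    {"op_type": "Conv"},
    {"op_type": "Transpose", "perm": [0, 2, 3, 1]},  # NCHW -> NHWC
    {"op_type": "Transpose", "perm": [0, 3, 1, 2]},  # NHWC -> NCHW
    {"op_type": "Relu"},
]
print([n["op_type"] for n in remove_inverse_transposes(chain)])  # ['Conv', 'Relu']
```

A production optimizer also has to move transposes across other ops and handle branching graphs, which this sketch leaves out.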

After optimizing the ONNX model, the following call takes the oneflow weights saved on disk, assigns them to the onnx model object, and returns the onnx model object in protobuf format. This completes the creation of a legal ONNX model.

    model_proto = onnx_graph.MakeModel(job_name, onnx_filename, external_data=external_data)

X2OneFlow is divided into two steps, X2ONNX and ONNX2OneFlow. ONNX2OneFlow and OneFlow2ONNX share one set of base code, so the only thing that needs to change is the direction of the decorator that registers OP conversions in the handlers.

For more details, take a look at our source code https://github.com/Oneflow-Inc/oneflow_convert_tools/tree/main/oneflow_onnx.
