
How to deploy a machine learning model using PMML

Updated 2025-04-10. Source: SLTechnology News&Howtos (shulou), Internet Technology


This article explains how to deploy machine learning models with PMML: what the standard is, which open source libraries generate and evaluate PMML, a worked example, and the limitations to be aware of.

Introduction to PMML

Predictive Model Markup Language (PMML) is a platform- and environment-independent language for representing models, and it is the de facto standard for representing machine learning models. From PMML 1.1, released in 2001, to the latest standard, PMML 4.4, released in 2019, the number of supported model types has grown from 6 to 17, and the Mining Model element is provided to combine multiple models.

As an open and mature standard, PMML is developed and maintained by the Data Mining Group (DMG). After more than ten years of development it is widely used: more than 30 vendors and open source projects, including mainstream vendors such as SAS, IBM SPSS, KNIME, and RapidMiner, support PMML in their data mining and analytics products. Details are listed on the DMG's PMML Powered page.

Introduction to the PMML Standard

PMML is an XML-based standard. Its elements and attributes are defined by an XML Schema, and a PMML document consists of the following core components:

Data dictionary (Data Dictionary), which describes the input data.

Data transformations (Transformation Dictionary and Local Transformations), which are applied to the input fields to generate new derived fields.

Model definition (Model). Each model type has its own definition.

Output (Output), which specifies the outputs of the model.
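To make the four components concrete, the sketch below parses a small hand-written PMML document with Python's standard library and lists what it finds. The document itself (fields x and y, the derived field x_scaled, the output predicted_y) and the helper name summarize are illustrative assumptions, not taken from any real model.

```python
import xml.etree.ElementTree as ET

PMML_NS = {"p": "http://www.dmg.org/PMML-4_4"}

# Hand-written, illustrative document: y = 1.0 + 2.0 * ((x - 10) / 2)
EXAMPLE_PMML = """
<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header description="Illustrative example"/>
  <DataDictionary numberOfFields="2">
    <DataField name="x" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <TransformationDictionary>
    <DerivedField name="x_scaled" optype="continuous" dataType="double">
      <Apply function="/">
        <Apply function="-"><FieldRef field="x"/><Constant>10</Constant></Apply>
        <Constant>2</Constant>
      </Apply>
    </DerivedField>
  </TransformationDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="x"/>
      <MiningField name="y" usageType="target"/>
    </MiningSchema>
    <Output>
      <OutputField name="predicted_y" feature="predictedValue"/>
    </Output>
    <RegressionTable intercept="1.0">
      <NumericPredictor name="x_scaled" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
""".strip()

# Top-level children of <PMML> that are not model definitions.
NON_MODEL = {"Header", "MiningBuildTask", "DataDictionary",
             "TransformationDictionary", "Extension"}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def summarize(pmml_text):
    """List the core components found in a PMML document."""
    root = ET.fromstring(pmml_text)
    return {
        "version": root.get("version"),
        "inputs": [f.get("name") for f in
                   root.findall("p:DataDictionary/p:DataField", PMML_NS)],
        "derived": [f.get("name") for f in
                    root.findall("p:TransformationDictionary/p:DerivedField", PMML_NS)],
        "models": [local_name(c.tag) for c in root
                   if local_name(c.tag) not in NON_MODEL],
        "outputs": [f.get("name") for f in
                    root.findall(".//p:Output/p:OutputField", PMML_NS)],
    }


summary = summarize(EXAMPLE_PMML)
print(summary)
```

The same function can be pointed at the text of a PMML file exported by a real tool, provided the file uses the PMML 4.4 namespace; other versions use a different namespace URI.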

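The PMML prediction process follows the usual data mining workflow: input fields declared in the data dictionary are transformed into derived fields, the model is evaluated on them, and the results are returned as output fields.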
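The stages of that process can be sketched in plain Python. This is a toy illustration only, not a PMML engine: the field names, the scaling constants, and the linear model are all made up for the example.

```python
# Toy walk-through of the scoring stages. Everything here is hypothetical.

# 1. Data dictionary: declared input fields and their types.
DATA_DICTIONARY = {"x": float}

# 2. Transformations: derived fields computed from input fields.
TRANSFORMATIONS = {"x_scaled": lambda row: (row["x"] - 10.0) / 2.0}


# 3. Model: here a one-variable linear regression on the derived field.
def model(row):
    return 1.0 + 2.0 * row["x_scaled"]


def score(record):
    """Run one record through validation, transformation, model, and output."""
    row = {}
    for name, cast in DATA_DICTIONARY.items():
        if name not in record:
            raise KeyError(f"missing input field: {name}")
        row[name] = cast(record[name])      # validate and cast the input
    for name, transform in TRANSFORMATIONS.items():
        row[name] = transform(row)          # add derived fields
    prediction = model(row)                 # evaluate the model
    return {"predicted_y": prediction}      # 4. Output fields


result = score({"x": 14})
print(result)
```

A real PMML evaluator performs the same steps, but reads the field declarations, transformations, model parameters, and output definitions from the PMML document instead of from Python code.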

Advantages of PMML

Platform independence. PMML separates the model deployment environment from the development environment and enables cross-platform deployment, which is its biggest advantage over other deployment methods. For example, a model built in Python can be exported to PMML and then deployed in a Java production environment.

Interoperability. This is the main benefit of a standard: any PMML-compatible scoring engine can read standard PMML models exported from other applications.

Wide support. PMML is supported by more than 30 vendors and open source projects, and existing open source libraries can convert models from many popular open source machine learning tools into PMML.

Readability. A PMML model is an XML-based text file that can be opened and inspected in any text editor, which also makes it more secure than binary serialized files.

PMML open source libraries

Model conversion libraries, which generate PMML:

Python models:

Nyoka, which supports Scikit-learn, LightGBM, XGBoost, Statsmodels, and Keras. https://github.com/nyoka-pmml/nyoka

JPMML series, such as JPMML-SkLearn, JPMML-XGBoost, and JPMML-LightGBM, which provide command-line programs to export models to PMML. https://github.com/jpmml

R models:

R pmml package: https://cran.r-project.org/web/packages/pmml/index.html

r2pmml: https://github.com/jpmml/r2pmml

JPMML-R, which provides command-line programs to export R models to PMML. https://github.com/jpmml/jpmml-r

Spark:

Spark MLlib, which exports only the model itself and does not support Pipelines; it is not recommended.

JPMML-SparkML, which supports Spark ML pipelines. https://github.com/jpmml/jpmml-sparkml

Model evaluation libraries, which read PMML:

Java:

JPMML-Evaluator, a pure Java PMML prediction library, licensed under AGPL v3. https://github.com/jpmml/jpmml-evaluator

PMML4S, written in Scala, which provides both Scala and Java APIs with a simple, easy-to-use interface. It is licensed under the permissive Apache 2 license. https://github.com/autodeployai/pmml4s

Python:

PyPMML, a Python prediction library for PMML. It is a Python wrapper around PMML4S. https://github.com/autodeployai/pypmml

Spark:

JPMML-Evaluator-Spark, https://github.com/jpmml/jpmml-evaluator-spark

PMML4S-Spark, https://github.com/autodeployai/pmml4s-spark

PySpark:

PyPMML-Spark, for scoring PMML models in PySpark. https://github.com/autodeployai/pypmml-spark

REST API:

AI-Serving, which provides both REST and gRPC APIs for PMML models, licensed under Apache 2. https://github.com/autodeployai/ai-serving

Openscoring, which provides a REST API, licensed under AGPL v3. https://github.com/openscoring/openscoring

PMML demo

Build the model. For the complete Jupyter Notebook, see xgb-iris-pmml.ipynb.

Build an XGBoost model on the Iris data. The floating-point features are standardized before modeling, using a Scikit-learn Pipeline:

    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    import pandas as pd
    from xgboost import XGBClassifier

    seed = 123456

    # Load Iris into a DataFrame with a named target column
    iris = datasets.load_iris()
    target = 'Species'
    features = iris.feature_names
    iris_df = pd.DataFrame(iris.data, columns=features)
    iris_df[target] = iris.target

    X, y = iris_df[features], iris_df[target]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=seed)

    # Standardize the features, then fit a small XGBoost classifier
    pipeline = Pipeline([
        ('scaling', StandardScaler()),
        ('xgb', XGBClassifier(n_estimators=5, random_state=seed))
    ])

    pipeline.fit(X_train, y_train)
    y_pred = pipeline.predict(X_test)
    y_pred_proba = pipeline.predict_proba(X_test)

Using Nyoka, export Pipeline to PMML:

    from nyoka import xgboost_to_pmml

    xgboost_to_pmml(pipeline, features, target, "xgb-iris.pmml")

Use PyPMML to verify that PMML predictions are consistent with the native Python model:

    from pypmml import Model

    model = Model.load("xgb-iris.pmml")
    model.predict(X_test)
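One way to turn "consistent" into a concrete check is to compare the class probabilities from both models within a small tolerance, since XGBoost computes in single precision and exact equality is unlikely. The helper below is a minimal sketch in plain Python; the function name predictions_match and the default tolerance are assumptions, not part of PyPMML or Nyoka.

```python
import math


def predictions_match(native_proba, pmml_proba, abs_tol=1e-6):
    """Return True if two probability tables agree row by row within abs_tol.

    Both arguments are sequences of rows, each row a sequence of floats
    (for example the result of calling .tolist() on a NumPy array).
    """
    if len(native_proba) != len(pmml_proba):
        return False
    for native_row, pmml_row in zip(native_proba, pmml_proba):
        if len(native_row) != len(pmml_row):
            return False
        for a, b in zip(native_row, pmml_row):
            if not math.isclose(a, b, rel_tol=0.0, abs_tol=abs_tol):
                return False
    return True


# Small stand-in tables so the sketch runs on its own.
native = [[0.90, 0.05, 0.05], [0.10, 0.80, 0.10]]
from_pmml = [[0.9000001, 0.0499999, 0.05], [0.10, 0.80, 0.10]]
print(predictions_match(native, from_pmml, abs_tol=1e-5))
```

With the demo objects, the native side would be y_pred_proba.tolist(). The PMML side depends on the output field names written into the PMML file, so inspect the columns returned by model.predict(X_test) first and select the probability columns in class order before comparing.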

Read PMML and make predictions. The following example uses the Scala interface of PMML4S; its Java interface is just as easy to use. The complete program is in the following Zeppelin Notebook: https://github.com/aipredict/ai-deployment/blob/master/deploy-ml-using-pmml/pmml4s-demo.json

Because GitHub does not render Zeppelin Notebooks, you can browse it at the following address: https://www.zepl.com/viewer/github/aipredict/ai-deployment/master/deploy-ml-using-pmml/pmml4s-demo.json

    import org.pmml4s.model.Model

    val model = Model.fromFile("xgb-iris.pmml")
    val result = model.predict(Map(
      "sepal length (cm)" -> 5.7,
      "sepal width (cm)" -> 4.4,
      "petal length (cm)" -> 1.5,
      "petal width (cm)" -> 0.4))

PMML shortcomings

Although PMML has many advantages, it also has shortcomings:

It cannot express every data pre-processing and post-processing operation. PMML covers almost all standard data processing methods, but it lacks effective support for user-defined operations, which are therefore difficult to put into PMML.

Support for model types is limited. In particular, deep learning models are not supported. Support is planned for the next version, 5.0. At present, Nyoka can export deep learning models such as Keras models, but what it generates is an extended form of PMML rather than standard PMML.

PMML is a loose specification. Some vendors generate PMML that does not fully conform to the standard Schema, and the specification allows vendors to add their own extensions. Both are obstacles when using such models in other tools.
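Before handing a third-party PMML file to an evaluator, it can help to see whether it carries vendor extensions. The sketch below lists every Extension element in a document using only the standard library. The sample document, the vendor name ExampleVendor, and the helper names are hypothetical, and the sample is a parsing fixture rather than a meaningful model.

```python
import xml.etree.ElementTree as ET

# Hand-written sample containing one vendor extension (illustrative only).
SAMPLE_PMML = """
<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header/>
  <DataDictionary numberOfFields="1">
    <DataField name="x" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <Extension extender="ExampleVendor" name="trainingTool" value="demo"/>
    <MiningSchema><MiningField name="x"/></MiningSchema>
    <RegressionTable intercept="0.0"/>
  </RegressionModel>
</PMML>
""".strip()


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def find_extensions(pmml_text):
    """Return (parent element, extender, name) for every Extension element."""
    root = ET.fromstring(pmml_text)
    found = []
    for parent in root.iter():          # every element, in document order
        for child in parent:            # its direct children
            if local_name(child.tag) == "Extension":
                found.append((local_name(parent.tag),
                              child.get("extender"),
                              child.get("name")))
    return found


extensions = find_extensions(SAMPLE_PMML)
print(extensions)
```

An empty result does not prove a file is schema-conformant; it only rules out one common source of incompatibility. Validating against the published XML Schema for the declared PMML version is the more thorough check.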

