
How to deploy generic cloud native model applications based on OAM and KFServing


This article shows how to deploy a generic cloud native model application based on OAM and KFServing. The walkthrough is concise and easy to follow, and hopefully you will get something useful out of it.

KFServing is Kubeflow's serverless component for building and deploying standardized algorithm models, but it is deeply bound to Knative and hides the transport link (for example, it encapsulates Istio). Such a complex structure is hard to adopt directly in a production environment. Here, OAM as implemented by kubevela is used to re-encapsulate the serverless process into a simple serverless setup for algorithm models.

Background

Providing efficient cloud engineering support for an algorithm team is an important and meaningful topic in the cloud native era. The most complete option in the open source community today is arguably Kubeflow, a collection of tools for ML experimentation and deployment, but as a whole it is fairly heavyweight and not well suited for a small team that needs to get a production environment running quickly. Below is an example of implementing a standardized algorithm model based on kubevela and KFServing, for reference.

Project introduction

Project address: https://github.com/shikanon/vela-example/tree/main/example/sklearnserver

Three objects are provided through kubevela: mpserver, httproute, and hpa.

mpserver is mainly responsible for generating the Deployment and Service resources and is the main body of the application (a sketch of what it expands to follows this list).

httproute is mainly responsible for exposing ports and generating the access URL.

hpa mainly ensures the scalability of the service.
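
To make the division of labor concrete, here is a hedged sketch, using the official kubernetes Python client, of roughly the Deployment and Service that an mpserver component expands to. The names, labels, and resource values are illustrative assumptions lifted from the appfile shown later; kubevela renders the real resources from its own component definitions.

from kubernetes import client, config

config.load_kube_config()

labels = {"app": "demo-iris"}

# Roughly the Deployment an mpserver component would generate.
deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="demo-iris", namespace="rcmd"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels=labels),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels=labels),
            spec=client.V1PodSpec(containers=[client.V1Container(
                name="demo-iris",
                image="swr.cn-north-4.myhuaweicloud.com/hw-zt-k8s-images/sklearnserver:demo-iris",
                ports=[client.V1ContainerPort(container_port=8080)],
                resources=client.V1ResourceRequirements(
                    requests={"cpu": "200m", "memory": "250Mi"}),
            )]),
        ),
    ),
)

# And the matching Service that fronts the pods.
service = client.V1Service(
    metadata=client.V1ObjectMeta(name="demo-iris", namespace="rcmd"),
    spec=client.V1ServiceSpec(
        selector=labels,
        ports=[client.V1ServicePort(port=8080, target_port=8080)],
    ),
)

client.AppsV1Api().create_namespaced_deployment("rcmd", deployment)
client.CoreV1Api().create_namespaced_service("rcmd", service)

In the real workflow you never create these objects by hand; vela up renders and applies them for you.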

Pre-deployment preparation

Because vela is used here, you first need to install the vela CLI.

Create a sklearn service

The example lives under example/sklearnserver.

Build the image locally:

# compile
docker build -t swr.cn-north-4.myhuaweicloud.com/hw-zt-k8s-images/sklearnserver:demo-iris -f sklearn.Dockerfile .

Push it to the Huawei Cloud image registry:

docker login swr.cn-north-4.myhuaweicloud.com
docker push swr.cn-north-4.myhuaweicloud.com/hw-zt-k8s-images/sklearnserver:demo-iris

Create the application file demo-iris-01.yaml:

name: demo-iris-01
services:
  demo-iris:
    type: mpserver
    image: swr.cn-north-4.myhuaweicloud.com/hw-zt-k8s-images/sklearnserver:demo-iris
    ports: [8080]
    cpu: "200m"
    memory: "250Mi"
    httproute:
      gateways: ["external-gateway"]
      hosts: ["demo-iris.rcmd.testing.mpengine"]
      servernamespace: rcmd
      serverport: 8080
    hpa:
      min: 1
      max: 1
      cpuPercent: 60

Because the rcmd namespace is used here, you need to switch to it when creating the application. You can create an rcmd namespace environment through the visual interface with vela dashboard:

vela dashboard

Once it is created, you can view it with vela env:

$ vela env ls
NAME      CURRENT   NAMESPACE   EMAIL   DOMAIN
default             default
rcmd      *         rcmd

Run the application in the cloud native environment

$ vela up -f demo-iris-01.yaml
Parsing vela appfile ...
Load Template ...

Rendering configs for service (demo-iris)...
Writing deploy config to (.vela/deploy.yaml)

Applying application ...
Checking if app has been deployed...
App has not been deployed, creating a new deployment...
✅ App has been deployed 🚀🚀🚀
    Port forward: vela port-forward demo-iris-01
             SSH: vela exec demo-iris-01
         Logging: vela logs demo-iris-01
      App status: vela status demo-iris-01
  Service status: vela status demo-iris-01 --svc demo-iris

After deployment, you can test it:

$ curl -i -d '{"instances": [[5.1, 3.5, 1.4, 0.2]]}' -H "Content-Type: application/json" -X POST demo-iris.rcmd.testing.mpengine:8000/v1/models/model:predict

{"predictions": [0]}
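
The same request can also be issued from Python. A minimal sketch using the requests library, assuming the gateway host and port from the curl example above are reachable from where the script runs:

import requests

# Predict endpoint exposed through the external gateway (host and port
# taken from the curl example above).
URL = "http://demo-iris.rcmd.testing.mpengine:8000/v1/models/model:predict"

# One iris sample: sepal length, sepal width, petal length, petal width.
payload = {"instances": [[5.1, 3.5, 1.4, 0.2]]}

resp = requests.post(URL, json=payload, timeout=5)
resp.raise_for_status()
print(resp.json())  # expected: {"predictions": [0]}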

Implementation description

Developing the algorithm server with KFServer

KFServer provides servers for a variety of commonly used frameworks, including server implementations for sklearn, lgb, xgb, pytorch, and other stacks. KFServer is built on the tornado framework and provides abstract interfaces for model loading, health checks, prediction, and explanation. For details, see kfserving/kfserving/kfserver.py:

...
    def create_application(self):
        return tornado.web.Application([
            # Server Liveness API returns 200 if server is alive.
            (r"/", LivenessHandler),
            (r"/v2/health/live", LivenessHandler),
            (r"/v1/models", ListHandler, dict(models=self.registered_models)),
            (r"/v2/models", ListHandler, dict(models=self.registered_models)),
            # Model Health API returns 200 if model is ready to serve.
            (r"/v1/models/([a-zA-Z0-9_-]+)", HealthHandler, dict(models=self.registered_models)),
            (r"/v2/models/([a-zA-Z0-9_-]+)/status", HealthHandler, dict(models=self.registered_models)),
            (r"/v1/models/([a-zA-Z0-9_-]+):predict", PredictHandler, dict(models=self.registered_models)),
            (r"/v2/models/([a-zA-Z0-9_-]+)/infer", PredictHandler, dict(models=self.registered_models)),
            (r"/v1/models/([a-zA-Z0-9_-]+):explain", ExplainHandler, dict(models=self.registered_models)),
            (r"/v2/models/([a-zA-Z0-9_-]+)/explain", ExplainHandler, dict(models=self.registered_models)),
            (r"/v2/repository/models/([a-zA-Z0-9_-]+)/load", LoadHandler, dict(models=self.registered_models)),
            (r"/v2/repository/models/([a-zA-Z0-9_-]+)/unload", UnloadHandler, dict(models=self.registered_models)),
        ])
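
The route table above also defines liveness and readiness endpoints, which are handy for debugging a server locally before putting a gateway in front of it. A small sketch, assuming a server listening on localhost:8080 (the address is an assumption; it matches the container port used in this example):

import requests

BASE = "http://localhost:8080"

# Server liveness: "/" returns 200 as long as the tornado server is up.
print(requests.get(BASE + "/").status_code)

# List the registered models.
print(requests.get(BASE + "/v1/models").json())

# Model readiness: returns 200 once the model's load() has set ready = True.
print(requests.get(BASE + "/v1/models/model").status_code)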

The sklearn server used in this example mainly implements the predict interface:

import kfserving
import joblib
import numpy as np
import os
from typing import Dict

MODEL_BASENAME = "model"
MODEL_EXTENSIONS = [".joblib", ".pkl", ".pickle"]


class SKLearnModel(kfserving.KFModel):  # pylint:disable=c-extension-no-member
    def __init__(self, name: str, model_dir: str):
        super().__init__(name)
        self.name = name
        self.model_dir = model_dir
        self.ready = False

    def load(self) -> bool:
        model_path = kfserving.Storage.download(self.model_dir)
        paths = [os.path.join(model_path, MODEL_BASENAME + model_extension)
                 for model_extension in MODEL_EXTENSIONS]
        for path in paths:
            if os.path.exists(path):
                self._model = joblib.load(path)
                self.ready = True
                break
        return self.ready

    def predict(self, request: Dict) -> Dict:
        instances = request["instances"]
        try:
            inputs = np.array(instances)
        except Exception as e:
            raise Exception(
                "Failed to initialize NumPy array from inputs: %s, %s" % (e, instances))
        try:
            result = self._model.predict(inputs).tolist()
            return {"predictions": result}
        except Exception as e:
            raise Exception("Failed to predict %s" % e)
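
For completeness, here is a short sketch of how the model.joblib artifact that load() searches for might be produced. This training script is an illustrative assumption, not part of the project:

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Train a classifier on the iris dataset (the test payload above is one iris
# sample) and dump it under the basename SKLearnModel.load() expects.
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=200).fit(X, y)
joblib.dump(clf, "model.joblib")  # MODEL_BASENAME + ".joblib"

The server entrypoint then roughly does: instantiate SKLearnModel("model", model_dir), call load(), and pass the model to kfserving.KFServer().start([model]). The model name "model" is what makes the /v1/models/model:predict path in the test request resolve.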

That covers deploying a generic cloud native model application based on OAM and KFServing. Hopefully the walkthrough gives you a workable starting point for standardizing model serving in your own environment.
