This article mainly explains how to use TFServing. The content is simple and clear and should be easy to learn and understand.
1. What is TFserving?
When you have trained a model and need to make it available to external callers, you have to deploy it online and provide a suitable interface for external calls. You will probably consider questions such as:
What should be used for the deployment?
How do you provide an API interface?
How do you allocate GPU resources among multiple models?
How do you update the online model without interrupting the service?
Among the currently popular deep learning frameworks, TensorFlow and PyTorch, PyTorch does not officially provide a comparable online deployment solution, while TensorFlow provides TFServing for deploying models for online inference. In addition, Model Server for Apache MXNet provides inference services for MXNet models.
This article is a guide to using TFServing. If your model is a PyTorch or MXNet model, you can also convert it into a TFServing model through ONNX and deploy it on TFServing.
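As a rough illustration of the first hop of that path, the sketch below exports a PyTorch model to ONNX. The tiny network, file name and input shape are placeholders chosen only for the example and are not part of the original article.

import torch
import torch.nn as nn

# TinyNet is a stand-in model purely for illustration; replace it with your own network
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

model = TinyNet()
model.eval()  # export in inference mode
dummy_input = torch.randn(1, 4)  # a dummy input with the shape the model expects
torch.onnx.export(model, dummy_input, "model.onnx",
                  input_names=["input"], output_names=["output"])
# model.onnx can then be converted to a TensorFlow .pb (for example with the onnx-tf
# converter) and finally exported to the TFServing format as described below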
So what is TFServing?
TFServing is an online inference service launched by Google in 2017. It uses a client/server architecture: clients can communicate with the model service via gRPC or a RESTful API.
Characteristics of TFServing:
Support for model versioning and rollback: the Manager manages model versions
Support for concurrency and high throughput
Works out of the box and is customizable
Support for multi-model serving
Support for batch processing
Support for hot updates: the Source loads the local model and notifies the Manager that a new model needs to be loaded; the Manager checks the model version and notifies the Loader created by the Source to load the model
Support for distributed models
2. TFServing installation
It is strongly recommended to install TFServing via docker. This depends on docker and, for the GPU version of TFServing, nvidia-docker.
Docker installation
# install yum-utils and the device-mapper related dependency packages
yum install -y yum-utils device-mapper-persistent-data lvm2
# add the docker-ce stable repository
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# update the yum cache
yum makecache fast
# list all installable docker-ce versions
yum list docker-ce --showduplicates | sort -r
# install docker-ce
yum install docker-ce-17.12.1.ce-1.el7.centos
# enable the docker-ce service at boot
systemctl enable docker.service
# start the docker-ce service
systemctl start docker
# run the hello-world test container
docker run --rm hello-world
Nvidia-docker installation
# install nvidia-docker2
yum install -y nvidia-docker2-2.0.3-1.docker17.12.1.ce
# restart the docker service
service docker restart
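As an optional sanity check (a minimal sketch, assuming an NVIDIA driver is installed and the nvidia/cuda:9.0-base image can still be pulled), the GPU should be visible from inside a container:

# smoke test: nvidia-smi output listing your GPU(s) indicates nvidia-docker is working
nvidia-docker run --rm nvidia/cuda:9.0-base nvidia-smi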
Install TFserving
docker pull tensorflow/serving:latest-gpu
# other versions can be chosen, e.g. docker pull tensorflow/serving:1.14.0-rc0-gpu
Note: the docker and nvidia-docker versions must match.
For the latest nvidia-docker, Docker 19.03 is required; please refer to the official documentation at https://github.com/NVIDIA/nvidia-docker.
nvidia-docker2 supports Docker versions lower than 19.03 (at least 1.12 is required); for example, the existing servers run 18.09, 1.17 and 1.13. See https://github.com/NVIDIA/nvidia-docker/wiki/Installation-(version-2.0).
3. Instructions for using TFServing
3.1 Model transformation
A model must be converted to the TFServing format before it can be served; TFServing does not directly serve the usual checkpoint or frozen .pb formats.
A TFServing model consists of a saved_model.pb file and a variables directory (which may be empty), exported with the following layout:
├── 1
│   ├── saved_model.pb
│   └── variables
├── 2
│   ├── saved_model.pb
│   └── variables
The conversion paths for different deep learning frameworks are:
(1) PyTorch (.pth) --> ONNX (.onnx) --> TensorFlow (.pb) --> TFServing
(2) Keras (.h5) --> TensorFlow (.pb) --> TFServing (a minimal sketch of this path follows the list)
(3) TensorFlow (.pb) --> TFServing
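For path (2), the following is a minimal sketch, written in the TF 1.x style used elsewhere in this article, that goes directly from a Keras .h5 file to the TFServing export layout (skipping the intermediate frozen .pb). The file paths and the assumption that the .h5 was saved with tf.keras are illustrative; simple_save writes the default 'serving_default' signature.

import tensorflow as tf

h5_path = 'your_model.h5'        # hypothetical Keras model file
export_path = './keras_model/1'  # version directory expected by TFServing

tf.keras.backend.set_learning_phase(0)       # export in inference mode
model = tf.keras.models.load_model(h5_path)  # assumes the .h5 was saved with tf.keras
sess = tf.keras.backend.get_session()

# writes saved_model.pb plus a variables directory under export_path
tf.saved_model.simple_save(
    sess,
    export_path,
    inputs={'input': model.input},
    outputs={'output': model.output})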
The conversion from a .pb model to the TFServing format is described in detail below.
import tensorflow as tf


def create_graph(pb_file):
    """Creates a graph from a saved GraphDef (.pb) file."""
    with tf.gfile.FastGFile(pb_file, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        _ = tf.import_graph_def(graph_def, name='')


def pb_to_tfserving(pb_file, export_path, pb_io_name=[], input_node_name='input',
                    output_node_name='output', signature_name='default_tfserving'):
    # pb_io_name: input and output node names of the pb model
    # input_node_name: input name after conversion
    # output_node_name: output name after conversion
    # signature_name: the signature name
    create_graph(pb_file)
    # tensor_name_list = [tensor.name for tensor in tf.get_default_graph().as_graph_def().node]
    input_name = '%s:0' % pb_io_name[0]
    output_name = '%s:0' % pb_io_name[1]
    with tf.Session() as sess:
        in_tensor = sess.graph.get_tensor_by_name(input_name)
        out_tensor = sess.graph.get_tensor_by_name(output_name)
        builder = tf.saved_model.builder.SavedModelBuilder(export_path)  # export_path is the export path
        inputs = {input_node_name: tf.saved_model.utils.build_tensor_info(in_tensor)}
        outputs = {output_node_name: tf.saved_model.utils.build_tensor_info(out_tensor)}
        signature = tf.saved_model.signature_def_utils.build_signature_def(
            inputs, outputs,
            method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME)
        builder.add_meta_graph_and_variables(
            sess=sess,
            tags=[tf.saved_model.tag_constants.SERVING],
            signature_def_map={signature_name: signature},  # signature_name is the signature, customizable
            clear_devices=True)
        builder.save()


pb_model_path = 'test.pb'
pb_to_tfserving(pb_model_path, './1', pb_io_name=['input_1_1', 'output_1'], signature_name='your_model')
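After exporting, the saved_model_cli tool that ships with TensorFlow can be used to double-check the signature and the input/output tensors (here './1' is the export path used above):

# inspect the exported model; the output should list the signature (e.g. 'your_model')
# with its input and output tensors
saved_model_cli show --dir ./1 --all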
3.2 TFserving configuration and startup
After the model is exported, different versions of the same model (distinguished by the version-number directory) can be exported. The models and versions to serve are specified in the TFServing configuration; a model is uniquely identified by its model name and signature. TFServing can be configured with multiple models to make full use of GPU resources.
Model configuration
# models.config
model_config_list {
  config {
    name: 'your_model'
    base_path: '/models/your_model/'
    model_platform: 'tensorflow'
    # model_version_policy {
    #   specific {
    #     versions: 42
    #     versions: 43
    #   }
    # }
    # version_labels {
    #   key: 'stable'
    #   value: 43
    # }
    # version_labels {
    #   key: 'canary'
    #   value: 43
    # }
  }
  config {
    name: "mnist"
    base_path: "/models/mnist"
    model_platform: "tensorflow"
    model_version_policy: {
      specific: {
        versions: 1
        versions: 2
      }
    }
  }
}
# version control can be done through model_version_policy
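As a side note on the versions pinned above: once the service is running (next step), a specific version can also be addressed directly over the RESTful API. The host and port below follow the later examples and are assumptions to adapt to your setup.

# status of a single configured version (model "mnist", version 1)
curl http://192.168.0.3:8501/v1/models/mnist/versions/1
# a prediction request can likewise be pinned to a version via
# http://192.168.0.3:8501/v1/models/mnist/versions/1:predict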
Start the service
# it is recommended to put the models and configuration files in a local path outside of docker,
# such as /home/tfserving/models, and mount it into docker with -v
# --model_config_file: specify the model config file
# -e NVIDIA_VISIBLE_DEVICES=0: specify the GPU
# -p: port mapping, 8500 is the gRPC port and 8501 is the RESTful API port
# -t: the docker image
nvidia-docker run -it --privileged -d -e NVIDIA_VISIBLE_DEVICES=0 \
  -v /home/tfserving/models:/models -p 8500:8500 -p 8501:8501 \
  -t tensorflow/serving:latest-gpu \
  --model_config_file=/models/models.config

# /home/tfserving/models structure:
├── models.config
└── your_model
    ├── 1
    │   ├── saved_model.pb
    │   └── variables
    └── 2
        ├── saved_model.pb
        └── variables

# test
curl http://192.168.0.3:8501/v1/models/your_model
{
 "model_version_status": [
  {
   "version": "2",
   "state": "AVAILABLE",
   "status": {
    "error_code": "OK",
    "error_message": ""
   }
  }
 ]
}

# other startup methods
# if multiple models are in different directories, they can be loaded separately via --mount
nvidia-docker run -it --privileged -d -e NVIDIA_VISIBLE_DEVICES=0 \
  --mount type=bind,source=/home/tfserving/models/your_model,target=/models/your_model \
  --mount type=bind,source=/home/tfserving/models/your_model/models.config,target=/models/models.config \
  -p 8510:8500 -p 8501:8501 \
  -t tensorflow/serving:latest-gpu \
  --model_config_file=/models/models.config
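Besides the status check above, the metadata endpoint is handy for confirming which signature names and input/output tensors the served model actually exposes (same assumed host and ports as above):

curl http://192.168.0.3:8501/v1/models/your_model/metadata
# returns the model spec plus its signature_def map (signature names, inputs, outputs)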
3.3 TFserving service invocation
Clients can call the TFServing service through gRPC or HTTP, and multiple client languages are supported. Python examples are provided here. A call uniquely identifies a model by its model name and signature.
gRPC call (the gRPC port is 8500):
# -*- coding: utf-8 -*-
import time

import cv2
import grpc
import numpy as np
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc


class YourModel(object):
    def __init__(self, socket):
        """
        Args:
            socket: host and port of the tfserving, e.g. 192.168.0.3:8500
        """
        self.socket = socket
        start = time.time()
        self.request, self.stub = self.__get_request()
        end = time.time()
        print('initialize cost time: ' + str(end - start) + ' s')

    def __get_request(self):
        channel = grpc.insecure_channel(
            self.socket,
            options=[('grpc.max_send_message_length', 1024 * 1024 * 1024),
                     ('grpc.max_receive_message_length', 1024 * 1024 * 1024)])  # adjustable size
        stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
        request = predict_pb2.PredictRequest()
        request.model_spec.name = "your_model"  # model name
        request.model_spec.signature_name = "your_model"  # model signature name
        return request, stub

    def run(self, image):
        """
        Args:
            image: the input image (rgb format)
        Returns:
            the output of the model
        """
        img = image[..., ::-1]
        self.request.inputs['input'].CopyFrom(tf.contrib.util.make_tensor_proto(img))  # 'input' is the model's input
        result = self.stub.Predict(self.request, 30.0)
        return tf.make_ndarray(result.outputs['output'])

    def run_file(self, image_file):
        """
        Args:
            image_file: the input image file
        Returns:
            the output of the model
        """
        image = cv2.imread(image_file)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        return self.run(image)


if __name__ == '__main__':
    model = YourModel('192.168.0.3:8500')
    test_file = './test.jpg'
    result = model.run_file(test_file)
    print(result)
    # [8.014745e-05 9.999199e-01]
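If the signature name used above is not known in advance, it can be queried over the same gRPC channel. The following is a small sketch, assuming the same host, port and model name as the example above; the returned metadata contains a serialized SignatureDefMap.

import grpc
from tensorflow_serving.apis import get_model_metadata_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

channel = grpc.insecure_channel('192.168.0.3:8500')
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = get_model_metadata_pb2.GetModelMetadataRequest()
request.model_spec.name = 'your_model'
request.metadata_field.append('signature_def')  # only request the signature definitions

response = stub.GetModelMetadata(request, 10.0)
print(response.model_spec.version.value)   # the version actually being served
print(response.metadata['signature_def'])  # serialized SignatureDefMap with the signature names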
RESTful API call (the RESTful port is 8501):
import json

import cv2
import numpy as np
import requests


class SelfEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, bytes):
            return str(obj, encoding='utf-8')
        return json.JSONEncoder.default(self, obj)


image_file = '/home/tfserving/test.jpg'
image = cv2.imread(image_file)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
img = image[..., ::-1]

input_data = {"signature_name": "your_model", "instances": img}
data = json.dumps(input_data, cls=SelfEncoder, indent=None)
result = requests.post("http://192.168.0.3:8501/v1/models/your_model:predict", data=data)
print(json.loads(result.content))
# {'predictions': [8.01474525e-05, 0.999919891]}
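The same prediction can also be issued with curl. The toy 2x2x3 "image" below only illustrates the request shape and is an assumption for demonstration; a real call would send the full image array exactly as the Python example does.

curl -X POST http://192.168.0.3:8501/v1/models/your_model:predict \
  -d '{"signature_name": "your_model", "instances": [[[[0,0,0],[0,0,0]],[[0,0,0],[0,0,0]]]]}'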