This article explains how to design a Python interface for machine learning engineering. The approach introduced here is simple, fast, and practical, so let's take a look.
1. Predictor is just a Python class
At the heart of Cortex is our predictor, which is essentially a prediction API including all of its request handling code and dependencies. The predictor interface enforces a few simple requirements on these prediction APIs.
Because Cortex takes a microservice approach to model serving, the predictor interface focuses strictly on two things:
initializing the model
serving predictions
In this spirit, Cortex's predictor interface requires just two methods, __init__() and predict(), which do more or less what you would expect:
import torch
from transformers import pipeline

class PythonPredictor:
    def __init__(self, config):
        # Use GPUs, if available
        device = 0 if torch.cuda.is_available() else -1

        # Initialize model
        self.summarizer = pipeline(task="summarization", device=device)

    def predict(self, payload):
        # Generate prediction
        summary = self.summarizer(
            payload["text"],
            num_beams=4,
            length_penalty=2.0,
            max_length=142,
            no_repeat_ngram_size=3,
        )

        # Return prediction
        return summary[0]["summary_text"]
After initialization, you can think of a predictor as a Python object whose single predict() function is invoked when the user queries the endpoint.
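To make that concrete, here is a minimal client-side sketch. The URL is hypothetical; a real deployment supplies its own endpoint:

import requests

# Hypothetical endpoint URL; the real one comes from your deployment
url = "http://localhost:8888/text-summarizer"

# The JSON body becomes the `payload` argument of predict()
response = requests.post(url, json={"text": "Machine learning in production is ..."})

print(response.json())  # the summary string returned by predict()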
One of the biggest benefits of this approach is that it is intuitive to anyone with software engineering experience. There is no need to touch data pipelines or model training code: a model is just a file, and a predictor is just an object that imports the model and runs its predict() method.
Beyond its syntactic appeal, however, this approach offers some key benefits, namely in how it complements Cortex's broader design.
2. Prediction is just an HTTP request
One of the complexities of building an interface for serving predictions in production is that the input will almost certainly differ from the model's training data, at least in format.
This works on two levels:
The body of a POST request is not a NumPy array, nor is it whatever data structure your model expects (see the sketch after this list).
Machine learning engineering is all about using models to build software, which usually means using models to process data they haven't been trained on, such as writing folk music using GPT-2.
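To illustrate the first point, here is a minimal sketch of the translation a predictor typically performs between a JSON body and the array a model actually consumes. The "features" field is hypothetical:

import numpy as np

def to_model_input(payload):
    # The POST body arrives as parsed JSON (dicts, lists, strings),
    # not as the float32 array the model was trained on, so the
    # predictor converts it explicitly
    return np.asarray(payload["features"], dtype=np.float32).reshape(1, -1)

# A JSON body {"features": [0.2, 1.4, 3.1]} becomes a (1, 3) array
print(to_model_input({"features": [0.2, 1.4, 3.1]}).shape)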
Therefore, the predictor interface cannot be opinionated about the inputs and outputs of a prediction API. A prediction is just an HTTP request, and developers are free to handle it however they like. For example, if they want to deploy a multi-model endpoint and query different models based on request parameters, they can do this:
from transformers import pipeline
from starlette.responses import JSONResponse

class PythonPredictor:
    def __init__(self, config):
        self.analyzer = pipeline(task="sentiment-analysis")
        self.summarizer = pipeline(task="summarization")

    def predict(self, query_params, payload):
        model_name = query_params.get("model")

        if model_name == "sentiment":
            return self.analyzer(payload["text"])[0]
        elif model_name == "summarizer":
            # Return the generated summary text
            summary = self.summarizer(payload["text"])[0]
            return summary["summary_text"]
        else:
            return JSONResponse({"error": f"unknown model: {model_name}"}, status_code=400)
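As a usage sketch (the URL is again hypothetical), a caller selects a model through the "model" query parameter:

import requests

url = "http://localhost:8888/multi-model"  # hypothetical endpoint

# Route the request to the sentiment model via the query string
r = requests.post(url, params={"model": "sentiment"}, json={"text": "Cortex makes deployment simple."})

print(r.json())  # e.g. {"label": "POSITIVE", "score": 0.99}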
While this interface gives developers the freedom to do what they want with their APIs, it also provides a natural scope for Cortex to be more opinionated about the infrastructure.
For example, under the hood Cortex uses FastAPI to set up request routing. At this layer Cortex implements a number of processes related to autoscaling, monitoring, and other infrastructure features, which would become very complex if developers had to implement the routing themselves.
However, because every API has a predict() method, every API has the same number of routes: one. Being able to assume this lets Cortex do more at the infrastructure level without constraining engineers.
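To make the one-route idea concrete, here is a minimal sketch of how a predictor-style object could sit behind a single FastAPI route. This only illustrates the pattern, not Cortex's actual internal wiring, and the predictor module name is assumed:

from fastapi import FastAPI

# Assumed module name, containing the PythonPredictor class defined earlier
from predictor import PythonPredictor

app = FastAPI()
predictor = PythonPredictor(config={})

# Every prediction API exposes exactly one route, so the routing
# layer stays identical no matter what the model code does
@app.post("/predict")
def predict(payload: dict):
    return {"prediction": predictor.predict(payload)}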
3. Serving a model is just a microservice
For people using machine learning in production, scale is a major concern. Models can be large (GPT-2 is about 6 GB), computationally expensive, and high-latency. For real-time inference in particular, scaling up to handle traffic can be a challenge, especially on a limited budget.
To solve this problem, Cortex treats predictors as microservices that can be scaled horizontally. More specifically, when developers deploy with Cortex, it containerizes the API, spins up a cluster prepared for inference, and deploys. It then exposes the API as a web service behind a load balancer and configures autoscaling, updates, and monitoring.
The predictor interface is the basis for this process, although it is "just" a Python interface.
What the predictor interface does is force code to be packaged into a single atomic unit of inference. All of the request handling code needed by a single API is contained in a single predictor. This makes it easy for Cortex to scale predictors.
In this way, engineers don't have to do any extra work (unless, of course, they want to make some adjustments) to prepare an API for production. A Cortex deployment is production-ready by default.
At this point, you should have a deeper understanding of how to design a Python interface for machine learning engineering. Now go put it into practice!