How to Deploy, Monitor, and Scale a Machine Learning Model on AWS


This article shows you how to deploy, monitor, and scale a machine learning model on AWS. The content is concise and easy to understand, and I hope you can get something out of this detailed introduction.

Deploying robust and scalable machine learning solutions remains a complex process that requires considerable human involvement and effort. As a result, new products and services take a long time to come to market, or are abandoned at the prototype stage, dampening interest within the industry. So how can we streamline the process of putting machine learning models into production?

Cortex is an open source platform that deploys machine learning models as production web services. It leverages the powerful AWS ecosystem to deploy, monitor, and scale framework-agnostic models as needed. Its main features are summarized as follows:

Framework-agnostic: Cortex serves any Python code; models built with TensorFlow, PyTorch, scikit-learn, and XGBoost are all supported.

Autoscaling: Cortex automatically scales your APIs to handle production load.

CPU/GPU support: with AWS IaaS as the underlying infrastructure, Cortex can run in either CPU or GPU environments.

Spot instances: Cortex supports EC2 Spot instances to reduce costs.

Rolling updates: Cortex applies updates to a model without any downtime.

Log streaming: Cortex saves the logs of deployed models and streams them to the CLI using a docker-like syntax.

Prediction monitoring: Cortex monitors network metrics and tracks predictions.

Minimal configuration: a Cortex deployment is defined in a single, simple YAML file.

In this article, we use Cortex to deploy an image classification model to AWS as a web service. So, without further ado, let's get acquainted with Cortex.

Deploy the model as a Web service

In this example, we use the fast.ai library (https://pypi.org/project/fastai/) and borrow the pet classification model from the first lesson of the related MOOC (https://course.fast.ai/). The following sections describe the installation of Cortex and the deployment of the pet classification model as a web service.

Installation

If you do not already have one, you should first create a new user account on AWS with programmatic access. To do this, select the IAM service, then select Users from the right panel, and press the Add user button. Specify a name for the user and select Programmatic access.

Next, in the Permissions screen, select the Attach existing policies directly tab, and then select AdministratorAccess.

You can leave the tags page blank, then review and create the user. Finally, note down the access key ID and the secret access key.

On the AWS console, you can also create an S3 bucket to store the trained model and any other artifacts your code may generate. You can name the bucket whatever you want, as long as the name is unique. Here, we create a bucket called cortex-pets-model.
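If you prefer to script this step, a minimal boto3 sketch like the one below does the same thing (the bucket name and region are this article's examples; fill in the keys you noted earlier, and remember that bucket names must be globally unique):

import boto3

# create a session with the programmatic-access keys noted earlier
session = boto3.Session(
    aws_access_key_id='',
    aws_secret_access_key='',
)

# create the artifact bucket; outside us-east-1 a LocationConstraint is required
s3 = session.client('s3', region_name='us-west-2')
s3.create_bucket(
    Bucket='cortex-pets-model',
    CreateBucketConfiguration={'LocationConstraint': 'us-west-2'},
)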

Next, we must install Cortex CLI on the system and start the Kubernetes cluster. To install Cortex CLI, run the following command:

bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/0.14/get-cli.sh)"

Check that you are installing the latest version of the Cortex CLI by visiting the appropriate documentation section (https://www.cortex.dev/).

We are now ready to set up a cluster. It is easy to create a Kubernetes cluster using Cortex. Simply execute the following command:

cortex cluster up

Cortex will ask you to provide some information, such as your AWS keys, the region you want to use, and the compute instances you want to launch and how many of them. Cortex will also let you know how much the services you choose will cost. The whole process may take about 20 minutes.

Train your model

Cortex doesn't care how you create or train your model. In this example, we use the fast.ai library and the Oxford-IIIT Pet dataset. This dataset contains 37 different breeds of dogs and cats, so our model should classify each image into one of these 37 categories.

Create a trainer.py file similar to the following:

import boto3
import pickle

from fastai.vision import *

# initialize boto session
session = boto3.Session(
    aws_access_key_id='',
    aws_secret_access_key='',
)

# get the data
path = untar_data(URLs.PETS, dest='sample_data')
path_img = path/'images'
fnames = get_image_files(path_img)

# process the data
bs = 64
pat = r'/([^/]+)_\d+.jpg$'
data = ImageDataBunch.from_name_re(path_img, fnames, pat,
                                   ds_tfms=get_transforms(), size=224, bs=bs)\
                     .normalize(imagenet_stats)

# create, fit and save the model
learn = cnn_learner(data, models.resnet18, metrics=accuracy)
learn.fit_one_cycle(4)

with open('model.pkl', 'wb') as handle:
    pickle.dump(learn.model, handle)

# upload the model to s3
s3 = session.client('s3')
s3.upload_file('model.pkl', 'cortex-pets-model', 'model.pkl')

Run the script locally, as you would any other Python script: python trainer.py

However, be sure to provide your AWS credentials and S3 bucket name. This script fetches the data, processes it, fits a pre-trained ResNet model, and uploads it to S3. Of course, you could extend this script with several techniques (a more complex architecture, discriminative learning rates, training for more epochs) to make the model more accurate, but that has nothing to do with our goal here. If you want to learn more about the ResNet architecture, please refer to the following article.

https://towardsdatascience.com/xresnet-from-scratch-in-pytorch-e64e309af722
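Purely as a hedged illustration of such extensions, a fast.ai v1 sketch might swap in a deeper backbone, unfreeze the network, and apply discriminative learning rates; none of this is required for the deployment itself:

# hypothetical refinement of the training step above (fast.ai v1 API,
# reusing the data object from trainer.py)
learn = cnn_learner(data, models.resnet34, metrics=accuracy)  # deeper backbone
learn.fit_one_cycle(4)                                        # train the head first
learn.unfreeze()                                              # then train all layers
learn.fit_one_cycle(4, max_lr=slice(1e-5, 1e-3))              # lower LRs for early layers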

Deploy the model

Now that we have trained the model and stored it in S3, the next step is to deploy it to a production environment as a web service. To do this, we create a Python script called predictor.py, like the following:

import torch
import boto3
import pickle
import requests

from PIL import Image
from io import BytesIO
from torchvision import transforms

# initialize boto session
session = boto3.Session(
    aws_access_key_id='',
    aws_secret_access_key='',
)

# define the predictor
class PythonPredictor:
    def __init__(self, config):
        s3 = session.client('s3')
        s3.download_file(config['bucket'], config['key'], 'model.pkl')
        self.model = pickle.load(open('model.pkl', 'rb'))
        self.model.eval()

        normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                         std=[0.229, 0.224, 0.225])
        self.preprocess = transforms.Compose(
            [transforms.Resize(224), transforms.ToTensor(), normalize])

        self.labels = ['Abyssinian', 'Bengal', 'Birman', 'Bombay',
                       'British_Shorthair', 'Egyptian_Mau', 'Maine_Coon',
                       'Persian', 'Ragdoll', 'Russian_Blue', 'Siamese',
                       'Sphynx', 'american_bulldog', 'american_pit_bull_terrier',
                       'basset_hound', 'beagle', 'boxer', 'chihuahua',
                       'english_cocker_spaniel', 'english_setter',
                       'german_shorthaired', 'great_pyrenees', 'havanese',
                       'japanese_chin', 'keeshond', 'leonberger',
                       'miniature_pinscher', 'newfoundland', 'pomeranian',
                       'pug', 'saint_bernard', 'samoyed', 'scottish_terrier',
                       'shiba_inu', 'staffordshire_bull_terrier',
                       'wheaten_terrier', 'yorkshire_terrier']

        self.device = config['device']

    def predict(self, payload):
        image = requests.get(payload["url"]).content
        img_pil = Image.open(BytesIO(image))
        img_tensor = self.preprocess(img_pil)
        img_tensor.unsqueeze_(0)
        img_tensor = img_tensor.to(self.device)
        with torch.no_grad():
            prediction = self.model(img_tensor)
        _, index = prediction[0].max(0)
        return self.labels[index]

This file defines a predictor class. When it is instantiated, it retrieves the model from S3, loads it into memory, and defines the necessary transformations and parameters. During inference, it reads an image from the given URL and returns the name of the predicted class. It has one method for initialization, __init__, and one for prediction, predict, which receives the payload and returns the result.
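Before deploying, you can sanity-check the class locally; a hypothetical smoke test, assuming the credentials in predictor.py are filled in and model.pkl already sits in the bucket:

# hypothetical local test for predictor.py; the image URL is the
# Pomeranian picture used for the curl test later in this article
from predictor import PythonPredictor

config = {'bucket': 'cortex-pets-model', 'key': 'model.pkl', 'device': 'cpu'}
predictor = PythonPredictor(config)

print(predictor.predict({'url': 'https://i.imgur.com/HPRQ28l.jpeg'}))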

The predictor script has two accompanying files: a requirements.txt file that records the library dependencies (such as pytorch, fastai, boto3, etc.) and a YAML configuration file.
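A plausible requirements.txt for this example might look like the following (version pins are omitted here and would depend on your environment):

fastai
torch
torchvision
boto3
pillow
requests

The minimal YAML configuration, in turn, is as follows: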

- name: pets-classifier
  predictor:
    type: python
    path: predictor.py
    config:
      bucket: cortex-pets-model
      key: model.pkl
      device: cpu

In this YAML file, we define which script to run for inference, on which device (such as CPU), and where to find the trained model. More options are available in the documentation.
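For instance, you could pin the compute resources and autoscaling bounds explicitly; a sketch using the compute and autoscaling fields that also appear in the cortex get output later in this article:

- name: pets-classifier
  predictor:
    type: python
    path: predictor.py
    config:
      bucket: cortex-pets-model
      key: model.pkl
      device: cpu
  compute:
    cpu: 200m
  autoscaling:
    min_replicas: 1
    max_replicas: 100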

Finally, the structure of the project should follow the hierarchy below. Please note that this is the minimum requirement; you can omit trainer.py if you already have a model ready to deploy.

- Project name
  |- trainer.py
  |- predictor.py
  |- requirements.txt
  |- cortex.yaml

With all this in place, you only need to run cortex deploy, and within seconds your new endpoint is ready to accept requests. Execute cortex get pets-classifier to monitor the endpoint and view other details.

status   up-to-date   requested   last update   avg request   2XX
live     1            1           13m           -             -

endpoint: http://a984d095c6d3a11ea83cc0acfc96419b-1937254434.us-west-2.elb.amazonaws.com/pets-classifier
curl: curl http://a984d095c6d3a11ea83cc0acfc96419b-1937254434.us-west-2.elb.amazonaws.com/pets-classifier?debug=true -X POST -H "Content-Type: application/json" -d @sample.json

configuration
name: pets-classifier
endpoint: /pets-classifier
predictor:
  type: python
  path: predictor.py
  config:
    bucket: cortex-pets-model
    device: cpu
    key: model.pkl
compute:
  cpu: 200m
autoscaling:
  min_replicas: 1
  max_replicas: 100
  init_replicas: 1
  workers_per_replica: 1
  threads_per_worker: 1
  target_replica_concurrency: 1.0
  max_replica_concurrency: 1024
  window: 1m0s
  downscale_stabilization_period: 5m0s
  upscale_stabilization_period: 0s
  max_downscale_factor: 0.5
  max_upscale_factor: 10.0
  downscale_tolerance: 0.1
  upscale_tolerance: 0.1
update_strategy:
  max_surge: 25%
  max_unavailable: 25%

All that's left is to test it with curl and an image of a Pomeranian:

curl http://a984d095c6d3a11ea83cc0acfc96419b-1937254434.us-west-2.elb.amazonaws.com/pets-classifier -X POST -H "Content-Type: application/json" -d '{"url": "https://i.imgur.com/HPRQ28l.jpeg"}'
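The same request from Python, should you prefer it over curl (the endpoint URL is the one reported by cortex get above; yours will differ):

import requests

# client-side call to the deployed Cortex endpoint
endpoint = 'http://a984d095c6d3a11ea83cc0acfc96419b-1937254434.us-west-2.elb.amazonaws.com/pets-classifier'
response = requests.post(endpoint, json={'url': 'https://i.imgur.com/HPRQ28l.jpeg'})
print(response.text)  # expected: the predicted breed, e.g. pomeranian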

Release resources

When we are done with the service and the cluster, we should release the resources to avoid extra costs. With Cortex this is easy:

cortex delete pets-classifier
cortex cluster down

Conclusion

In this article, we saw how to use Cortex, an open source platform, to deploy machine learning models as production web services. We trained an image classifier, deployed it on AWS, monitored its performance and tested it.
