2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article introduces how to build a machine learning system on Kubernetes, walking through the process with a practical example. The steps are simple and practical, and we hope this article helps you solve the problem of "how to build a machine learning system on Kubernetes".
What is Kubeflow Pipelines?
The Kubeflow Pipelines platform includes:
A management console for running and tracking experiments
A workflow engine (Argo) that executes multi-step machine learning pipelines
An SDK for defining and customizing workflows; currently only Python is supported
The goal of Kubeflow Pipelines is:
End-to-end task scheduling: supports the orchestration and organization of complex machine learning workflows, which can be triggered manually, on a schedule, by events, or even by changes in data
Simple experiment management: helps data scientists try out a wide range of ideas and frameworks and manage their experiments, making the transition from experiment to production easy
Easy reuse through componentization: quickly create end-to-end solutions by reusing pipelines and components, without having to rebuild from scratch every time
Run Kubeflow Pipelines on Ali Cloud
Having seen what Kubeflow Pipelines can do, you may want to try it yourself. At present, however, there are two challenges to using Kubeflow Pipelines in China:
Pipelines must be deployed as part of Kubeflow; Kubeflow bundles many default components, and deploying it through Ksonnet is complicated
Pipelines itself is deeply coupled to Google's cloud platform and cannot run on other cloud platforms or on bare metal servers
To make it easier for domestic users to install Kubeflow Pipelines, the Aliyun CCS team provides a Kustomize-based Kubeflow Pipelines deployment solution. Unlike the ordinary stateless Kubeflow services, Kubeflow Pipelines depends on stateful services such as mysql and minio, which means you need to consider how to persist and back up their data. In this example, we use Ali Cloud SSD cloud disks as the data persistence solution, automatically creating one SSD cloud disk each for mysql and minio. With this, you can deploy the latest version of Kubeflow Pipelines standalone on Aliyun.
Prerequisites
You need to install kustomize
In Linux and Mac OS environments, you can execute
opsys=linux  # or darwin, or windows
curl -s https://api.github.com/repos/kubernetes-sigs/kustomize/releases/latest |\
  grep browser_download |\
  grep $opsys |\
  cut -d '"' -f 4 |\
  xargs curl -O -L
mv kustomize_*_${opsys}_amd64 /usr/bin/kustomize
chmod u+x /usr/bin/kustomize
In the Windows environment, you can download kustomize_2.0.3_windows_amd64.exe
To create a Kubernetes cluster in Ali Cloud CCS, please refer to the documentation.
Deployment process
Access the Kubernetes cluster through ssh. For more information, please see the documentation.
Download the source code
yum install -y git
git clone --recursive https://github.com/aliyunContainerService/kubeflow-aliyun
Security configuration
Configure the TLS certificate. If you do not have a TLS certificate, you can generate one with the following commands
yum install -y openssl
domain="pipelines.kubeflow.org"
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout kubeflow-aliyun/overlays/ack-auto-clouddisk/tls.key \
  -out kubeflow-aliyun/overlays/ack-auto-clouddisk/tls.crt \
  -subj "/CN=$domain/O=$domain"
If you have a TLS certificate, please save the private key and certificate to kubeflow-aliyun/overlays/ack-auto-clouddisk/tls.key and kubeflow-aliyun/overlays/ack-auto-clouddisk/tls.crt respectively
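As a sanity check, you can generate the certificate and inspect its subject with openssl. This is a self-contained sketch using a temporary directory as a stand-in; the deployment itself expects the files under kubeflow-aliyun/overlays/ack-auto-clouddisk/ as described above.

```shell
# Sketch: generate a self-signed cert into a temp dir (stand-in paths)
# and confirm the CN matches the domain.
dir=$(mktemp -d)
domain="pipelines.kubeflow.org"
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout "$dir/tls.key" -out "$dir/tls.crt" \
  -subj "/CN=$domain/O=$domain" 2>/dev/null
# Print the subject so you can verify the domain was embedded.
openssl x509 -in "$dir/tls.crt" -noout -subject
```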
Configure the login password for admin
yum install -y httpd-tools
htpasswd -c kubeflow-aliyun/overlays/ack-auto-clouddisk/auth admin
New password:
Re-type new password:
Adding password for user admin
First, use kustomize to generate deployment yaml
cd kubeflow-aliyun/
kustomize build overlays/ack-auto-clouddisk > /tmp/ack-auto-clouddisk.yaml
Check the region and availability zone of your Kubernetes cluster's nodes, and replace the zone settings accordingly. Assuming your cluster is located in cn-hangzhou-g, you can run the following commands
sed -i.bak 's/regionid: cn-beijing/regionid: cn-hangzhou/g' \
  /tmp/ack-auto-clouddisk.yaml
sed -i.bak 's/zoneid: cn-beijing-e/zoneid: cn-hangzhou-g/g' \
  /tmp/ack-auto-clouddisk.yaml
It is recommended that you check that /tmp/ack-auto-clouddisk.yaml was modified as expected.
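One way to perform this check is with grep. The sketch below runs against a small sample snippet so it can be tried anywhere; in practice the target is /tmp/ack-auto-clouddisk.yaml.

```shell
# Stand-in sample file; the real target is /tmp/ack-auto-clouddisk.yaml.
cat > /tmp/sample-clouddisk.yaml <<'EOF'
regionid: cn-beijing
zoneid: cn-beijing-e
EOF
sed -i.bak 's/regionid: cn-beijing/regionid: cn-hangzhou/g' /tmp/sample-clouddisk.yaml
sed -i.bak 's/zoneid: cn-beijing-e/zoneid: cn-hangzhou-g/g' /tmp/sample-clouddisk.yaml
# After a successful rewrite, no cn-beijing references should remain.
if ! grep -q 'cn-beijing' /tmp/sample-clouddisk.yaml; then
  echo "replacement complete"
fi
```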
Replace the container image addresses, changing gcr.io to registry.aliyuncs.com
sed -i.bak 's/gcr.io/registry.aliyuncs.com/g' \
  /tmp/ack-auto-clouddisk.yaml
It is recommended that you check that /tmp/ack-auto-clouddisk.yaml was modified as expected.
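The image rewrite can be checked the same way. This sketch uses a stand-in sample file, and the image names in it are illustrative assumptions, not necessarily the exact images in the generated yaml.

```shell
# Stand-in sample; image names here are assumed for illustration.
cat > /tmp/sample-images.yaml <<'EOF'
image: gcr.io/ml-pipeline/api-server
image: gcr.io/ml-pipeline/frontend
EOF
sed -i.bak 's/gcr.io/registry.aliyuncs.com/g' /tmp/sample-images.yaml
# Expect zero remaining gcr.io references after the rewrite.
if ! grep -q 'gcr\.io' /tmp/sample-images.yaml; then
  echo "no gcr.io references remain"
fi
```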
Adjust the amount of disk space to allocate; for example, to change the disk space to 200Gi
sed -i.bak 's/storage: 100Gi/storage: 200Gi/g' \
  /tmp/ack-auto-clouddisk.yaml
Validate the pipelines yaml file
kubectl create --validate=true --dry-run=true -f /tmp/ack-auto-clouddisk.yaml
Deploy pipelines with kubectl
kubectl create -f /tmp/ack-auto-clouddisk.yaml
To see how to access pipelines: we expose the pipelines service through an ingress. In this example, the ingress IP is 112.124.193.271, so the link to the Pipelines management console is https://112.124.193.271/pipeline/
kubectl get ing -n kubeflow
NAME             HOSTS   ADDRESS           PORTS     AGE
ml-pipeline-ui   *       112.124.193.271   80, 443   11m
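If you want to script the URL construction, the ADDRESS column can be extracted from that output with awk. In this sketch, a captured sample stands in for the live kubectl command so it can be tried anywhere.

```shell
# Captured sample output stands in for: kubectl get ing -n kubeflow
ing_output='NAME             HOSTS   ADDRESS           PORTS     AGE
ml-pipeline-ui   *       112.124.193.271   80, 443   11m'
# Pick the ADDRESS field (third column) of the ml-pipeline-ui row.
ip=$(echo "$ing_output" | awk '/^ml-pipeline-ui/ {print $3}')
echo "https://$ip/pipeline/"   # prints https://112.124.193.271/pipeline/
```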
Access the pipelines Management console
If you use a self-signed certificate, the browser will warn that the connection is not private; click to show the details, then click to visit the website anyway. Enter the user name admin and the password you set in the "Configure the login password for admin" step.
At this point, you can use pipelines to manage and run training tasks.
Q&A
Why is Aliyun's SSD cloud disk used here?
This is because Aliyun's SSD cloud disk can be automatically backed up on a regular basis to ensure that the metadata in pipelines will not be lost.
How to perform cloud disk backup?
If you want to back up the contents of a cloud disk, you can manually create a snapshot of it, or set an automatic snapshot policy so that snapshots are created on a schedule.
How do I clean up my Kubeflow Pipelines deployment?
The cleaning up here is divided into two parts:
Remove components of Kubeflow Pipelines
kubectl delete -f /tmp/ack-auto-clouddisk.yaml
Release the two cloud disks that back the mysql and minio storage, respectively, by releasing the cloud disks in the console
This is the end of the introduction to "how to build a machine learning system on Kubernetes". Thank you for reading.