In this issue, the editor will walk you through a Docker-based Serverless architecture: how to understand the implementation and practice of UCloud's general computing product. The article is rich in content and analyzes the topic from a professional point of view. I hope you get something out of it after reading.
I. The evolution of computing
UCloud is a service provider that does IaaS (Infrastructure as a Service). IaaS essentially has three pieces: computing virtualization, network virtualization, and storage virtualization.
Computing virtualization itself is mainly delivered through the virtual machine product UHost. Later, for problems that virtual machines cannot solve well, such as high-performance databases and other scenarios demanding strong computing power, UPHost was introduced so that physical machines can be provisioned from the console. Later still, for more traditional industries, a hosted cloud offering was launched in which users place their own machines in UCloud racks; once the network, the virtual machines and the physical machines are connected, the result is the so-called hybrid cloud architecture.
What is the next computing product?
Cloud computing makes it possible to obtain computing, network and storage resources quickly over the network. Cloud computing itself has developed into different forms: on top of IaaS, products such as PaaS (Platform as a Service), BaaS (Backend as a Service) and FaaS (Function as a Service) have emerged.
For users, PaaS essentially takes over the ops and deployment work and provides a hosted environment for code, but with PaaS you still write your own code. With BaaS the service is already provided and can be called directly. However, when many requests arrive at once, a PaaS application cannot execute them all at the same time, whereas FaaS spins up function executions according to the number of requests and destroys them afterwards, so multiple requests can be executed concurrently.
The concept of FaaS became popular this year after a June blog post about Serverless on martinfowler.com. When AWS launched Lambda, it described an application as data + functions + events. Lambda supports Python, Java and Node.js: you write a function in the required form and configure, on the Lambda console, which events should trigger it. This is the concrete form of FaaS.
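To make the FaaS model concrete, here is a minimal Python sketch of an event-driven function handler; the event shape and field names are hypothetical and do not follow any particular provider's schema.

```python
# A minimal FaaS-style handler (illustrative only): the platform invokes it
# once per event and takes care of scaling and server management itself.
def handler(event, context):
    results = []
    for record in event.get("records", []):
        # Pretend each event record carries a "key" whose value we process.
        results.append(record.get("key", "").upper())
    return {"processed": results}

# Example invocation, roughly as a platform might perform it:
print(handler({"records": [{"key": "hello"}]}, context=None))
```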
Serverless has similarities and differences with the three forms above. In a traditional architecture the business logic lives on a server you run yourself; with Serverless there is no server of your own. Authentication, the database and the computation are all placed on services provided by others, and your client only needs to compose these services according to the business logic.
To put it simply, the vendor providing these services runs the server for you; you just string them together with client-side logic, which lowers the threshold of writing your own server.
A large number of physical machines are purchased for computing virtualization, yet much of their capacity sits idle. The core of a virtualization product is managing physical machine resources: when a user requests a virtual machine, how do you know which physical machine it should be placed on? You can place it randomly, or in proportion to the load weight of each physical machine, but the actual running efficiency of every user's virtual machine differs, so over time some physical machines end up heavily loaded and others lightly loaded, leaving a large amount of spare resources.
Is there room to improve this algorithm? How can the large amount of spare CPU and memory across tens of thousands of servers be squeezed out?
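As a rough illustration of the placement question raised above, the following Python sketch picks a physical machine with probability proportional to its free capacity. The host fields and the weighting are hypothetical and are not UCloud's actual scheduling algorithm.

```python
import random

def pick_host(hosts, vm_cpu, vm_mem):
    """Choose a physical machine for a new VM, weighted by free capacity."""
    candidates = [h for h in hosts
                  if h["free_cpu"] >= vm_cpu and h["free_mem"] >= vm_mem]
    if not candidates:
        raise RuntimeError("no physical machine can fit this VM")
    # More free CPU + memory -> proportionally higher chance of selection.
    weights = [h["free_cpu"] + h["free_mem"] for h in candidates]
    return random.choices(candidates, weights=weights, k=1)[0]

hosts = [
    {"name": "pm-1", "free_cpu": 8,  "free_mem": 32},
    {"name": "pm-2", "free_cpu": 40, "free_mem": 128},
]
print(pick_host(hosts, vm_cpu=4, vm_mem=16)["name"])
```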
Demands:
The business runs on virtual machines, and every scale-out requires applying to the resource department for resources.
The operations team carries a heavy deployment and maintenance workload.
How can the spare resources be offered in a more user-friendly way?
II. The implementation scheme of general computing
Option 1: use virtualization. Create a very large virtual machine on each physical machine, dedicated to this purpose, so that the remaining resources are occupied rather than wasted.
Problems with Virtualization:
Good isolation, but the scheduling granularity is too large.
UCloud's virtual machine IO performance is better than the industry average, but that IO is backed by local disks rather than network disks, which puts high demands on live migration: for virtual machines backed by network disks, live migration only needs to move the memory and the device state, whereas local-disk-based live migration must also move all the data on the block devices. If the disk is large, migration becomes slow, so these virtual machines are hard to schedule, or scheduling takes a long time.
Resource-centric vs. service-centric
Applying for virtual machines to deploy a service is essentially resource-centric: you know the virtual machine's IP, deploy onto it, monitor it, and so on.
Customers have to concern themselves with the deployment and monitoring of the VMs.
Option 2: consider using Docker
Benefits:
Finer granularity
Lightweight scheduling
Easy to deploy
Computing resources are packaged as services, giving a service-centric perspective: once a program is packaged into a Docker image, the unit scheduled is the service (the container), not the resource.
Cross-language: anything packaged according to the Docker standard becomes universal and portable, with no restriction on the implementation language.
Option 3: combine the two, drawing on the benefits of both virtualization and Docker
Problem: Docker's isolation is weaker. If external services are exposed at the granularity of a bare Docker container, security suffers: when something goes wrong it may damage the running environment.
Solution: adopt a vm + Docker model: package the algorithm into a Docker image and execute the algorithm through a RESTful interface.
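To make the vm + Docker model concrete, here is a minimal sketch, using only the Python standard library, of an algorithm wrapped behind a RESTful endpoint inside a container; the endpoint and payload shape are hypothetical, not the actual UCloud wrapper.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_algorithm(payload):
    # Placeholder for the packaged algorithm.
    return {"result": payload.get("value", 0) * 2}

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, run the algorithm, return JSON.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(run_algorithm(payload)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
```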
Product architecture diagram:
ULB4 is a private network load balancer, which can achieve disaster recovery across data centers.
OpenResty is used for gray (canary) releases: when the program has an update, it decides where to direct the traffic.
The dotted box is the complete logic of the product. Every set is equal and is an independent service; if one set fails, the other services are not affected, and important features can be split out into sets of their own:
The API layer is responsible for providing services to the outside world.
TaskManager is responsible for dispatching tasks.
An Executor is deployed on each virtual machine to run the Docker containers (see the sketch after this list).
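As referenced in the list above, here is a minimal sketch of what an Executor-style component might do: pull the task's image and run it as a throwaway container, feeding the task input on standard input. The helper name and the use of the Docker CLI are illustrative assumptions; UCloud's real Executor is not published.

```python
import subprocess

def execute_task(image, task_input, timeout=60):
    """Run one computing task in a throwaway container (illustrative)."""
    subprocess.run(["docker", "pull", image], check=True)
    proc = subprocess.run(
        ["docker", "run", "--rm", "-i", image],  # -i: pass stdin to the container
        input=task_input,        # task input bytes go to standard input
        capture_output=True,
        timeout=timeout,
    )
    proc.check_returncode()
    return proc.stdout           # the algorithm's stdout becomes the task result
```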
Interpretation of product terms
Docker image: the packaged algorithm of a computing task. Docker images are pushed to GeneralCompute's image repository so that computing tasks can be submitted against them later.
Task: submitted through the HTTP API provided by the computing service. The task content includes the Docker image path and the algorithm input; the task output is returned by the API. Tasks are divided into synchronous and asynchronous tasks:
1) Synchronous tasks
Used for real-time computing: the client submits the task and waits for the API to return the task result.
2) Asynchronous tasks
Used for offline, time-consuming computing: after the client submits the task, the API returns a task id, and the client queries the task status and result by that id.
How the service is used:
The user packages the service into a Docker image and uploads it to the Docker repository, then invokes a task through the HTTP API; the platform pulls up a Docker container to complete the computation. The containers run in the VM cluster, and placement and scheduling are handled by TaskManager.
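A hedged sketch of what task submission over such an HTTP API might look like from the client side follows; the base URL, endpoint paths and field names are hypothetical placeholders rather than UCloud's published API.

```python
import requests  # third-party HTTP client

BASE = "https://general-compute.example.com"  # hypothetical endpoint

def submit_sync(image_path, algorithm_input):
    # Synchronous task: the call blocks until the result comes back.
    r = requests.post(f"{BASE}/tasks/sync",
                      json={"image": image_path, "input": algorithm_input})
    r.raise_for_status()
    return r.json()["output"]

def submit_async(image_path, algorithm_input):
    # Asynchronous task: the API returns a task id immediately.
    r = requests.post(f"{BASE}/tasks/async",
                      json={"image": image_path, "input": algorithm_input})
    r.raise_for_status()
    return r.json()["task_id"]

def query_task(task_id):
    # Poll the task by id for its status and result.
    r = requests.get(f"{BASE}/tasks/{task_id}")
    r.raise_for_status()
    return r.json()
```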
Benefits:
Forget about resources, forget about deployment, and focus on algorithms.
Product advantages:
1) massive computing power
100,000 cores of computing power, with computation distributed automatically and efficiently: through a single set of APIs you can tap into that 100,000-core capacity.
2) computation as the center
Users do not need to care about resource location or environment deployment; they only need to focus on their own algorithms and schedule tasks through the API.
3) flexible and easy to use
Custom algorithms are packaged as Docker images and the computing service is invoked through an API, so it is easy to integrate into an application's business flow. An important problem left unsolved by PaaS is vendor lock-in, and the Docker-based approach largely solves it.
4) pay on demand
Charges are based on the computing resources actually consumed; there is no need to pay for non-computing resources (memory, disk) or to worry about waste. For users this is probably the most attractive point.
5) automatic expansion + high concurrency
You can submit multiple API requests at the same time and run multiple algorithms simultaneously. The algorithms run on different nodes and are fully concurrent, but there is an implicit constraint: tasks must be stateless, which is also what FaaS requires.
However, Docker's advantages and disadvantages are both obvious; rather than chasing the newest technology, users should adopt Docker according to their own needs.
III. Application examples of general computing
1. Image processing practice: UFile image processing
UFile is UCloud's S3-like object storage service. The user provides an image, and UFile provides several image processing algorithms that thumbnail, convert formats, rotate, watermark and so on, then return the processed image to the user.
Demand:
Real-time performance
Cross-language algorithm
Large concurrency: UFile has a large number of users.
Bursty computation.
Implementation:
UFile's image processing algorithms are packaged into Docker images. The computation runs on the data stream in real time: the input stream is redirected to the Docker instance's standard input, the image processing algorithm reads standard input and writes the processed result to standard output, and the management program turns the container's standard output into an HTTP stream returned to the user.
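A minimal sketch of that standard-input/standard-output contract, assuming Pillow is available inside the image; the thumbnail operation here merely stands in for UFile's real algorithms.

```python
import io
import sys

from PIL import Image  # Pillow, assumed to be installed in the image

def main():
    # Read the raw image bytes that the platform pipes to standard input.
    raw = sys.stdin.buffer.read()
    img = Image.open(io.BytesIO(raw)).convert("RGB")
    img.thumbnail((256, 256))          # stand-in for the real processing step
    out = io.BytesIO()
    img.save(out, format="JPEG")
    # Write the processed image to standard output; the management program
    # turns this into an HTTP stream for the user.
    sys.stdout.buffer.write(out.getvalue())

if __name__ == "__main__":
    main()
```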
Conventional implementation: the user uploads the picture to a server through the UFile API; the server inspects the URL, calls the corresponding algorithm to process the picture, and returns the picture to the user.
Serverless implementation: package the algorithm as a Docker image, push the image to the image repository, and send the picture through the API; the API can specify which image (algorithm) to apply and returns the processed picture.
Advantages:
A lot of physical machines are saved.
A lot of manpower is saved: there is no need to write a server, only the image, and no need to worry about where the resources are or how to scale and monitor them.
2. Image processing practice: OCR recognition
OCR recognition here scans exam questions into pictures and recognizes the content of the questions by means of OpenCV and machine learning.
Demand:
16 million pictures every day.
Each picture takes 2-3 seconds to process.
Algorithm optimization relies on special CPU instructions.
Implementation:
OpenCV and the machine learning model are packaged into Docker images; the computing nodes support the special CPU instructions, and tasks execute concurrently.
Conventional implementation: the user sends the picture to a central server, a long-running service whose logic recognizes the image on the physical machine cluster and returns the result; but the threshold of writing such a server is too high for machine learning engineers.
Serverless implementation: after OpenCV and the machine learning model are packaged into a Docker image, there is no server at all. Users simply send the picture to the general computing API, which effectively turns the machine learning model into an online service.
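As a rough sketch of what such a container's entry point might look like, the following reads a picture from standard input, decodes it with OpenCV and emits a JSON result; the recognize() function is a placeholder for the machine learning model, which the article does not describe.

```python
import json
import sys

import cv2          # OpenCV, assumed available in the image
import numpy as np

def recognize(image):
    # Placeholder for the actual machine learning model.
    return "recognized text would go here"

def main():
    raw = sys.stdin.buffer.read()
    image = cv2.imdecode(np.frombuffer(raw, dtype=np.uint8), cv2.IMREAD_COLOR)
    print(json.dumps({"text": recognize(image)}))

if __name__ == "__main__":
    main()
```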
Advantages:
Over a hundred machines are saved.
Algorithm optimization does not need to consider concurrency; it only needs to focus on the processing efficiency of a single image.
The above is the editor's walkthrough of a Docker-based Serverless architecture and of how to understand the implementation and practice of UCloud's general computing product. If you have similar questions, the analysis above may serve as a reference. If you want to learn more, you are welcome to follow the industry information channel.