2025-04-04 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains the difference between Borg and Kubernetes. The content is simple and clear, and I hope it answers your questions.
Hello, everyone. I am Zhong Cheng from Huawei's PaaS department, where I currently work on related product research and development. My topic today is "From Borg to Kubernetes"; Borg is in fact the predecessor of Kubernetes. I will cover three things: first, an introduction to Borg; second, what Kubernetes changed relative to Borg and where it is heading; and third, what kind of product or shape the cloud may need in the future.
What is Borg? What problem does it solve?
Let's start with the first topic: what is Borg, and what problem does it solve?
Take a look at this picture. It is from Star Trek, which I'm sure most of you have seen. The Borg are an alien race in it, the villains. What do they do? They make contact with other civilizations, seize them, and assimilate them: they turn you into a half-human, half-machine creature and you become part of their civilization. Then they keep expanding across the universe. I think this is a very cool race. Google named its large-scale distributed cluster management system after them, hoping that the system could likewise assimilate all kinds of machines into its own fleet and then run its own programs on them.
For Google, Borg is the top-level cluster management system. Most of Google's applications and frameworks run on it, including user-facing services such as Gmail, Google Docs and Web Search, and also underlying frameworks such as MapReduce and the GFS distributed storage system. In other words, you can assume that every application relies on it to manage the underlying physical machines. Borg has been used successfully at Google for more than a decade.
Here is the overall architecture of Borg. It is a typical distributed platform architecture: a logical master, with many nodes below it, each running an agent. Consider how an engineer inside Google uses the system. They use Borgcfg (a command-line tool) or the Web UI to submit the application they want to run (a Task) to the system. A Task can be anything: an application, a batch job, or a MapReduce job. The Task is submitted to BorgMaster through Borgcfg. BorgMaster accepts the request and puts it in a queue. The Scheduler scans the queue, looks at the resource requirements of each application, and searches the cluster for matching machines. When you submit a Task you state how many resources it needs, what kind of machine it needs, and how long it will run; the Scheduler matches that against the machines that are currently idle, then assigns the Task to a machine, where it starts running.
This is the overall framework of Borg. From the moment a user submits an application to the moment it starts running, a typical startup time is 25 seconds. About 80% of that, roughly 20 seconds, is spent downloading the application package to each node, so the scheduling itself takes less than 5 seconds, which is very fast.
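The submit, queue, scan and match flow described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the `Machine` and `Task` classes, the `schedule` function and the first-fit placement rule are hypothetical names and choices made for this example, not Borg's actual scheduler, which also scores machines and handles priorities.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Machine:
    name: str
    free_cpu: float   # cores still unallocated
    free_mem: float   # GiB still unallocated
    tasks: list = field(default_factory=list)


@dataclass
class Task:
    name: str
    cpu: float        # cores requested by the user
    mem: float        # GiB requested by the user


def schedule(pending, machines):
    """Scan the pending queue once and place each task on the first machine that fits.

    Returns the tasks that could not be placed, still in queue order.
    """
    unplaced = deque()
    while pending:
        task = pending.popleft()
        for m in machines:
            if m.free_cpu >= task.cpu and m.free_mem >= task.mem:
                m.free_cpu -= task.cpu
                m.free_mem -= task.mem
                m.tasks.append(task.name)
                break
        else:
            unplaced.append(task)  # no machine fits right now; keep it queued
    return unplaced


machines = [Machine("m1", free_cpu=4, free_mem=8),
            Machine("m2", free_cpu=16, free_mem=64)]
pending = deque([Task("web", cpu=2, mem=4),
                 Task("mapreduce", cpu=8, mem=32),
                 Task("huge", cpu=64, mem=256)])
left = schedule(pending, machines)
```

In this toy run, "web" fits on m1, "mapreduce" only fits on m2, and "huge" stays in the queue because no machine has enough free resources.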
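As a quick check of the arithmetic behind those figures, the split can be worked out directly. The variable names are made up for this example; the 25 seconds and the 80% share are the numbers quoted in the talk.

```python
total_startup_s = 25      # typical submit-to-running time quoted in the talk
download_share = 0.80     # share of that time spent downloading the package

download_s = total_startup_s * download_share
remaining_s = total_startup_s - download_s   # scheduling and everything else

print(f"package download: {download_s:.0f} s, "
      f"scheduling and the rest: {remaining_s:.0f} s")
```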
On the scheduling principle of BorgMaster
I will come back to the details later. One of the key points I want to cover today is BorgMaster's scheduling. It has a great many applications to place, so how does it rank them by priority, and how does it finally decide which machine runs which application? Borg's approach starts with estimating the resources of a single Task. You can see several lines in this figure. The outermost dotted line is the resource quota submitted by the user. It is a hard limit: however the Task behaves, it cannot use more than this, and if it exceeds the limit Borg restricts it so that it cannot keep running that way. But this is only the number the user submitted, and we all know that number is often inaccurate, because you cannot predict how much CPU and memory your program will really consume. So how does Borg estimate the real requirement? 300 seconds after the Task starts, it begins reclaiming resources. The yellow area in the middle is what the application actually consumes. Borg slowly pushes the line in from the outer limit toward that usage and stops at the edge of the green zone. That line is the so-called reservation: the amount Borg believes your application needs to run stably over the long term.
Here's the question: why would Borg do that? The reason is to free up the resources in the rest of the area. If I know how much this application actually uses, then once I have drawn a safety line around it, I can schedule the remaining resources, between that line and the user's limit, for other applications.
The green zone is made up of the yellow block plus a safety margin, and Borg recalculates how much the application consumes every few seconds. So this is a dynamic process: the line is not fixed once it has been drawn, and the green zone can grow back out as far as the dotted line. That is the strategy for a single Task.
Next, Borg distinguishes between the kinds of applications running on the system, based on what they are and what characteristics they have. One kind is the so-called production application, or prod task. It never stops: it is a long-running, user-facing process such as Gmail or Web Search that cannot be interrupted, with response times from a few microseconds to a few hundred milliseconds. Such tasks must be given priority so that they keep running, and they are sensitive to short-term performance fluctuations.
The other kind is the so-called non-prod task. This is a batch task, such as a MapReduce job, that does not face users directly and is not very sensitive to performance. It runs to completion and then the next one starts; it is not a long-running process.
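The reclamation idea described here (user limit, actual usage, and a reservation that moves between them) can be sketched as a small update rule. This is a minimal sketch under assumptions: the function name `next_reservation` and the `margin` and `decay` values are hypothetical tuning knobs invented for illustration; only the 300-second warm-up and the "usage plus a safety zone, never above the limit" shape come from the talk.

```python
def next_reservation(limit, reservation, usage, age_s,
                     warmup_s=300, margin=0.2, decay=0.1):
    """Return the new reservation for one task.

    limit       -- hard cap requested by the user (the dotted line)
    reservation -- current reservation (the green line)
    usage       -- resources actually consumed right now (the yellow area)
    age_s       -- seconds since the task started
    margin, decay -- hypothetical tuning knobs, not Borg's real values
    """
    if age_s < warmup_s:
        return limit                      # no reclamation during the first 300 s
    # Usage plus a safety zone, but never more than the user's limit.
    target = min(limit, usage * (1 + margin))
    if target >= reservation:
        return target                     # usage grew: raise the reservation at once
    # Usage shrank: move only part of the way toward the target each step.
    return reservation - decay * (reservation - target)


limit, reservation = 8.0, 8.0             # e.g. 8 cores requested
for age in range(0, 900, 60):             # re-estimate once a minute for 15 minutes
    reservation = next_reservation(limit, reservation, usage=2.0, age_s=age)
reclaimable = limit - reservation         # what could be offered to other tasks
```

The asymmetry is a deliberate design choice in this sketch: the reservation shrinks slowly, so a brief dip in usage does not give away resources the task will need again, but it grows immediately when usage rises.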
Why do you distinguish between tasks?
When a prod task's resource consumption spikes, for example when a lot of people suddenly visit a website and that site's CPU and memory usage shoots up, the machine may run short of resources. Borg then kills the non-prod tasks on that machine and lets them run on other machines. When the machine is idle again, the batch tasks can be scheduled back. In this way the resources of every machine are used as fully as possible at every point in time. Google's results show that about 20% of the workload can run on resources reclaimed in this way. That figure is very large: with as many machines as Google has, saving 20% of resources is a lot of money.
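The eviction behaviour described above can be illustrated with a toy example. Everything here is an assumption made for the sketch: the `Task` class, the `make_room` function, the single CPU dimension and the "evict the largest non-prod task first" order are hypothetical and are not taken from Borg itself.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    cpu: float
    prod: bool   # True for long-running, user-facing services


def make_room(running, capacity, incoming):
    """Evict non-prod tasks until the prod task `incoming` fits on the machine.

    Returns (tasks now on the machine, tasks evicted for rescheduling elsewhere).
    Prod tasks are never evicted in this sketch.
    """
    used = sum(t.cpu for t in running)
    kept, evicted = list(running), []
    # Only a prod task may push others out; largest non-prod victims go first.
    victims = sorted((t for t in running if not t.prod), key=lambda t: -t.cpu)
    for victim in (victims if incoming.prod else []):
        if used + incoming.cpu <= capacity:
            break
        kept.remove(victim)
        evicted.append(victim)
        used -= victim.cpu
    if used + incoming.cpu > capacity:
        return list(running), []          # still does not fit: change nothing
    return kept + [incoming], evicted


running = [Task("gmail", 6, True), Task("batch-a", 4, False), Task("batch-b", 2, False)]
on_machine, evicted = make_room(running, capacity=16, incoming=Task("search", 8, True))
```

Here the machine has 16 cores and 12 are in use, so the 8-core prod task only fits once "batch-a" is evicted; "batch-b" survives because the machine is exactly full after one eviction.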
The value of Borg
Let me briefly summarize the value Borg provides to Google. There are three main points. First, it hides the details of resource management and fault handling, so users can focus on application development. They do not have to care how the underlying system operates; even if their application crashes, Borg restarts it for them. Second, it operates with high reliability and high availability itself, and supports applications in achieving the same. Third, it runs workloads at high resource utilization across tens of thousands of machines.
As for how Borg achieves these three things, Google has published a long paper, "Large-scale cluster management at Google with Borg", which contains a lot of detail that I will not go into today.
Kubernetes architecture
Borg has been very successful inside Google since it was launched, but nobody in the outside community knew exactly how it worked or how it was implemented. Later, the people who built Borg wrote another piece of software, Kubernetes. Broadly speaking, you can think of Kubernetes as an open source version of Borg, although there are some differences, which I will come to. This is the architecture of Kubernetes. You can see it is basically similar to Borg's, including how users interact with it: they submit tasks with a command-line tool, kubectl.
The difference between Kubernetes and Borg
Borg has been running at Google for ten years, at a very large scale: a single cluster has 10,000 machines or more. Kubernetes only came out in 2014. Personally, I think it is aimed at Amazon. Amazon's public cloud is very successful, and Google also wants to enter this field. Its approach is to open source Kubernetes and build influence in the industry so that everyone uses it, which puts Google in a position to compete with Amazon later. That is one of their ideas.
Under the hood, Borg uses lxc-style Linux containers, while Kubernetes uses Docker containers. Borg is written in C++ and Kubernetes in Go. Borg has been heavily optimized for cluster scheduling performance, whereas Kubernetes has not; it is still fairly crude in this respect and a lot of work remains. A single Borg cluster can schedule tens of thousands of machines, while Kubernetes currently supports only a few hundred. These are the figures as of today.
Now let's look at how the users of the two systems differ. Borg's users are Google engineers, who as we all know are among the best in the world. When they write a program they already know it will run in the cloud and that it is distributed, so they design it as a distributed system from the start and optimize it heavily for this platform. Kubernetes wants to do more. Besides running such distributed systems, it first of all supports Docker containers, and it also hopes to support more traditional applications written by less specialized developers. It has done some work in this direction. One part is using Docker containers, which already support a lot of software. Another is that it can mount an external persistence layer: the container reads and writes external distributed storage, so even if the container dies, the data is kept safe. It also provides some monitoring and logging functions, although it is still an open question whether they are sufficient. If I later want to run traditional applications on Kubernetes, I will certainly have to modify those applications and systems somewhat, but at least it is not too difficult.
These are some features of the Kubernetes design. In the Kubernetes network model, every Pod has its own IP address, which is friendlier: application authors do not have to worry about port conflicts. There is also the grouping model, that is, how containers are grouped together. Borg is more of an expert system, with more than 230 configuration parameters, whereas Kubernetes is very simple: three or four description files are about all you need.
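To make the external persistence layer concrete, here is a hedged sketch of a Pod description that mounts externally provisioned storage into a container. It is built as a Python dictionary and printed as JSON, which kubectl accepts alongside YAML. The field names follow the current Kubernetes Pod API; the Pod name, image, mount path and claim name are hypothetical, and the claim is assumed to already exist in the cluster.

```python
import json

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "legacy-app"},                    # hypothetical name
    "spec": {
        "containers": [{
            "name": "app",
            "image": "example.com/legacy-app:1.0",          # hypothetical image
            # Mount the volume declared below inside the container.
            "volumeMounts": [{"name": "data", "mountPath": "/var/lib/app"}],
        }],
        "volumes": [{
            "name": "data",
            # Storage lives outside the container, so it survives a restart.
            "persistentVolumeClaim": {"claimName": "legacy-app-data"},  # hypothetical claim
        }],
    },
}

manifest = json.dumps(pod, indent=2)
print(manifest)
```

The point of the design is that the container stays disposable: if it dies and is rescheduled, the replacement mounts the same claim and finds the data intact.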
Visual Summary of Borg and Kubernetes
Here is my visual summary of Borg and Kubernetes. Borg is like the cockpit of a jet aircraft: very professional and high-end, suited to a big company like Google with millions of machines. Kubernetes is a simplified version, more like a well-designed car, suited to small and medium-sized companies that want to schedule their own clusters.
Looking ahead, more work is planned on the Kubernetes side, including multi-tenant support, container persistence, larger cluster sizes, better utilization, and networking.
That is all for this article on the difference between Borg and Kubernetes. Thank you for reading! I hope it has given you a working understanding of both systems.