In the afternoon session of ECUG on December 22, Yuan Xiaopei, Technical Director of Qiniu Cloud Computing Department, gave a talk titled "DCOS Road based on K8S", introducing the current state of the Qiniu container cloud and the product thinking behind it.
He also covered the containerization of Qiniu's public cloud businesses and how a K8S translation team was formed to translate the book "Kubernetes in Action".
The following is a transcript of the speech.
Good afternoon, everyone! I am Yuan Xiaopei, Technical Director of the Qiniu Cloud Container Computing Department. Today I want to share Qiniu Cloud's road to DCOS based on K8S. Drawing on our practical experience, I will talk about what we have done along the way and our thinking at the product level.
I will first cover the containerization of Qiniu Cloud's internal businesses, and then turn to DCOS based on K8S. Qiniu Cloud is building a more powerful K8S distribution, and that shows in three areas. First, the stability of the underlying technology: stability is the user's first concern. Second, extensibility: when the system does not meet a requirement, the question is whether it can be extended through open interfaces. Third, ease of use: K8S is a very complex system, so operating, maintaining and using it must be easy enough that end users can get started quickly. Finally, I will briefly introduce the ecosystem around K8S, including upstream and downstream services and the richness of the application ecosystem.
Containerization inside Qiniu Cloud
Qiniu Cloud began working on containers in 2014. The project was called QCOS, short for "Qiniu Cloud Operating System". At that time we read through the K8S design documents and thought we were capable of building such a system ourselves, so we developed a container cluster scheduling and management system from scratch and worked on it for two years. Looking back at the end of 2016, we found that such a system has a great deal to cover: CPU and memory scheduling (compute scheduling), network management, storage plug-ins, and application-related concerns (logging, monitoring, alerting). It is a very large system, and few companies can build it from scratch. At the time, perhaps only Google could, because it had been running Borg for years, and K8S evolved from Borg's design concepts. So at the end of 2016 we decided to switch fully to K8S and be 100% compatible with it.
It is now the end of 2018, and over the past two years we have been containerizing the five major businesses within Qiniu Cloud. The first is the test system: Qiniu Cloud's testing is now fully containerized. The second is the multimedia transcoding system, which is also fully containerized. The third is Qiniu Cloud's AI business. AI uses a large number of GPU machines and needs to train on large data sets, so we built a machine learning platform on Kubernetes and added many extended features to it. The fourth is big data: the Spark service is available in our container application market as an application that users can deploy quickly.
The fifth, which we are working on intensively this quarter, is moving Qiniu Cloud's core object storage business to the container cloud. It has passed preliminary verification and is now being tested. So far, several major business lines of Qiniu Cloud have a large number of applications running on containers. Since the second half of 2018 we have also been offering container products to external customers, drawing on Qiniu Cloud's five years of containerization experience to turn our technologies, ideas and features into products.
What does containerization bring?
What exactly does containerization bring to our business?
First, employees are happy, because delivery and deployment have become much more efficient. Previously, getting a line of code from commit through testing to production could take days or even weeks, and after release a rollback could be unreliable. With container technology, the whole process can follow CI/CD and DevOps practices. After a line of code is committed, the change is built into an image and unit tests run automatically. After the unit tests come static code checks and any custom scripts, then integration tests, and finally continuous deployment (CD) to production. The release cycle shrinks from days or weeks to minutes or even seconds. The simplest change is that employees are happier: development, testing and operations staff can leave work early instead of waiting for the low-traffic window at 4:00 in the morning to release.
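The stage ordering described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Qiniu's pipeline: the `run_pipeline` function and the stage names are hypothetical, and each stage is a stub that simply reports success.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, step in stages:
        if not step():
            break  # a failed stage blocks everything after it
        completed.append(name)
    return completed


# Hypothetical stages in the order given in the talk; all are stubs.
stages = [
    ("build image", lambda: True),
    ("unit tests", lambda: True),
    ("static check", lambda: True),
    ("integration tests", lambda: True),
    ("deploy", lambda: True),
]
completed_stages = run_pipeline(stages)
```

The point of the design is that a commit reaches production only if every earlier stage passes, which is what makes releases in minutes safe.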
Second, customers are happy, because operations have become much more efficient. A container platform provides container monitoring by default: system-level CPU and memory monitoring, ingress-level monitoring and alerts, and even automatic log collection. Once a well-written back-end application is running, the platform gives it basic log monitoring. With some adaptation on the business side, business-level metrics can be collected too. All of this feeds an end-to-end logging, monitoring and alerting mechanism. When something goes wrong in production, the monitoring, logs and alerts greatly shorten the time from detecting an error to resolving it, which lowers MTTR and improves application availability. As availability improves, customers are affected less and less, so in the end they are happier.
Third, the boss is happy, because machine utilization has improved greatly. In a data center, a single K8S cluster can manage all physical resources, so that all businesses are co-located in one large pool of computing resources. Utilization then rises from less than 20% originally to as much as 60% or 70%. For the company, this reduces IT cost, and no boss will be unhappy about that.
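To make the cost effect concrete, here is a small worked calculation. The workload size and machine size are made-up numbers chosen for illustration; only the 20% and 60% utilization figures come from the talk.

```python
def machines_needed(demand_cores: int, cores_per_machine: int,
                    utilization_percent: int) -> int:
    """Machines required to serve a steady demand at a given average utilization.

    Uses integer arithmetic and rounds up, since a fraction of a machine
    cannot be bought.
    """
    usable_x100 = cores_per_machine * utilization_percent
    return -(-demand_cores * 100 // usable_x100)  # ceiling division


# Hypothetical workload: 1920 busy cores on 32-core machines.
before = machines_needed(1920, 32, 20)  # at 20% utilization
after = machines_needed(1920, 32, 60)   # at 60% utilization
```

Under these assumed numbers the same workload fits on a third of the machines once utilization triples.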
A data center operating system based on K8S
After doing all this and reaping so many benefits, we asked ourselves what we are essentially doing. In essence, we are building a data center operating system. Previously, managing a room full of machines came down to a machine IP plus an SSH port number. Whatever deployment tool was used, it pushed applications to machines, changed configuration and started the applications, always dealing with the machines through IP addresses and SSH ports. With containers, the interface to the data center has changed completely. We use programmatic interfaces to operate and schedule the data center's CPU and memory without caring which machine a workload lands on, and we can even work with logs, monitoring and alerts through APIs. So we are building a data center operating system.
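The shift from "log in to a machine" to "ask for CPU and memory" can be illustrated with a toy placement function. This is a deliberately simplified sketch, not the Kubernetes scheduler: the `Node` and `PodRequest` types, the node names and the "most free CPU" rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_cpu_millicores: int
    free_memory_mib: int


@dataclass
class PodRequest:
    name: str
    cpu_millicores: int
    memory_mib: int


def schedule(pod: PodRequest, nodes: List[Node]) -> Optional[str]:
    """Place the pod on the fitting node with the most free CPU.

    Returns the chosen node name, or None if no node has enough resources.
    """
    candidates = [
        n for n in nodes
        if n.free_cpu_millicores >= pod.cpu_millicores
        and n.free_memory_mib >= pod.memory_mib
    ]
    if not candidates:
        return None
    chosen = max(candidates, key=lambda n: n.free_cpu_millicores)
    # Reserve the resources so later requests see the updated capacity.
    chosen.free_cpu_millicores -= pod.cpu_millicores
    chosen.free_memory_mib -= pod.memory_mib
    return chosen.name


# Hypothetical cluster with two nodes.
cluster = [Node("node-a", 2000, 4096), Node("node-b", 4000, 8192)]
placement = schedule(PodRequest("web", 1000, 1024), cluster)
```

The caller states only what the workload needs; which machine it lands on is the platform's decision.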
K8S is already a data center operating system, and we are working on a more powerful K8S distribution.
What does a more powerful K8S distribution need to have?
We have summed up three points. The first is the stability of the underlying technology, with a commercial company standing behind it to keep the technology stable. The second is rich extension features. The third is ease of use: whether the platform is easy for operators to deploy, scale and upgrade, and whether it is easy for end users to use, both matter a great deal. Beyond its own features, a complete operating system also needs to provide the necessary upstream and downstream services and upper-layer applications.
In terms of the stability of the underlying technology, we iterate every day. Here is what we have done in the last month or two. On the network side, the network component's etcd is deployed separately, so that it and the K8S etcd do not affect each other; using the etcd v3 API as the datastore improved performance about twofold; and with BGP route reflectors in use and full mesh turned off, performance improved greatly.
Here is an example you may not pay much attention to: KubeDNS. The availability of the community default version is only 99.5%, which means it may be unavailable for more than 3 hours a month. We made a series of improvements that raised KubeDNS availability from 99.5% to 99.999%, or roughly 25 seconds of unavailability per month at most. In fact, the unavailable time over the past three months has been zero.
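The availability figures translate into downtime with simple arithmetic. The sketch below assumes a 30-day month, which is an assumption of this example rather than something stated in the talk.

```python
def monthly_downtime_seconds(availability_percent: float, days: int = 30) -> float:
    """Maximum downtime per month allowed by an availability percentage."""
    total_seconds = days * 24 * 3600
    return total_seconds * (100.0 - availability_percent) / 100.0


# 99.5% allows about 3.6 hours per 30-day month.
downtime_995_hours = monthly_downtime_seconds(99.5) / 3600
# 99.999% allows about 26 seconds per 30-day month.
downtime_five_nines_seconds = monthly_downtime_seconds(99.999)
```

On a 30-day month the five-nines budget comes to about 26 seconds, in line with the figure quoted in the talk to within rounding.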
Some people may ask why Qiniu Cloud takes such a small thing so seriously and puts so much work into it. The container cloud team optimizes every component with this mindset, because every request a user sends to Qiniu Cloud and every file stored here is an act of trust in us. We have committed to users that we will bring the system as close to 100% availability as possible.
In terms of extensions, at the system level we have optimized monitoring and scheduling for Nvidia GPUs, with a custom scheduler to ensure GPU scheduling performance. We developed a K8S-integrated Alluxio storage plug-in, so that AI training can use Alluxio to cache massive numbers of files. We also launched a new Kubernetes SIG effort on static local disks, where many people can contribute code together to get static local disk provisioning right. On the network side, SDN is implemented on VLAN/VXLAN and supports layer 2 network isolation.
This is an automatic log collection scheme based on CLI. The motivation is that a container often needs to write logs to multiple directories, while the container convention is to log only to standard output and standard error, so logs spread across multiple directories are hard to collect. CLI here stands for Container Logging Interface. It lets the whole solution connect to any existing log solution: it can connect to Qiniu Cloud's Pandora, and it also supports ELK and Splunk.
This is how the log collection scheme is used.
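As a rough illustration of collecting logs from several directories and handing them to a pluggable backend, consider the sketch below. It is not the Container Logging Interface itself, whose API is not described in the talk: `collect_logs`, the `sink` callback and the directory layout are hypothetical, and the sink here stands in for a backend such as Pandora, ELK or Splunk.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def collect_logs(directories: Iterable[Path],
                 sink: Callable[[str, str], None]) -> int:
    """Forward every line of every *.log file in the directories to the sink.

    The sink receives (source_path, line). Returns the number of lines sent.
    """
    count = 0
    for directory in directories:
        for log_file in sorted(directory.glob("*.log")):
            with log_file.open(encoding="utf-8") as handle:
                for line in handle:
                    sink(str(log_file), line.rstrip("\n"))
                    count += 1
    return count


# Demo with two hypothetical log directories inside a temporary folder.
collected = []
with tempfile.TemporaryDirectory() as root:
    app_dir = Path(root) / "app"
    access_dir = Path(root) / "access"
    app_dir.mkdir()
    access_dir.mkdir()
    (app_dir / "service.log").write_text("started\nready\n", encoding="utf-8")
    (access_dir / "http.log").write_text("GET /health 200\n", encoding="utf-8")
    forwarded = collect_logs(
        [app_dir, access_dir],
        lambda source, line: collected.append((Path(source).name, line)),
    )
```

Keeping the backend behind a single callback is what lets the same collector feed different log systems.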
As a container platform or data center operating system, the key point is ease of use.
Operating and maintaining the platform is critical. Many container products focus on whether the product is easy for users, but operating the container platform itself is actually harder, because K8S manages the entire data center. The larger the business, the more important the operation and maintenance of the K8S platform becomes.
The first aspect is easy deployment, scaling and upgrading. Our goal is to deploy a new cluster in 5 minutes and add a new node in seconds. K8S is an open source ecosystem, and open source software has many security issues, so being able to upgrade to a new version without affecting the workloads running on the cluster is a key factor.
The second is cluster information visualization, covering monitoring, logging and alerting for hosts, key system components, and L7 and L4 ingress. Cluster visualization has to answer how to find problems at the machine, system component or ingress level from the platform's monitoring information, and how to locate business problems quickly through the platform.
The third is automatic handling of common faults. We have implemented a mechanism that detects faults automatically and resolves them with one click.
This is the platform operation and maintenance management interface.
This is a tool with an automatic operation and maintenance mechanism that can run on every node. Its plug-ins automatically detect known symptoms, report them to the K8S api-server, and take measures to fix some problems automatically.
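The detect, report and repair loop can be sketched as follows. This is an illustration under stated assumptions, not the tool shown in the talk: the detector functions, thresholds, problem names and the `report` callback (which stands in for reporting to the api-server) are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

NodeState = Dict[str, float]
Detector = Callable[[NodeState], Optional[str]]
Remediation = Callable[[NodeState], None]


def disk_pressure(state: NodeState) -> Optional[str]:
    # Hypothetical threshold: more than 90% of the disk in use.
    return "DiskPressure" if state.get("disk_used_percent", 0) > 90 else None


def runtime_unstable(state: NodeState) -> Optional[str]:
    # Hypothetical threshold: three or more runtime restarts in an hour.
    return "RuntimeUnstable" if state.get("runtime_restarts_last_hour", 0) >= 3 else None


def clean_image_cache(state: NodeState) -> None:
    # Stand-in for a real cleanup; here it only updates the simulated state.
    state["disk_used_percent"] = 60


def check_node(state: NodeState, detectors: List[Detector],
               remediations: Dict[str, Remediation],
               report: Callable[[str], None]) -> List[str]:
    """Run every detector, report each problem, and fix those with a known remedy.

    Returns the names of the problems that were remediated.
    """
    fixed = []
    for detect in detectors:
        problem = detect(state)
        if problem is None:
            continue
        report(problem)
        fix = remediations.get(problem)
        if fix is not None:
            fix(state)
            fixed.append(problem)
    return fixed


# Simulated node with two known symptoms, only one of which has a remedy.
node_state = {"disk_used_percent": 95, "runtime_restarts_last_hour": 4}
reported = []
remediated = check_node(
    node_state,
    [disk_pressure, runtime_unstable],
    {"DiskPressure": clean_image_cache},
    reported.append,
)
```

Problems without a registered remedy are still reported, so an operator can handle them manually.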
For ease of use, we organize everything around the project and provide strong user management and access control. Traditionally, a project comes first as a virtual entity, then people, then machines and software. We manage projects in exactly this way, adding K8S resource spaces to a project as a set of resources. A project may include project managers, operations, testing and development staff, and each role has different permissions.
Next is powerful application orchestration. Many K8S platforms sacrifice compatibility in pursuit of easier orchestration. Once users are familiar enough with the platform and with K8S, they may require the platform to be fully compatible with K8S. That means resources created through the K8S API must show up in the platform, and resources created and orchestrated through the platform must be operable through the K8S API, in both directions. So our product is 100% compatible with native K8S and with kubectl.
The last is powerful image management: synchronization of overseas images, image acceleration and private image hosting; C2I, which builds deliverable images from code with unit tests and static checks; and image scanning, which checks images with tools to see whether they contain security risks.
These are the points that users need to consider in terms of ease of use.
This is the interface users see: application orchestration, the application list, and application service orchestration.
A strong data center operating system also needs a reasonably complete surrounding ecosystem. Continuous deployment can be connected to Kubernetes through Helm, and Istio is shown in the upper right corner. Istio-based traffic management features such as gray release, circuit breaking and distributed tracing help us find problems quickly.
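The idea behind gray release is to send a small, controlled share of traffic to a new version. The sketch below shows that idea in plain Python; it is not how Istio implements routing, and the `route` function, the request ids and the hashing scheme are assumptions made for the example.

```python
import hashlib


def route(request_id: str, canary_percent: int) -> str:
    """Send a fixed share of requests to the canary version.

    Hashing the request id makes the decision deterministic for each id.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return "canary" if bucket < canary_percent else "stable"


# Hypothetical request ids, used only to exercise the function.
request_ids = [f"req-{i}" for i in range(1000)]
canary_hits = sum(route(r, 10) == "canary" for r in request_ids)
```

Raising `canary_percent` step by step widens the rollout, and setting it back to 0 returns all traffic to the stable version.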
Qiniu Cloud implements common database and middleware services on K8S, including MySQL, MongoDB, Redis and RabbitMQ. These are essentially highly available services implemented with Operators. They support one-click deployment and regular backup and recovery, and ensure data reliability. In essence, this moves public cloud RDS-style services onto K8S, so that users outside the public cloud can also use high-quality database services, which greatly reduces the DBA workload.
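An Operator works by comparing the desired state with the observed state and acting on the difference. The sketch below shows that reconcile step in isolation. It is a simplified illustration, not Qiniu's Operators: the `DatabaseSpec` and `DatabaseStatus` types and the action strings are hypothetical, and a real Operator would call the Kubernetes API instead of returning text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatabaseSpec:
    """Desired state declared by the user."""
    replicas: int
    backup_enabled: bool


@dataclass
class DatabaseStatus:
    """State observed in the cluster."""
    running_replicas: int
    backup_job_present: bool


def reconcile(spec: DatabaseSpec, status: DatabaseStatus) -> List[str]:
    """Return the actions needed to move the observed state to the desired one."""
    actions = []
    if status.running_replicas < spec.replicas:
        actions.append(f"start {spec.replicas - status.running_replicas} replica(s)")
    elif status.running_replicas > spec.replicas:
        actions.append(f"stop {status.running_replicas - spec.replicas} replica(s)")
    if spec.backup_enabled and not status.backup_job_present:
        actions.append("create backup job")
    return actions


# Hypothetical case: one of three replicas is down and no backup job exists.
pending_actions = reconcile(DatabaseSpec(3, True), DatabaseStatus(2, False))
```

Running this comparison repeatedly is what lets the service recover on its own after a replica fails.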
This is Qiniu Cloud's current global ranking in the K8S community: we are ranked 26th. Some people may ask how many of our people work on open source. My answer is that none of us does open source full-time, nor do we treat it as a deliberate goal. We contribute in the course of keeping K8S stable, extending its features and improving product usability. Because we benefit from open source, we try to contribute these improvements and new features directly back. But there are three things we do not do: submit changes that only fix spelling mistakes, only add unit tests, or only change comments.
This is a book translated by the Qiniu container cloud team together with some friends within Qiniu Cloud. The title is "Kubernetes in Action" (Chinese edition purchase link: https://item.m.jd.com/product/12510666.html). It mainly teaches how to deploy distributed container applications on Kubernetes. The author, Marko Luksa, is a Red Hat OpenShift engineer. The preface was written by Qiniu Cloud CEO Lao Xu himself. Drawing on practical experience, the team tried to make the translation as accurate and as easy to understand as possible.
Kubernetes is a Greek word meaning helmsman, the one who steers a ship to the right place. I hope this book, like a helmsman, will be of some help as you learn K8S.
(Marko Luksa presenting live at ECUG Con 2018)