This article examines common misunderstandings about containers and the scenarios in which they are genuinely useful. The arguments are simple and practical; let's walk through them one by one.
Myth 1: containers start fast, in seconds
This is a common claim from container advocates: someone launches an application such as nginx, and it comes up almost instantly.
Containers do start fast, for two reasons: they do not boot their own kernel, and their images are relatively small.
However, a container has a main process, the Entrypoint, and the container is only really up once that main process has fully started. A useful metaphor: a container is like clothing on a person. When the person stands, the clothes stand; when the person lies down, the clothes lie down. Clothes give some isolation, but not much; clothes have no kernel of their own, yet they go wherever the person goes.
So does judging container startup speed by nginx make sense? A Java application has to install Tomcat, start Tomcat, load the WAR, and only then is the application really up. Watch the Tomcat log and you will see this takes real time, nothing like seconds. If an application needs a minute or two to start, talking only about the container's second-level startup is meaningless. A quick way to see this is to measure time-to-ready rather than time-to-"up", as in the sketch below.
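A minimal Python sketch, assuming only the standard library, of measuring how long the application inside a container actually takes to accept traffic. The container may report "running" long before this loop succeeds; host and port are whatever your container exposes.

```python
import socket
import time

def wait_until_ready(host: str, port: int, timeout: float = 300.0) -> float:
    """Poll a TCP port until the application inside the container actually
    accepts connections, and return the elapsed seconds. The container
    status may say "up" long before this succeeds."""
    start = time.monotonic()
    deadline = start + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return time.monotonic() - start
        except OSError:
            time.sleep(0.5)  # app (e.g. Tomcat loading a WAR) not ready yet
    raise TimeoutError(f"{host}:{port} not ready within {timeout}s")

# Example: measure real time-to-ready for a freshly started container.
# print(f"app ready after {wait_until_ready('localhost', 8080):.1f}s")
```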
Meanwhile, VM startup in OpenStack keeps getting faster. Starting a VM originally required downloading the image from Glance; with a technique in which Glance and the system disk share Ceph storage, the image no longer needs to be downloaded at all, and startup is much faster.
Part of why containers start so quickly is also that very small images such as alpine are recommended, with much cut away, which makes startup faster still.
OpenStack virtual machine images can be trimmed in the same way. By measuring every step of VM startup and cutting the corresponding modules from the image and the boot process, VM startup time can be reduced dramatically.
For example, a UnitedStack blog post, https://www.ustack.com/blog/build-block-storage-service, describes such an implementation:
"It takes 1 to 3 minutes to create a virtual machine with native OpenStack, but less than 10 seconds with the modified OpenStack. This is because nova-compute no longer needs to download the entire image over HTTP; the virtual machine boots by reading the image data directly from Ceph."
So the overall VM startup time is now well optimized, generally under ten seconds to half a minute. Next to Tomcat's startup time that is no real burden, and next to container startup it is not a qualitative difference. Some will say that faster is still faster, especially for self-healing in production, where every second counts; we will deal with self-healing separately below.
The virtual machine does have one clear advantage: strong isolation. If the container is clothing, the virtual machine is a house. The house stands there whether the occupant is standing or lying down; it does not follow the person around. Using virtual machines is like people living in separate apartments, nobody disturbing anyone else; using containers is like people in their own clothes crammed onto a bus: there is a semblance of separation, but if someone wrecks the bus, nobody goes anywhere.
To sum up, container startup speed is not a meaningful advantage over OpenStack virtual machines, while on isolation the virtual machine beats the container outright.
Myth 2: containers are lightweight, so one host can run hundreds of them
Many people run demos, and even tell customers, how impressive the container platform is: look, one machine runs hundreds of containers, something virtual machines could never do.
But is there a real application scenario that runs hundreds of containers on one machine? What matters in a container is the application inside, and what matters for the application is stability and support for high concurrency, not density.
At many talks and conferences I have met well-known speakers who handle Singles' Day and 618 traffic, and the common report is that Java applications are now standardized on 4 cores and 8 GB. When capacity runs short, a few scale vertically, but most scale horizontally.
If 4C8G is the standard, fewer than twenty services fill a physical server. What is the point of one machine running hundreds of nginx instances? That is not a serious usage scenario.
Of course, there is the popular Serverless architecture, in which all custom code is written and executed as isolated, independent, often fine-grained functions running in a stateless compute service such as AWS Lambda. The underlying compute can be virtual machines or containers. Stateless functions must be created and destroyed quickly, and the function itself may run for only a very short time; in that case containers do hold some advantage over virtual machines.
For now, serverless is better suited to task-style batch jobs, exploiting process-level horizontal elasticity while absorbing the high cost of process creation and destruction.
Spark's Mesos integration has a Fine-Grained mode. Unlike the usual big-data pattern, where resources are requested before task execution begins, this mode allocates resources only when a task is assigned. The advantage is flexible acquisition and release of resources; the disadvantage is that process creation and destruction are still too heavyweight at this granularity, so Spark runs slower in this mode.
Spark's idea here resembles serverless. Recall the operating-systems course: the process was too coarse a unit, and creating and destroying a process per request was too slow for high concurrency, so threads appeared, with lighter creation and destruction. Still too slow, so thread pools appeared: threads are created in advance and borrowed when needed, then handed back rather than destroyed. Even that felt slow, because thread creation still goes through the kernel, so coroutines appeared, with all switching done in user space, as in Akka or Go's goroutines. The trend for high concurrency has been ever finer granularity, yet many scenarios now demand process-level units again, which has a certain irony to it. The sketch below gives a rough sense of the cost gap between per-task threads and a pre-created pool.
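A minimal Python sketch of the granularity argument, assuming nothing beyond the standard library: run the same trivial task with one freshly created thread per task versus a pre-created thread pool. Exact numbers vary by machine, but the pool is typically far faster because thread creation and destruction pass through the kernel every time.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from threading import Thread

def task() -> None:
    pass  # stand-in for a tiny unit of work

N = 2_000

# One thread per task: pays the kernel-level creation cost every time.
start = time.perf_counter()
for _ in range(N):
    t = Thread(target=task)
    t.start()
    t.join()
per_thread = time.perf_counter() - start

# Reuse a fixed pool of pre-created threads: creation cost paid once.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    for f in [pool.submit(task) for _ in range(N)]:
        f.result()
pooled = time.perf_counter() - start

print(f"one thread per task: {per_thread:.2f}s, thread pool: {pooled:.2f}s")
```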
Myth 3: containers have images with version numbers, so they can be upgraded and rolled back
Containers have two key properties: encapsulation and standardization. A container image encapsulates the application's configuration, file paths, and permissions, and then, like Sun Wukong shouting "Freeze!", fixes that moment in place. The image is a standard: run the same image in any container environment and that moment is restored.
Container images also carry version numbers. We can upgrade by version number and, if the upgrade goes wrong, roll back by version number, after which the inside of the container is guaranteed to be back in its original state.
But OpenStack virtual machines also have images, and a VM can be snapshotted. A snapshot preserves the complete state at that moment, snapshots can carry version numbers too, and they can likewise be upgraded and rolled back.
So the OpenStack virtual machine seems to have everything the container has. Where is the difference?
Virtual machine images are big; container images are small. VM images often run to tens or even hundreds of gigabytes, while container images are typically a few hundred megabytes.
Virtual machine images are poorly suited to cross-environment migration. Say development happens locally, the test environment runs on one OpenStack, and production runs on another OpenStack: migrating VM images means copying very large files and is very hard. Containers fare much better, because small images move quickly between environments.
Virtual machine images are also poorly suited to cross-cloud migration. No public cloud platform currently allows VM images to be downloaded or uploaded (for security and anti-piracy reasons), so an image cannot move between clouds, or even between regions of the same cloud; you can only build a new image in each place, and environment consistency is lost. A container image registry, by contrast, is independent of any cloud: any cloud that can reach the registry can pull from it. Images are small, so downloads are fast, and because images are layered, only the differing layers are downloaded each time, as the sketch below illustrates.
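A toy model, not a real registry client, of why layered images transfer cheaply: layers are content-addressed by digest, so a host pulls only the digests it does not already hold. All digest values here are made up for illustration.

```python
# Layers already cached on this host (e.g. shared with other images).
local_layers = {
    "sha256:aaa111",  # base OS layer
    "sha256:bbb222",  # language runtime layer
}

# Layers of the new image version; only the top layer actually changed.
new_image_layers = [
    "sha256:aaa111",  # unchanged base layer
    "sha256:bbb222",  # unchanged runtime layer
    "sha256:ccc333",  # new application layer (the only real change)
]

# Pull only what is missing locally.
to_download = [d for d in new_image_layers if d not in local_layers]
print(f"pulling {len(to_download)} of {len(new_image_layers)} layers: {to_download}")
# -> pulling 1 of 3 layers: ['sha256:ccc333']
```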
OpenStack's image optimizations mostly pay off within a single cloud; the moment multiple environments are involved, container images are far more convenient.
Myth 4: the container platform restarts containers automatically, achieving self-healing
Container self-healing is much advertised. Because the container is the clothes, when the person lies down the clothes lie down too, so the container platform notices at once and can quickly shake the person awake to get back to work. The virtual machine is the house: when the person lies down the house still stands, so the VM management platform cannot tell whether the occupant is working. Hence a dead container is restarted automatically, while an application that dies inside a virtual machine may go unnoticed as long as the VM itself stays up.
All of this is true, but people gradually hit another scenario: the container is not dead and appears to be running, yet the application inside is unresponsive and doing no work. And in the other direction, when a container starts, its state is already "up" while the application inside still needs time before it can serve. For these cases, container platforms provide health checks for the application inside the container, looking not just at whether the container is alive but at whether the application is usable, and restarting it automatically when it is not.
Once health checks are introduced, the difference from virtual machines disappears: with a health check, a virtual machine can equally see whether its application is working and restart it when it is not. The sketch below shows the idea of probing the application rather than the process.
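A hedged sketch of that health-check idea, assuming a hypothetical /health endpoint on the application: the probe asks the application itself rather than checking that a process exists, and triggers a restart after repeated failures. This is essentially the policy a platform liveness probe applies, and nothing about it is specific to containers; the same loop works against an application in a VM.

```python
import time
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:8080/health"  # hypothetical app endpoint

def healthy(timeout: float = 2.0) -> bool:
    """The process being alive is not enough; ask the application itself."""
    try:
        with urllib.request.urlopen(HEALTH_URL, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

def supervise(restart, interval: float = 10.0, max_failures: int = 3) -> None:
    """Restart the app after consecutive failed probes, the same policy a
    platform liveness probe applies, whether the unit is a container or a VM."""
    failures = 0
    while True:
        failures = 0 if healthy() else failures + 1
        if failures >= max_failures:
            restart()      # caller-supplied restart action
            failures = 0
        time.sleep(interval)
```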
A further claim goes: containers start in seconds, so with automatic restart they repair themselves in seconds, making the application more available.
This view is simply wrong. Application availability has no direct relationship to restart speed. High availability must come from multiple replicas: when one instance dies, the answer is not to restart it quickly but to have another instance take over its work for the duration of the outage. Both virtual machines and containers can run multiple replicas, and with replicas in place it hardly matters whether a restart takes one second or twenty. What matters is what the program was doing when it died. If it was doing something unimportant, twenty seconds of downtime is harmless; if it was in the middle of a transaction or a payment, even one second of loss is unacceptable and must be recoverable. Application availability should therefore be solved by retries and idempotency at the application layer, not by how fast the infrastructure layer restarts, as the sketch below illustrates.
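A minimal sketch of application-layer retry plus idempotency, with hypothetical names and an in-memory dict standing in for a durable store: the caller retries across an instance failure, and an idempotency key guarantees the payment is applied at most once even if the original request actually succeeded before the crash.

```python
import time
import uuid

# Hypothetical in-memory ledger; a real system would use a durable store.
_applied: dict[str, str] = {}  # idempotency key -> result

def pay(amount: int, idempotency_key: str) -> str:
    """Apply a payment exactly once per idempotency key."""
    if idempotency_key in _applied:       # duplicate retry: no double charge
        return _applied[idempotency_key]
    result = f"charged {amount}"          # stand-in for the real side effect
    _applied[idempotency_key] = result
    return result

def call_with_retry(fn, *args, attempts: int = 3, backoff: float = 0.5):
    """Retry a call across an instance outage; safe because fn is idempotent."""
    for i in range(attempts):
        try:
            return fn(*args)
        except ConnectionError:
            time.sleep(backoff * 2 ** i)  # exponential backoff before retrying
    raise RuntimeError("service unavailable after retries")

key = str(uuid.uuid4())                   # one key per logical operation
print(call_with_retry(pay, 100, key))
print(call_with_retry(pay, 100, key))     # retried: same result, charged once
```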
For stateless services, with a retry mechanism in place, automatic restart and repair is unproblematic, because a stateless service holds no critical state of its own.
For stateful services, automatic restart is not merely inadvisable; it can be the start of a disaster. Take a stateful service such as a database under high concurrency: if it goes down even for one second, someone must work out what happened in that second, which data was saved and which was lost, before it is restarted. Restart blindly and you are likely to create data inconsistencies that can never be repaired. If the database behind high-frequency trading goes down, the DBA must audit rigorously what data was lost, rather than having the system blindly rebooted behind the DBA's back; if the DBA goes on believing nothing happened, the eventual problem will take a very long time to track down.
Containers are therefore better suited to deploying stateless services, which can be restarted at will.
Deploying stateful services in containers is not impossible, but it demands great care and is often not recommended. Many container platforms support stateful containers, yet the platform cannot solve the data problem for you. Unless you know the application inside the container intimately, so that when the container dies you know exactly what was lost, what matters and what does not, and you have written code to handle those cases, you cannot safely support automatic restart. In NetEase's database master/standby setup, for example, automatic failover on master failure is only dared because the MySQL source code was modified to guarantee complete data synchronization between master and standby.
Promoting automatic restart of stateful containers is a bad bargain when serving customers: customers are rarely that clear about the application's logic, and the application may even be bought off the shelf. If you run their stateful workloads in containers with automatic restart and they later discover lost data, the blame will still land on you.
Automatic restart of stateful services is thus not unusable, but it requires deep expertise.
Myth 5: containers can rely on the container platform for service discovery
Swarm, Kubernetes, and Mesos all support service discovery: when one service calls another, the service name is resolved to a VIP, which routes to a concrete container.
In practice, though, Java-based applications mostly do not use the container platform's service discovery for calls between services; they use Dubbo or Spring Cloud instead. Service discovery at the container-platform layer is still basic, essentially a domain-name mapping, with no good support for circuit breaking, rate limiting, or degradation; and since service discovery is being adopted anyway, people want the discovery middleware to provide those as well. So platform-level discovery is rarely used between services, and the more concurrency an application needs, the truer this is.
Is the platform's service discovery useless, then? No. Discovery between internal services is one thing, and Dubbo and Spring Cloud handle it; discovery of external services is another. When accessing a database or a cache, should the configuration hold a service name or an IP address? With IP addresses, configuration becomes very complex; many applications' configurations are complex precisely because they depend on so many external services, which is the hardest part to manage. With external service discovery, configuration becomes far simpler: only the external service's name is configured, and if the external service's address changes, the discovery layer absorbs the change flexibly, as in the sketch below.
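A minimal sketch of that configuration difference, assuming a hypothetical service name db.internal published by the platform's DNS (for example, a Kubernetes Service): the application configures only the stable name, resolution happens at call time, and a change of backing address requires no configuration edit.

```python
import socket

DB_SERVICE = "db.internal"   # hypothetical service name from config, not an IP
DB_PORT = 5432

def connect_to_db() -> socket.socket:
    # Resolution happens at call time, so if the address behind the
    # name changes, no application configuration needs to be edited.
    addrinfo = socket.getaddrinfo(DB_SERVICE, DB_PORT, proto=socket.IPPROTO_TCP)
    family, socktype, proto, _, sockaddr = addrinfo[0]
    sock = socket.socket(family, socktype, proto)
    sock.connect(sockaddr)
    return sock
```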
Myth 6: containers can auto-scale based on images
On a container platform, if a service has a replica count, changing it from 5 to 10 is enough for the platform to scale out from the image. But virtual machines can do this too: AWS Auto Scaling works from virtual machine images. Within a single cloud, there is no difference.
Across clouds, however, elastic scaling of stateless containers is far more convenient and enables a hybrid-cloud mode: under high concurrency, stateless containers can burst out to the public cloud, which virtual machines cannot do. The scaling decision itself is simple arithmetic, sketched below.
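A hedged sketch of the replica-count decision described above, with made-up numbers: derive a desired replica count from load and clamp it to bounds. The platform then converges the running set toward that count by starting instances from the image, whether the instances are containers or VMs.

```python
import math

def desired_replicas(current_qps: float, qps_per_replica: float,
                     min_replicas: int = 2, max_replicas: int = 50) -> int:
    """Pick a replica count from load; the platform converges the
    running set toward it by starting instances from the image."""
    wanted = math.ceil(current_qps / qps_per_replica)
    return max(min_replicas, min(max_replicas, wanted))

# Hypothetical numbers: traffic spikes from 900 to 4500 QPS.
print(desired_replicas(900, 300))    # -> 3
print(desired_replicas(4500, 300))   # -> 15
```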
Summary of container misunderstandings
The accompanying figure (not reproduced here) lists the commonly touted container advantages on the left; virtual machines can answer every one of them.
If you are deploying a traditional application, one that starts slowly, runs few processes, and is essentially never updated, then a virtual machine fully meets the need:
Slow startup: the application takes 15 minutes to start; the container's one second versus the VM platform's optimized ten-odd seconds makes essentially no difference.
Large memory footprint: at 32 GB or 64 GB per instance, one machine cannot run many anyway.
Essentially no updates: at one release every six months, virtual machine images handle upgrade and rollback perfectly well.
Stateful: stopping the machine loses data, and if you do not know what was lost, even a second-level restart recovers nothing; a blind restart may well leave the data inconsistent beyond repair.
Few processes: two or three processes pointed at each other by hand need no service discovery, and configuration is no burden.
For such a traditional application, there is no point spending money on containerization: the effort is wasted and the benefits never arrive.
Part II: containerization, microservices, and DevOps are a trinity
Under what circumstances should we consider making some changes?
When a traditional business is suddenly hit by Internet competition: the application changes constantly, needs an update every few days, and traffic surges. A payment system built for ATM withdrawals and card swipes now has to handle online payment, with traffic multiplied N times.
There is only one answer: split.
Split apart, each sub-module changes independently, with less impact on the others.
Split apart, the traffic that one process used to carry is now carried by many processes together.
This is what we call microservices.
In a microservice scenario there are many processes and rapid updates, say a hundred processes and a new image every day.
Containers are happy here: each container image is small, so this is no problem. Virtual machines weep, because every VM image is huge.
So in the microservice scenario, containers become worth considering.
The virtual machine objects: who needs containers? After the microservice split, automated deployment with Ansible achieves the same thing.
From a purely technical standpoint, that is true.
However, the problem arises from an organizational point of view.
In a typical company there are many more developers than operations staff. Developers finish writing code and need not worry further; deploying the environment is entirely operations' responsibility, and for automation, operations writes Ansible scripts.
But now there are so many processes, split apart and merged, updated so quickly, with configuration changing constantly, that the Ansible scripts have to change every day; the operations staff would be worked to death.
Under such a workload, operators will make mistakes even with automated scripts.
At this point, the container can be used as a very good tool.
Beyond the technical point that a container bakes most internal configuration into the image, the more important point concerns process: environment configuration is pushed forward to the developers, who are required to think about how their service is deployed once development is done, rather than tossing it over the wall.
The benefit is that although there are many processes, many configuration changes, and frequent updates, the amount per module is small for that module's development team: five to ten people specializing in that module's configuration and updates are not prone to errors.
Hand all of that workload to a small operations team instead, and the relay of information produces inconsistent environment configuration on top of a much larger deployment load.
The container is a very good tool in exactly this sense: if every developer does just 5% more work, it can save operations 200% of theirs, with fewer errors.
But now development is doing what operations used to do; is the development boss willing? Will developers grumble at the operations boss?
This is not a technical problem. It is DevOps: DevOps does not draw a hard line between development and operations, but asks the company to work through organization and process to decide how to cooperate and where the boundary lies, in whatever way best serves the stability of the system.
So microservices, DevOps and containers are complementary and inseparable.
Without microservices, none of this is needed: a virtual machine copes fine, DevOps is unnecessary, deploying once a year is fine, and it hardly matters how slowly developers and ops communicate.
The essence of the container, then, is image-based migration across environments.
The image is the fundamental invention of containers, the standard for packaging and for running; namespaces and cgroups existed long before. That is the technical side.
On the process side, the image is a good tool for DevOps.
The first migration scenario for containers is between development, test, and production environments. If you never migrate, or migrate rarely, virtual machine images will do; but if you migrate all the time, virtual machine images of tens or hundreds of gigabytes are simply too big.
The second scenario is migration across clouds: between public clouds, between regions, between two OpenStack deployments. For virtual machines this is very painful, even impossible, because public clouds do not offer download and upload of VM images, and the images are so large that an upload takes a whole day.
So cross-cloud and hybrid-cloud scenarios are also a good fit for containers, which at the same time solve the problem of private-cloud resources being insufficient to carry the traffic.
Part III: the correct use of containers
Based on the analysis above, containers are recommended in the following scenarios:
Deploying stateless services, complementing virtual machines where isolation is needed.
Deploying stateful services only if you understand the application inside very well.
Serving as a key tool for continuous integration, moving smoothly between development, test, and production.
Deploying and auto-scaling applications across clouds, regions, and data centers, and in hybrid-cloud scenarios.
Using the container as the application's deliverable, keeping environments consistent and establishing the idea of immutable infrastructure.
Running task-style programs at process granularity.
Managing change: applications that change frequently benefit from container images and version numbers, which are far lighter and more convenient.
When using containers, manage the application deliberately and design health checks and fault tolerance.