2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains the history of cloud computing. The explanation is simple and practical, so interested readers may want to follow along.
What is Cloud Computing
As early as ten years ago, there were already many cloud computing jobs on the market. Cloud computing was then at its hottest, and companies such as BAT (Baidu, Alibaba, Tencent) and Huawei all began to build out their cloud computing businesses. Accordingly, there were more and more jobs in OpenStack development, container development, and low-level systems development. In recent years big data and AI have stolen much of the limelight from cloud computing, yet the technology still occupies a very important position in today's technology stack. So what exactly cloud computing is remains something everyone who wants to learn it needs to know. The following introduction is from Baidu Encyclopedia.
Big data is an IT industry term for data sets that cannot be captured, managed, and processed with conventional software tools within an acceptable period of time. It is a massive, fast-growing, and diverse information asset that requires new processing models to deliver stronger decision-making power, insight, and process optimization.
In "Big Data: A Revolution That Will Transform How We Live, Work, and Think", by Viktor Mayer-Schönberger and Kenneth Cukier, big data means analyzing all of the data rather than taking shortcuts such as random sampling. The 5V characteristics of big data (proposed by IBM) are Volume, Velocity, Variety, Value (low value density), and Veracity. [2]
The Development History of Cloud Computing
Physical computer era
The whole course of cloud computing can be summed up in one saying: "what is long divided must unite, and what is long united must divide."
In fact, cloud computing mainly addresses four things: computing, networking, storage, and applications. The first three are at the resource level, and the last is at the application level.
Computing means CPU and memory. Why? Take the simplest calculation, 1 + 1: the 1 is held in memory, the CPU performs the addition, and the result, 2, is stored back in memory.
Networking means you can get online by plugging in a network cable.
Storage means you have somewhere to put the movie you download. The discussion below revolves around these four parts.
In the primitive era, what people liked most was physical equipment:
Servers were physical machines from vendors such as Dell, Hewlett-Packard, IBM, and Lenovo. As hardware progressed, physical servers became more and more powerful; 64 cores with 128 GB of memory is a common configuration.
Networking used hardware switches and routers, such as those from Cisco and Huawei, going from 1GE to 10GE and now 40GE and 100GE, with bandwidth getting better and better.
Storage used ordinary disks as well as faster SSDs. Capacities grew from megabytes to gigabytes, and even laptops can now be configured with terabytes, not to mention disk arrays.
Deploying applications directly on physical machines looks impressive and feels lavish, but it has serious disadvantages:
Manual operation and maintenance: what if you install software on a server and break the system? You can only reinstall. To configure a switch's parameters, you have to connect to its serial port. To add a disk, you have to buy one and plug it into the server. All of this has to be done by hand, and most likely on site in the machine room. If your company is at the North Fifth Ring Road and the machine room is at the South Sixth Ring Road, that is painful.
Waste of resources: you only want to deploy a small website, yet the machine has 128 GB of memory. If you deploy other applications alongside it, you run into the isolation problem.
Poor isolation: if you deploy many applications on the same physical machine, they fight over memory and CPU. If one fills the hard disk, the others cannot use it; if one crashes the kernel, the others go down with it. If you deploy two copies of the same application, their ports conflict and errors follow easily.
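The port conflict mentioned in the last item is easy to reproduce. The following is a minimal Python sketch, not taken from the original article: the helper name try_bind is hypothetical, and the loopback address 127.0.0.1 is assumed only for illustration. It starts one listener on a free port, then lets a second copy of the "same application" ask for the same port.

```python
import socket
from typing import Optional


def try_bind(port: int) -> Optional[socket.socket]:
    """Try to listen on 127.0.0.1:port; return the socket, or None if the port is taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen()
        return sock
    except OSError:
        # Typically "address already in use": another process owns the port.
        sock.close()
        return None


# First application: port 0 asks the operating system for any free port.
first = try_bind(0)
port = first.getsockname()[1]

# Second copy of the same application wants the same port and is refused.
second = try_bind(port)
conflict = second is None

first.close()
```

On a shared physical machine the two copies would need different port settings, which is exactly the kind of manual coordination that isolation removes.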
So came the first round of "what is long united must divide", which is called virtualization. Virtualization means turning the real into the virtual.
The birth of the virtual machine
The physical machine becomes a virtual machine: the CPU is virtual, the memory is virtual, the kernel is virtual, and the hard disk is virtual.
The physical switch becomes a virtual switch: the network card is virtual, the switch is virtual, and the bandwidth is virtual.
Physical storage becomes virtual storage: multiple hard drives are virtualized into a large block.
Virtualization solves the above three problems very well:
Manual operation and maintenance: virtual machines can be created and deleted remotely. If a virtual machine breaks, you delete it and build a new one within minutes. Virtual networks can also be configured remotely: creating network cards and allocating bandwidth is done by calling an API.
Waste of resources: after virtualization, resources can be allocated in very small units, such as 1 CPU, 1 GB of memory, 1 Mbps of bandwidth, and 1 GB of disk.
Poor isolation: each virtual machine has its own CPU, memory, hard disk, and network card, so applications in different virtual machines do not interfere with each other.
However, virtualization also has a drawback. To create a virtual machine with virtualization software, you have to specify by hand which physical machine to place it on, which storage device holds its disk, the VLAN ID of its network, and the exact bandwidth configuration. So operators who rely on virtualization alone often keep an Excel spreadsheet recording how many machines there are and which virtual machines are deployed on each. As a result, virtualized clusters are generally not very large.
In the virtualization phase, the leader was VMware, which could provide basic virtualization of computing, networking, and storage.
Of course, wherever there is closed source there is open source: where there is Windows there is Linux, where there is Apple there is Android, and where there is VMware there are Xen and KVM. In open source virtualization, Citrix did well with Xen, and later Red Hat put a lot of effort into KVM.
For network virtualization there is Open vSwitch, which lets you create bridges and network cards and set VLANs and bandwidth through commands.
For storage virtualization on local disks there is LVM, which can combine multiple hard drives into one large volume and then carve out small pieces for users.
To solve the problems left over from the virtualization phase came a round of "what is long divided must unite". This process can be vividly called pooling. Virtualization had cut resources into very fine pieces, but managing such fine-grained resources with Excel costs too much. Could they be put into one large pool so that, when resources are needed, the system chooses them automatically instead of the user specifying them? So the key component at this stage is the scheduler.
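To make the idea of a scheduler concrete, here is a minimal Python sketch of a resource pool that picks a host automatically. It is an illustration under stated assumptions, not the algorithm of any particular product: the Host class, the schedule function, and the "filter, then prefer the host with the most free memory" rule are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    free_cpus: int
    free_mem_gb: int


def schedule(hosts: List[Host], cpus: int, mem_gb: int) -> Optional[Host]:
    """Pick a host for a new virtual machine, or return None if nothing fits."""
    # Step 1 (filter): keep only hosts with enough free CPU and memory.
    candidates = [h for h in hosts if h.free_cpus >= cpus and h.free_mem_gb >= mem_gb]
    if not candidates:
        return None
    # Step 2 (rank): prefer the host with the most free memory.
    chosen = max(candidates, key=lambda h: h.free_mem_gb)
    # Step 3 (reserve): record the allocation so later requests see it.
    chosen.free_cpus -= cpus
    chosen.free_mem_gb -= mem_gb
    return chosen


pool = [Host("host-a", 2, 4), Host("host-b", 16, 64), Host("host-c", 8, 32)]
first = schedule(pool, cpus=4, mem_gb=8)     # lands on host-b
second = schedule(pool, cpus=16, mem_gb=8)   # no host has 16 free CPUs left
```

The user only states what is needed (4 CPUs, 8 GB); the pool decides where it goes and keeps the bookkeeping that the Excel spreadsheet used to hold.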
Public and private clouds
So VMware has its own vCloud.
And there was CloudStack, a private cloud platform based on Xen and KVM, which was later acquired by Citrix and then open-sourced.
While these private cloud platforms were being sold at extremely high prices and making a lot of money in customers' data centers, other companies chose a different path: AWS and Google began to explore the public cloud.
AWS initially built its virtualization on Xen and eventually formed a public cloud platform. Perhaps at first AWS simply did not want to hand all the profits of its e-commerce business to private cloud vendors, so its cloud platform first supported its own business. In the process, AWS used its own cloud platform in earnest, which made the public cloud platform friendlier to application deployment rather than merely to resource allocation.
Looked at closely, private clouds and public clouds use similar technologies, yet they are completely different creatures in product design. Private cloud vendors and public cloud vendors have similar technologies but show completely different genes in how they run their products.
Private cloud vendors sell resources, so selling a private cloud platform is often accompanied by selling computing, networking, and storage equipment. In product design, private cloud vendors often emphasize long and detailed technical parameters for computing, networking, and storage that customers will hardly ever use, because these parameters give them an edge over competitors in the bidding process. Private cloud vendors rarely have large-scale applications of their own, so their platforms are built for others and are not used at scale by the vendors themselves. The products therefore tend to revolve around resources and are not friendly to application deployment.
Public cloud vendors often have their own large-scale applications to deploy, so their products offer the modules that common application deployments need as components, and users can assemble them like building blocks into an architecture that suits their own applications. Public cloud vendors do not need to compete on technical parameters, or care whether the platform is open source or compatible with every virtualization platform, server, network device, and storage device. What they use is none of the customer's concern, as long as it is convenient for customers to deploy applications.
The birth of OpenStack
Of course, AWS, first in the public cloud, was very happy, while Rackspace, in second place, was not. In the Internet industry one company basically dominates, so how does the runner-up fight back? Open source is a good way: get the whole industry to contribute to the cloud platform together. So Rackspace partnered with NASA to create the open source cloud platform OpenStack. OpenStack has since come to look a bit like AWS, so you can see how cloud computing does pooling from OpenStack's module composition.
What components does OpenStack contain?
Compute pooling module Nova: OpenStack's compute virtualization mainly uses KVM, but nova-scheduler decides which physical machine each virtual machine is started on.
Network pooling module Neutron: OpenStack's network virtualization mainly uses Open vSwitch. However, you do not need to log in to the cluster to configure each Open vSwitch instance's virtual networks, virtual network cards, VLANs, and bandwidth; Neutron can configure them through SDN.
Storage pooling module Cinder: OpenStack's storage virtualization is based on LVM when local disks are used, and a scheduler also decides which LVM a disk is allocated on. Later, Ceph made it possible to pool the hard drives of multiple machines, in which case the scheduling is done in the Ceph layer.
With OpenStack, all the private cloud vendors went wild. VMware had been making too much money in the private cloud market, and there had been no comparable platform to compete with it. Now there was a ready-made framework, and with hardware of their own, practically all the giant IT vendors joined the community, developed OpenStack into their own products, and entered the private cloud market bundled with their hardware.
Of course, NetEase did not miss this wave either and launched its own OpenStack cluster. NetEase Honeycomb independently developed IaaS services based on OpenStack. In compute virtualization, it achieved virtual machine startup within seconds through improvements such as trimming KVM images and optimizing the virtual machine boot process. In network virtualization, it achieved high-performance communication between virtual machines through SDN and Open vSwitch. In storage virtualization, it achieved high-performance cloud disks by optimizing Ceph storage.
But NetEase did not enter the private cloud market. Instead it used OpenStack to support its own applications, which is the Internet way of thinking. Elasticity at the resource level alone is not enough; components that are friendly to application deployment also need to be developed. Databases, load balancing, caches, and so on are essential for application deployment, and NetEase honed them through large-scale application practice. These components are called PaaS.
From IaaS to PaaS
So far the story has been about the IaaS layer, that is, Infrastructure as a Service, which is basically about computing, networking, and storage. Now it is time to talk about the application layer.
The definition of IaaS is fairly clear, but the definition of PaaS is not. Some count databases, load balancers, and caches as PaaS services; some count big data platforms such as Hadoop and Spark; and some count application installation and management tools such as Puppet, Chef, and Ansible.
In fact, PaaS is mainly used to manage the application layer. I sum it up in two parts. One is automatic deployment of your own applications: tools such as Puppet, Chef, Ansible, and Cloud Foundry can deploy them for you through scripts. The other is general-purpose applications that you find complex and do not need to deploy yourself, such as databases, caches, and big data platforms, which you can simply obtain from the cloud platform.
Either deploy automatically or do not deploy at all. Either way, you have less to worry about at the application layer, and that is the role of PaaS. Of course, not deploying at all is best, since you get the service with one click, so public cloud platforms turn all the common services into PaaS offerings. Other applications are ones you developed yourself, which nobody but you knows, so you use tools to deploy them automatically.
The biggest advantage of PaaS is elastic scaling at the application layer. For example, when Singles' Day arrives, 10 nodes need to become 100. With physical equipment, it is too late to buy 90 more machines. Resource elasticity from IaaS alone is not enough either: 90 newly created virtual machines are still empty, and operations staff would have to deploy to them one by one. With PaaS, as soon as a virtual machine starts, it runs the automatic deployment script to install the application, so all 90 machines install the application automatically. That is real elastic scaling.
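The difference between resource elasticity and application elasticity can be sketched in a few lines of Python. This is only a toy model under stated assumptions: scale_out, fake_create_vm, and fake_deploy are hypothetical names, and the two callbacks stand in for whatever IaaS API and deployment tool a real platform would use.

```python
from typing import Callable, List


def scale_out(nodes: List[str], target: int,
              create_vm: Callable[[int], str],
              deploy: Callable[[str], None]) -> List[str]:
    """Grow the cluster to `target` nodes, deploying the application on every new VM."""
    new_nodes = []
    for index in range(len(nodes), target):
        vm = create_vm(index)   # IaaS step: the new machine is still empty
        deploy(vm)              # PaaS step: the deployment script runs right after boot
        new_nodes.append(vm)
    return nodes + new_nodes


deployed: List[str] = []


def fake_create_vm(index: int) -> str:
    # Stand-in for a call to the cloud platform's VM creation interface.
    return f"vm-{index:03d}"


def fake_deploy(vm: str) -> None:
    # Stand-in for running an automatic deployment script on the VM.
    deployed.append(vm)


cluster = [fake_create_vm(i) for i in range(10)]              # the normal 10 nodes
cluster = scale_out(cluster, 100, fake_create_vm, fake_deploy)
```

Without the deploy callback the loop would still produce 100 machines, but 90 of them would be empty, which is the situation the paragraph above describes for IaaS alone.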
Of course, this kind of deployment has its own problem. However well Puppet, Chef, and Ansible abstract the installation scripts, in the end they are still scripts, and application environments vary enormously: file paths differ, file permissions differ, dependency packages differ, the versions of software such as Tomcat, PHP, Apache, JDK, and Python differ, certain system software may or may not be installed, and ports may or may not already be occupied. Any of these can make a script fail. So it seems that once a script is written it can be copied quickly, but as soon as the environment changes slightly, the script has to go through another round of modification, testing, and joint debugging. For example, scripts written for a data center may not work directly when moved to AWS, and scripts debugged on AWS may run into problems when migrated to Google Cloud.
The birth of containers
So containers emerged, as the times required. The Chinese word for this kind of container translates the English Container, and Container also means shipping container. In fact, the idea is to become the shipping container of software delivery. A shipping container has two characteristics: it is packed, and it is standard. Imagine the era before shipping containers: goods going from A to B pass through three docks and change ships three times. Each time the goods are unloaded and scattered about, then have to be stowed neatly again on the next ship, so the crew could spend several days ashore at each stop before leaving. With containers, all the goods are packed together and every container is the same size, so at each change of ship the whole box is moved across, which takes hours, and the crew no longer gets a long rest ashore. Now imagine that A is the programmer, B is the user, the goods are the code and its runtime environment, and the three docks in between are development, testing, and production.
Suppose the code runs in the following environment:
Ubuntu operating system
Create user hadoop
Download and extract JDK 1.7 in a directory
Add this directory to the environment variables of JAVA_HOME and PATH
Put the export of the environment variable in the .bashrc file under the home directory of the hadoop user
Download and extract Tomcat 7
Put the WAR file under Tomcat's webapps directory
Modify Tomcat's startup parameters and set the Java heap size to 1024m
As you can see, even a simple Java website involves this many odds and ends. Without packaging, every environment (development, testing, and production) has to be checked to keep them consistent, or even rebuilt, just like breaking up and reloading the goods every time. Any slight difference along the way, such as JDK 1.8 in development but JDK 1.7 in production, or the root user in development but the hadoop user in production, can cause the program to fail.
How does a container package an application? Again, learn from shipping containers. First there must be an enclosed environment that seals the goods in so they do not interfere with each other and are isolated from each other, which makes loading and unloading convenient. Fortunately, LXC on Ubuntu has long been able to do this. Two technologies are mainly used here. One provides isolation in what an application sees and is called namespaces: applications in different namespaces see different IP addresses, user spaces, process IDs, and so on. The other provides isolation in what an application can use and is called cgroups: the machine as a whole has plenty of CPU and memory, but an application can use only part of it.
With these two technologies, the iron box of the container is welded, and the next question is what to put in it. The simplest and crudest way is to put everything on the list above into the container. But that is too big. Virtual machine images are built this way and often run to tens of gigabytes; even a clean Ubuntu installation with nothing added is large. This amounts to putting the ship itself inside the container, so the answer is of course no.
So set aside the first item, the operating system, and the rest adds up to only a few hundred megabytes, which is much lighter. Containers on a server therefore share the operating system kernel, and a container moves between machines without carrying a kernel, which is why many people say containers are lightweight virtual machines. Lightness has its price: isolation is naturally weaker, and if one container punches a hole in the ship, all the containers sink together.
Another thing to leave out is the data that the application generates and saves locally as it runs, mostly in the form of files such as database files and text files. These files grow as the application runs. If they were also placed in the container, the container would become very large, which would hamper moving it between environments. Moreover, migrating such data between development, testing, and production makes no sense, since production cannot use the test environment's files. So this data is usually kept on storage devices outside the container, and that is why people say containers are stateless.
Now that the container is welded and the goods are loaded, the next step is to standardize the container so that it can travel on any ship. The standards here are the image and the container runtime. An image is the state of the container saved at the moment you weld it shut. It is as if Sun Wukong shouted "Freeze!" and the container stood still at that instant, and the state at that instant was then saved as a set of files. The format of these files is standard, so anyone who has them can restore that frozen moment. Restoring an image to a running state, that is, reading the image files and restoring that moment, is the process of running a container. Besides the famous Docker, other container runtimes such as AppC and Mesos Container can also run container images. So containers are not the same thing as Docker.
All in all, containers are lightweight, weakly isolated, and suited to stateless applications, and they can be moved freely across hosts and environments thanks to the image standard.
With containers, the PaaS layer can deploy users' own applications automatically, quickly, and elegantly. Containers are fast in two respects. First, a virtual machine has to boot an operating system when it starts, whereas a container does not, because it shares the host kernel. Second, a virtual machine has to install the application by script after it starts, whereas a container does not, because the application is already packaged in the image. As a result, a virtual machine starts in minutes while a container starts in seconds. That may seem magical, but it is not: the first trick is being lazy and doing less work, and the second is doing the work in advance.
Because containers start so quickly, people often do not create a small virtual machine just to deploy an application, since that takes too long. Instead they create a large virtual machine and carve containers out of it. Different users do not share a large virtual machine, so the operating system kernels stay isolated between users.
This is another round of "what is long united must divide": the virtual machine pool of the IaaS layer is divided into a finer-grained container pool.
Containers are finer-grained and harder to manage, to the point that managing them by hand becomes impossible. Suppose you have 100 physical machines, which is not a large scale; managing them by hand with Excel is no problem. But if each one runs 10 virtual machines, there are 1,000 virtual machines, and manual management is already very difficult. If each virtual machine runs 10 containers, there are 10,000 containers, and you have surely given up on the idea of manual operation and maintenance.
So the management platform at the container level is a new challenge, and the key word is automation:
Self-discovery: can containers be configured to reach each other the way virtual machines are, by remembering IP addresses and writing them into each other's configuration? With so many containers, how could you remember which configurations to change once a virtual machine goes down and restarts, when the list is at least ten thousand lines long? So containers refer to each other by name: whichever machine a container ends up on, its name stays the same and it can still be reached.
Self-repair: if a container or its process goes down, can you log in, check the process status, and restart it if something is wrong, as you would with a virtual machine? You would have to log in to ten thousand Docker containers. So when a container's process dies, the container exits automatically and is then restarted automatically.
Auto scaling: when the containers cannot keep up with the load, do you scale out and deploy by hand? Of course this has to happen automatically.
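The three requirements above can be illustrated with a toy Python model. Everything in it is an assumption made for the example rather than the behavior of any real platform: the Platform and Container classes, the 10.0.0.x addresses, and the scaling rule (grow the service until the average load per container is at or below a target) are all hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Container:
    service: str
    address: str
    alive: bool = True


@dataclass
class Platform:
    containers: List[Container] = field(default_factory=list)
    next_ip: int = 1

    def start(self, service: str) -> Container:
        container = Container(service, f"10.0.0.{self.next_ip}")
        self.next_ip += 1
        self.containers.append(container)
        return container

    def resolve(self, service: str) -> List[str]:
        """Self-discovery: callers ask by service name, never by fixed IP."""
        return [c.address for c in self.containers if c.service == service and c.alive]

    def repair(self) -> int:
        """Self-repair: replace every dead container and report how many were replaced."""
        dead = [c for c in self.containers if not c.alive]
        for container in dead:
            self.containers.remove(container)
            self.start(container.service)
        return len(dead)

    def scale_out(self, service: str, load: float, target_load: float) -> int:
        """Auto scaling (scale-out only): add containers until the load target is met."""
        current = len(self.resolve(service))
        desired = max(current, math.ceil(current * load / target_load))
        for _ in range(desired - current):
            self.start(service)
        return desired


platform = Platform()
for _ in range(3):
    platform.start("web")

platform.containers[0].alive = False          # one container's process dies
replaced = platform.repair()                  # the platform restarts it elsewhere
after_repair = platform.resolve("web")        # same name, different address list
size = platform.scale_out("web", load=0.9, target_load=0.6)
```

Callers that use resolve("web") never notice that 10.0.0.1 disappeared and 10.0.0.4 took its place, which is the point of configuring containers by name.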
Once there is a container management platform, that is another round of "what is long divided must unite".
Container management platform
There are three major schools among the currently popular container management platforms:
One is Kubernetes, which we call the Duan Yu type. Duan Yu's father (Borg, the parent of Kubernetes) was highly skilled, came from a royal family (Google), and ruled the large Kingdom of Dali (Borg is the container management platform of Google's data centers). As a descendant of the Duan family of Dali, Duan Yu has good martial arts genes (Kubernetes's conceptual design is fairly complete), experts gather around him, and the martial arts environment is good (the Kubernetes ecosystem is active and popular). Although Duan Yu's martial arts are not yet as good as his father's, as long as he keeps learning from the experts around him, his skills can improve rapidly.
One is Mesos, which we call the Qiao Feng type. Qiao Feng's main kung fu, the Eighteen Dragon-Subduing Palms (Mesos's scheduling function), is a unique skill not found in other gangs. Qiao Feng also led the Beggars' Gang with its many members (Mesos managed Twitter's container clusters). Later, Qiao Feng left the Beggars' Gang and roamed the martial world alone (the founder of Mesos started the company Mesosphere). Qiao Feng's advantage is that his Eighteen Dragon-Subduing Palms (Mesos) were proven inside the Beggars' Gang, which makes them far more mature than Duan Yu's first attempts at his father's martial arts. The disadvantage is that the Eighteen Dragon-Subduing Palms are held by only a few masters of the gang (the Mesos community is still dominated by Mesosphere), while the other brothers can only admire Qiao Feng from afar and cannot learn from one another (the community is not lively enough).
One is Swarm, which we call the Murong type. The Murong family's personal kung fu is very good (Docker can be called the de facto standard for containers), but seeing Duan Yu and Qiao Feng manage ever larger organizations and move toward unifying the martial world, the family badly wants to found its own Murong Xianbei empire (hence the launch of Swarm, the Docker family's container cluster management software). But good personal kung fu does not mean strong organizational ability (Swarm's cluster management ability). Fortunately, the Murong family can draw on the organizational experience of Duan Yu and Qiao Feng, learning from every school and turning each opponent's own techniques back on them, so Murong's organizational ability is also gradually maturing (Swarm borrows many earlier cluster management ideas).
What are the core technologies in cloud computing?
Author: Icelandic Community-Chen Hao
Link: https://www.zhihu.com/question/353443905/answer/877956605
Source: Zhihu
The copyright belongs to the author. Commercial reprint please contact the author for authorization, non-commercial reprint please indicate the source.
Cloud computing is an intensive computing model centered on data and processing power. It integrates many ICT technologies and is the product of the "smooth evolution" of traditional technologies. Among them, the most important are virtualization, distributed data storage, programming models, large-scale data management, distributed resource management, information security, cloud platform management, and green energy-saving technology.
1. Virtualization technology
Virtualization is one of the most important core technologies of cloud computing. It provides infrastructure support for cloud computing services and is the main driving force behind the rapid move of ICT services toward cloud computing. It can be said that without virtualization, cloud computing services could not have been deployed or succeeded. As cloud computing applications keep heating up, the industry's attention to virtualization has risen to a new level. At the same time, our survey found that many people misunderstand the relationship between the two and think that cloud computing is virtualization. This is not the case. Virtualization is an important part of cloud computing, but not all of it.
Technically speaking, virtualization is a form of computing that simulates computer hardware in software and serves users with virtual resources. Its purpose is to allocate computer resources sensibly so that they provide services more efficiently. It breaks down the physical boundaries between the hardware of application systems, enabling a dynamic architecture and the centralized management and use of physical resources. The biggest benefits of virtualization are greater elasticity and flexibility of the system, lower costs, better service, and higher resource utilization.
In form, virtualization has two application modes. One is to virtualize a powerful server into several independent small servers that serve different users. The other is to virtualize multiple servers into one powerful server that performs a specific function. The core of both modes is unified management, dynamic allocation of resources, and higher resource utilization. Both modes are widely used in cloud computing.
2. Distributed data storage technology
Another advantage of cloud computing is that it can process large amounts of data quickly and efficiently. In today's era of exploding data, this is crucial. To ensure high data reliability, cloud computing usually uses distributed storage, keeping data on different physical devices. This approach is not bound by the limits of any single hardware device, scales better, and can respond quickly to changes in user needs.
Distributed storage is not quite the same as traditional network storage. Traditional network storage systems use a centralized storage server to hold all data. The storage server becomes the performance bottleneck of the system and cannot meet the needs of large-scale storage applications. A distributed network storage system adopts a scalable architecture, uses multiple storage servers to share the storage load, and uses a location server to track where data is stored. This improves the reliability, availability, and access efficiency of the system and makes it easy to expand.
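The division of labor between storage servers and a location server can be sketched in Python. This is a deliberately simplified model, not the design of GFS, HDFS, or any other real system: the class names, the two-replica default, and the "place on the least-loaded servers" rule are assumptions chosen for the example.

```python
from typing import Dict, Iterable, List


class StorageServer:
    """Holds block data; stands in for one physical storage machine."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: Dict[str, bytes] = {}


class LocationServer:
    """Keeps only metadata: which storage servers hold a copy of each block."""

    def __init__(self, servers: List[StorageServer], replicas: int = 2) -> None:
        self.servers = servers
        self.replicas = replicas
        self.locations: Dict[str, List[StorageServer]] = {}

    def put(self, block_id: str, data: bytes) -> None:
        # Spread the load: write to the servers that currently hold the fewest blocks.
        targets = sorted(self.servers, key=lambda s: len(s.blocks))[: self.replicas]
        for server in targets:
            server.blocks[block_id] = data
        self.locations[block_id] = targets

    def get(self, block_id: str, failed: Iterable[str] = ()) -> bytes:
        # Reliability: read from the first replica whose server is still up.
        down = set(failed)
        for server in self.locations[block_id]:
            if server.name not in down:
                return server.blocks[block_id]
        raise IOError(f"no live replica for {block_id}")


servers = [StorageServer(f"s{i}") for i in range(3)]
locator = LocationServer(servers, replicas=2)
for i in range(3):
    locator.put(f"block-{i}", f"data-{i}".encode())

load = {s.name: len(s.blocks) for s in servers}
recovered = locator.get("block-0", failed=["s0"])   # s0 is down, a replica answers
```

Adding capacity means appending another StorageServer to the list, which is the scalability property the paragraph above attributes to distributed storage.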
In the current cloud computing field, Google's GFS and HDFS, the open source system developed by the Hadoop team, are two popular distributed storage systems.
GFS (Google File System): Google's proprietary GFS meets the needs of a large number of users and serves them in parallel. It gives cloud data storage high throughput and high transfer rates.
HDFS (Hadoop Distributed File System): most ICT vendors, including Yahoo and Intel, adopt HDFS for data storage in their "cloud" plans. Future development will focus on ultra-large-scale data storage, data encryption and security, and continued improvement of I/O rates.
3. Programming mode
In essence, cloud computing is a multi-user, multi-task system that supports concurrent processing. Its core idea is to be efficient, simple and fast: it aims to deliver powerful server computing resources to end users over the network while keeping costs low and the user experience good. The choice of programming model is very important in this process, and distributed parallel programming models are widely used in cloud computing projects.
The original intention of the distributed parallel programming model is to make more efficient use of software and hardware resources, so that users can use applications or services more quickly and easily. In this model, the complex task processing and resource scheduling in the background are transparent to users, which greatly improves the user experience. MapReduce is one of the mainstream parallel programming models in cloud computing. It automatically divides a job into multiple sub-tasks and, through the two steps of Map and Reduce, schedules and distributes those tasks across a large number of computing nodes.
MapReduce is a programming model developed by Google, usable from languages such as Java, Python and C++, and mainly used for parallel computing over large-scale data sets (larger than 1 TB). The idea is to decompose the problem into a Map (mapping) step and a Reduce (reduction) step. First, the Map program cuts the data into independent blocks, which are assigned (scheduled) to a large number of computers for processing, achieving distributed operation. The Reduce program then collects the results and outputs them.
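To make the Map and Reduce steps concrete, here is a minimal single-process sketch of a word count, a common teaching example for this model. It only illustrates the data flow and is not Google's implementation or the Hadoop API: the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are chosen for the example, and in a real system each chunk would be processed on a different machine.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: turn one independent block of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all values emitted for the same key together."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values collected for one key."""
    return key, sum(values)


def word_count(chunks):
    # Each chunk is mapped independently; here this happens in a loop,
    # whereas a cluster would run the map calls in parallel.
    mapped = []
    for chunk in chunks:
        mapped.extend(map_phase(chunk))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["the cloud stores data", "the cloud scales"])
```

Because no map call depends on another chunk, the blocks can be handed to any number of workers, which is what makes the model suitable for distributed execution.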
4. Large-scale data management
Handling huge amounts of data is a major advantage of cloud computing, and doing so involves many aspects, so efficient data processing technology is one of its indispensable core technologies. Data management in the cloud faces great challenges: the system must not only guarantee that data can be stored and accessed, but also be able to retrieve and analyze massive data sets. Because cloud computing needs to process and analyze massive distributed data, its data management technology must be able to manage large amounts of data efficiently.
Google's BT (BigTable) data management technology and the open source data management module HBase developed by the Hadoop team are typical large-scale data management technologies in the industry.
BT (BigTable) data management technology: BigTable is a non-relational database; it is a distributed, persistent, multi-dimensional sorted map. BigTable is built on GFS, Scheduler, Lock Service and MapReduce. Unlike a traditional relational database, it treats all data as objects and forms a huge table for the distributed storage of large-scale structured data. BigTable is designed to handle PB-level data reliably and can be deployed across thousands of machines.
The open source data management module HBase: HBase is a sub-project of Apache's Hadoop project and is positioned as a distributed, column-oriented open source database. Unlike a general relational database, HBase is suited to storing unstructured data, and it is column-based rather than row-based. As a highly reliable distributed storage system, HBase performs well and scales well. With HBase, large-scale structured storage clusters can be built on inexpensive PC servers.
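The phrase "multi-dimensional sorted map" can be pictured as a map whose key has several parts, for instance a row key, a column name and a timestamp, with entries kept in key order. The sketch below is a toy, single-machine model of that idea only; the class name `SortedTable` and its methods are hypothetical and do not correspond to any BigTable or HBase API.

```python
class SortedTable:
    """Toy model of a map keyed by (row, column, timestamp)."""

    def __init__(self):
        self.cells = {}  # (row, column, timestamp) -> value

    def put(self, row, column, timestamp, value):
        self.cells[(row, column, timestamp)] = value

    def get_latest(self, row, column):
        """Return the value with the highest timestamp, or None."""
        versions = [
            (ts, value)
            for (r, c, ts), value in self.cells.items()
            if r == row and c == column
        ]
        if not versions:
            return None
        return max(versions, key=lambda item: item[0])[1]

    def scan(self, start_row, end_row):
        """Yield cells in sorted key order for start_row <= row < end_row."""
        for key in sorted(self.cells):
            if start_row <= key[0] < end_row:
                yield key, self.cells[key]


table = SortedTable()
table.put("com.example/index", "contents", 1, "<html>v1</html>")
table.put("com.example/index", "contents", 2, "<html>v2</html>")
table.put("org.sample/home", "contents", 1, "<html>home</html>")

latest = table.get_latest("com.example/index", "contents")
rows_in_range = [key[0] for key, _ in table.scan("com.", "com/")]
```

Keeping keys sorted is what makes range scans over neighboring rows cheap, and a distributed system can split the sorted key space into contiguous ranges and place each range on a different machine.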
5. Distributed resource management
Because cloud computing stores data with distributed storage technology, it naturally also needs distributed resource management technology. In an environment where many nodes execute concurrently, the state of each node must be kept synchronized, and when a single node fails, the system needs an effective mechanism to ensure that the other nodes are not affected. A distributed resource management system provides exactly this, and it is key to keeping the state of the system consistent.
In addition, cloud computing systems often manage a huge pool of resources, ranging from hundreds to tens of thousands of servers that may span multiple regions, with thousands of applications running on the platform. Managing these resources effectively and ensuring that they provide services normally requires strong technical support, which is why distributed resource management technology is so important.
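One common way to notice a failed node and keep the rest of the system running is heartbeat-based failure detection followed by task reassignment. The sketch below illustrates that general technique under simple assumptions; it is not a description of Borg or any vendor's product. The names `HeartbeatMonitor` and `reassign`, and the use of plain numbers for time, are hypothetical choices for the example.

```python
class HeartbeatMonitor:
    """Tracks the last time each node reported in."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # node name -> time of last heartbeat

    def heartbeat(self, node, now):
        self.last_seen[node] = now

    def alive_nodes(self, now):
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen <= self.timeout
        )

    def failed_nodes(self, now):
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


def reassign(tasks, monitor, now):
    """Return a new task -> node mapping with failed nodes replaced."""
    alive = monitor.alive_nodes(now)
    if not alive:
        raise RuntimeError("no healthy nodes available")
    result = {}
    moved = 0
    for task in sorted(tasks):
        node = tasks[task]
        if node in alive:
            result[task] = node
        else:
            # Spread orphaned tasks over healthy nodes round-robin.
            result[task] = alive[moved % len(alive)]
            moved += 1
    return result


monitor = HeartbeatMonitor(timeout=5)
monitor.heartbeat("node-a", now=0)
monitor.heartbeat("node-b", now=8)
monitor.heartbeat("node-c", now=9)

failed = monitor.failed_nodes(now=10)
new_plan = reassign({"t1": "node-a", "t2": "node-b"}, monitor, now=10)
```

Tasks on healthy nodes stay where they are, so a single failure only disturbs the work that was running on the failed node.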
The major cloud computing solutions / service providers around the world are actively carrying out the research and development of related technologies. Among them, the Borg technology used in Google is highly praised by the industry. In addition, cloud computing giants such as Microsoft, IBM and Oracle/Sun all have corresponding solutions.
6. Information security
Survey data show that security has become one of the main factors hindering the development of cloud computing. According to the data, ICT managers at 32% of organizations that already use cloud computing, and at 45% of organizations that do not yet use it, regard cloud security as the biggest obstacle to further cloud deployment. Therefore, to ensure the long-term, stable and rapid development of cloud computing, security is the most important problem to solve.
In fact, cloud computing security is not a new problem; the traditional Internet faces the same issues. However, with the emergence of cloud computing, the security problem has become more prominent. In a cloud computing system, security involves many aspects, including network security, server security, software security and system security. Some analysts therefore believe that the development of the cloud security industry will bring traditional security technology to a new stage.
Today, both software security vendors and hardware security vendors are actively developing cloud computing security products and solutions. Security suppliers at all levels, including traditional antivirus software vendors, software and hardware firewall vendors, and IDS/IPS vendors, have entered the field of cloud security. It is expected that the problem of cloud security will be well addressed in the near future.
7. Cloud computing platform management
Cloud computing resources are large in scale: there are many servers, they are distributed across different locations, and hundreds of applications run on them at the same time. Managing these servers effectively and ensuring that the whole system provides uninterrupted service is a great challenge. The platform management technology of a cloud computing system must be able to deploy a large number of server resources efficiently and make them work well together. The key is to deploy and launch new services conveniently, to find and recover from system faults quickly, and to achieve reliable operation of large-scale systems through automated and intelligent means.
For providers, there are three deployment models for cloud computing: public cloud, private cloud and hybrid cloud. The three models place quite different requirements on platform management. For users, differences in control over ICT resource sharing, in system efficiency requirements and in ICT budgets mean that the scale and manageability of the cloud computing systems that enterprises need also vary widely. A cloud computing platform management scheme should therefore take customization needs into account and be able to meet the application needs of different scenarios.
Many vendors, including Google, IBM, Microsoft and Oracle/Sun, offer cloud computing platform management solutions. These solutions can help enterprises integrate their infrastructure; achieve unified management, distribution, deployment, monitoring and backup of enterprise hardware and software resources; and break the monopoly of individual applications on resources, so that the value of the enterprise cloud computing platform can be fully realized.
At this point, you should have a deeper understanding of the history of cloud computing, and you may want to try these technologies in practice.