Shulou (Shulou.com) 06/01 Report. Updated 2025-01-18. From: SLTechnology News&Howtos.
This article presents a layer-by-layer case analysis of the architecture of a standard Web system.
[Figure omitted]
The figure above shows the components of a Web system architecture, along with common technical components and service implementations for each layer. Note the following points:
Load balancing is a very broad concept. It describes the process of distributing external processing pressure across internal processing nodes according to some rule or mechanism. We encounter load-distribution techniques all the time in daily life, for example traffic guidance during rush hour, the air traffic flow control of the CAAC, and the queue-number system at bank counters.
The load distribution layer discussed here refers to load balancing in the narrow sense: load balancing on computer systems, implemented in software. A large (100 million+ page views per day) or medium-sized (10 million+ page views per day) Web business system cannot run on a single business server; multiple servers must provide the same business service at the same time. We therefore need an architecture, designed around the form of the business, that spreads requests from external clients across every available business node, as shown in the following figure:
[Figure omitted]
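As a rough illustration of spreading requests across available business nodes, here is a minimal round-robin sketch. The class name and node names are hypothetical, and real load layers such as Nginx, HAProxy or LVS offer many more strategies (weights, health checks, session affinity) than this shows.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backend nodes in a fixed rotation (illustrative only)."""

    def __init__(self, nodes):
        nodes = list(nodes)
        if not nodes:
            raise ValueError("at least one backend node is required")
        self._rotation = cycle(nodes)

    def pick(self):
        # Each call returns the next node in the rotation.
        return next(self._rotation)


# Hypothetical node names; six requests are spread over three nodes.
balancer = RoundRobinBalancer(["node-1", "node-2", "node-3"])
assignments = [balancer.pick() for _ in range(6)]
print(assignments)
```

Round robin is only the simplest rule; which rule fits depends on the business form, as the text notes.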
The load layer can also dispatch different types of requests to different servers according to rules applied to the user's request. For example, if an HTTP request asks for a picture, the load layer goes directly to the image storage medium to fetch it; if an HTTP request submits an order, the load layer forwards it to the designated "order service" node according to the rules.
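The rule-based dispatch described above can be sketched as a small routing function. The upstream names, the file extensions and the /orders path prefix are all assumptions made for illustration; in practice such rules live in the load layer's own configuration rather than in application code.

```python
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif")  # assumed set of image types


def route(method, path):
    """Return the name of a hypothetical upstream for one HTTP request."""
    # Picture requests go straight to image storage.
    if method == "GET" and path.lower().endswith(IMAGE_SUFFIXES):
        return "image-storage"
    # Submitted orders go to the designated order service nodes.
    if method == "POST" and path.startswith("/orders"):
        return "order-service"
    # Everything else falls through to a default pool.
    return "default-pool"


print(route("GET", "/static/logo.png"))
print(route("POST", "/orders"))
print(route("GET", "/index.html"))
```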
Different business requirements call for different load layer solutions, which tests the architect's ability to choose among them. For example, Nginx was originally limited to the application-layer HTTP protocol that runs on top of TCP/IP; balancing raw TCP traffic required the third-party TCP-Proxy-Module (releases from 1.9.0 onward bundle a stream module for this purpose). A better-suited solution for balancing load directly at the TCP layer is HAProxy.
Common load layer architectures include:
- Stand-alone Nginx or HAProxy scheme
- LVS (DR) + Nginx scheme
- DNS round robin + LVS + Nginx scheme
- Intelligent DNS (DNS routing) + LVS + Nginx scheme
These load architecture schemes and their variations will be described in detail in subsequent articles.
Overview
Put simply, this is our core business layer: order business, construction management business, diagnosis and treatment business, payment business, log business and so on, as shown in the following figure:
[Figure omitted]
Clearly, in large and medium-sized systems these services cannot exist in isolation. Design requirements usually call for decoupling between subsystems: apart from the underlying supporting systems (such as the user permission system), system X1 does not need to know that its logical peer, system X2, exists. Under this constraint, calls between subsystems are essential for completing more complex business. For example, after business A succeeds, business B is called; after business A fails, business C is called; or, in some cases, business A and business D form an inseparable whole that succeeds only if both succeed, so that a failure in either one fails the whole business process. As shown in the following figure:
[Figure omitted]
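The three call patterns described above (B after A succeeds, C after A fails, A and D as one unit) can be sketched in a few lines. The function names are hypothetical, and the "undo" callbacks are a simplified stand-in for whatever rollback or compensation mechanism a real system would use.

```python
def run_flow(business_a, business_b, business_c):
    """Call B with A's result if A succeeds; call C with the error if A fails."""
    try:
        result = business_a()
    except Exception as error:
        return business_c(error)
    return business_b(result)


def run_together(steps):
    """Run (do, undo) pairs as one unit.

    If any step fails, the steps that already completed are undone in
    reverse order and the whole unit is reported as failed.
    """
    completed_undos = []
    for do, undo in steps:
        try:
            do()
        except Exception:
            for compensate in reversed(completed_undos):
                compensate()
            return False
        completed_undos.append(undo)
    return True


def failing_step():
    raise RuntimeError("step failed")


log = []
ok = run_together([
    (lambda: log.append("A done"), lambda: log.append("A undone")),
    (failing_step, lambda: None),
])
print(ok, log)
```

In a distributed deployment these calls would cross process boundaries, which is exactly why the choice of communication layer matters.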
The communication layer between services is therefore an unavoidable topic. In later articles we will explain business communication layer technology, and especially the points to watch when selecting it, using the principles and usage of Alibaba's Dubbo framework, message queues based on the AMQP protocol, and Kafka message queues.
The HTTP request method that has to be mentioned
Some readers may ask why the discussion of the communication layer between business systems does not mention invocation over HTTP. After all, many companies currently use it for calls between business systems. Let's first look at the HTTP invocation process in a diagram. (Note that this process leaves out HTTP client caching and DNS resolution, and starts from HTTP establishing a reliable TCP connection.)
[Figure omitted]
From the above picture, we can see the following problems:
Based on the above, HTTP is not recommended for communication or invocation between business services. It is recommended only for clients such as Web, iOS and Android apps requesting services.
Data storage will be another focus in this series of articles. The initial data before the business calculation, the temporary data in the calculation process, and the calculation results after the completion of the calculation all need to be stored. Through a mind map, we first explain the basic classification of data storage from several dimensions.
[Figure omitted]
File storage principle
We explain the most basic principles of a file system by walking through the creation of an Ext4 file system on a CentOS 6.5 system.
First, we partition the local hard disk with the fdisk command (that is, we determine the range of sectors the partition controls), as shown in the following figure:
[Figure omitted]
Then we create the desired file system (Ext3, Ext4, XFS, Btrfs, etc.) on this partition with the mkfs command, as shown in the following figure:
[Figure omitted]
Finally, we mount the file system to the specified path, as shown in the following figure:
[Figure omitted]
View the mount information through the df command, as shown in the following figure:
[Figure omitted]
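Once a file system is mounted, the capacity figures that df reports can also be read programmatically. The sketch below uses Python's standard shutil.disk_usage; the helper name and the choice of "/" as the path are assumptions, and it only reads usage figures, it does not partition, format or mount anything.

```python
import shutil


def usage_summary(path="/"):
    """Return total, used and free bytes for the file system holding `path`."""
    usage = shutil.disk_usage(path)
    used_percent = 100 * usage.used / usage.total if usage.total else 0.0
    return {
        "total": usage.total,
        "used": usage.used,
        "free": usage.free,
        "used_percent": round(used_percent, 1),
    }


# "/" is only an example; pass the mount point you want to inspect.
print(usage_summary("/"))
```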
What does this creation process, whose essence never changes, tell us?
[Figure omitted]
A physical block is the smallest unit that the upper-layer file system can operate on (usually 512 bytes), and one physical block corresponds to multiple physical sectors underneath. A SATA hard disk usually has several actuator arms (depending on the number of platters) and many physical sectors (the sector size is fixed when the disk leaves the factory and cannot be changed).
A single sector can only work in one direction at a time, so the physical block mapped onto it also works in one direction at a time. The reason is that while the arm is reading data from a sector, the hardware does not allow it to write data to that sector simultaneously.
The upper-layer file system (EXT, NTFS, Btrfs, XFS) encapsulates the physical blocks beneath it, so the OS does not need to operate on disk blocks directly, and an operator who lists a file with a command such as ls does not need to care how the file is laid out in physical blocks. This is also why file systems differ in their features (some support snapshots, some support data recovery): each file system follows its own rules for operating on the underlying physical blocks.
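To make the block-to-sector mapping concrete, here is a small arithmetic sketch. The sizes (512-byte sectors, 4096-byte blocks) are assumptions picked for illustration rather than values taken from the text, and real file systems add metadata, alignment and allocation policies on top of this.

```python
SECTOR_SIZE = 512   # bytes per sector (assumed for illustration)
BLOCK_SIZE = 4096   # bytes per file-system block (assumed for illustration)


def sectors_for_block(block_index, block_size=BLOCK_SIZE, sector_size=SECTOR_SIZE):
    """Return the range of sector numbers that back one block."""
    if block_size % sector_size != 0:
        raise ValueError("block size must be a multiple of the sector size")
    sectors_per_block = block_size // sector_size
    first = block_index * sectors_per_block
    return range(first, first + sectors_per_block)


def blocks_needed(file_size, block_size=BLOCK_SIZE):
    """Return how many whole blocks a file of `file_size` bytes occupies."""
    return -(-file_size // block_size)  # ceiling division


print(list(sectors_for_block(2)))
print(blocks_needed(10_000))
```

The point of the sketch is only that the file system, not the operator, keeps track of which sectors back which file.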
Block storage and file storage
The previous section described how the simplest, most basic physical blocks and file format rules work. But as server-side storage capacity and data security requirements keep growing, stand-alone storage clearly cannot keep up. The current storage environment has two major types of requirements:
Stable expansion of storage capacity: capacity must grow without damaging the data already stored and without affecting the stability of the storage system as a whole.
File sharing: multiple servers must be able to share stored data and read and write the same file system.
To address these two requirements, we first map them as questions onto the figure from the previous section, as shown in the following figure:
Clearly, the answer to both questions in the figure is yes, and this is the problem that the block storage system introduced next sets out to solve.
Block storage system
Let's start with block storage. In the simplest case described earlier, the disk sits in the local physical machine, and block I/O commands travel through the south bridge on that machine's motherboard. To add more disk space while maintaining data throughput, we need to separate the disk media from the local physical machine and let block I/O commands travel over a network:
[Figure omitted]
Although the disk media are now separate from the local physical machine, the essence, transferring block I/O commands directly, has not changed. What used to travel through the local south bridge now travels over optical fiber; transmission inside the local machine has become transmission over a network, governed by a communication protocol (such as FC or SCSI).
The file system mapping is still done locally, not remotely. As mentioned above, block operations are sequential (while a sector is being written, it cannot be read), and they are low-level physical operations that cannot actively report changes to the file logic layer above. Therefore, multiple physical hosts cannot share files through this technology.
A block storage system must deliver large physical storage space, high data throughput and strong stability all at once. The server that uses the storage assumes that no other server reads or writes the physical blocks that belong to it. In other words, it treats this huge storage space as if it were simply storage on its own local physical machine.
Of course, as technology has developed, some technologies carry standard SCSI commands over plain TCP/IP to reduce the construction cost of a block storage system (iSCSI, for example). This compromise comes at the cost of lower data throughput for the whole system. Choose according to the actual business requirements.
File storage system
What if the file system itself is moved from the local physical machine to a remote machine across the network? This is certainly possible; typical file storage systems include FTP, NFS and NAS:
[Figure omitted]
The key point of a file storage system is that the file system is not local. The client accesses a remote file system over the network, and the remote file system issues the block I/O commands that complete the data operation.
Generally speaking, local file systems such as NTFS, EXT and XFS cannot be accessed directly over a network, so a file storage system wraps them in a network protocol such as NFS or FTP (note that we mean the protocols here), and that protocol then operates on the server-side file system of the file storage system.
The first problem a file storage system solves is file sharing: the network file protocol ensures that multiple clients share the same file structure on the server. As the architecture diagram shows, the read/write speed and data throughput of a file storage system cannot match those of a block storage system (because that is not the primary problem a file storage system sets out to solve).
From the above, it is clear that a file storage system is not our first choice under heavy read/write pressure, while choosing a block storage system brings the double pressure of cost and operations (a SAN is complex to build and its equipment is expensive). Yet in real production environments we often meet scenarios where read pressure is high and files must also be shared. How do we solve this problem?
Object storage system
Object storage exists to meet exactly this need: it combines the high throughput and stability of block storage with the network sharing and low cost of file storage. Typical object storage systems include MFS, Swift, Ceph and Ozone. Below we briefly introduce the characteristics of object storage systems; a later article will pick one and explain it in detail.
An object storage system is necessarily a distributed file system, but a distributed file system is not necessarily an object storage system.
[Figure omitted]
File information is described by several attributes, including file name, storage location, file size, current status and number of copies. We extract these attributes and store them on dedicated servers (metadata servers). To access a file, the client first asks a metadata node for the file's basic information.
Because this is a distributed system, data consistency, resource contention and node failures all need unified coordination. Object storage systems therefore generally include monitoring/coordination nodes. Different object storage systems support different numbers of metadata nodes and monitoring/coordination nodes, but the general trend is "decentralization".
OSD nodes (object-based storage devices) store the file content. Note that although OSD nodes and block storage both ultimately rely on block I/O, the structure above that layer is completely different: unlike a block storage device, an OSD node does not bypass the local file system to operate on physical blocks directly.
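The division of labor between metadata nodes and OSD nodes can be sketched with two tiny in-memory classes. Everything here (class names, method names, node ids) is hypothetical and does not mirror the API of Ceph, Swift or any other real system; monitoring/coordination nodes are left out entirely.

```python
class MetadataServer:
    """Keeps file attributes (size, replica locations), not file content."""

    def __init__(self):
        self._entries = {}

    def register(self, name, size, locations):
        self._entries[name] = {"size": size, "locations": list(locations)}

    def lookup(self, name):
        return self._entries[name]


class OsdNode:
    """Stores file content under an object name."""

    def __init__(self):
        self._objects = {}

    def put(self, name, data):
        self._objects[name] = data

    def get(self, name):
        return self._objects[name]


def write_file(meta, osds, name, data, replicas):
    # Store the content on every chosen OSD node, then record where it lives.
    for node_id in replicas:
        osds[node_id].put(name, data)
    meta.register(name, len(data), replicas)


def read_file(meta, osds, name):
    # Step 1: ask the metadata node where the file lives.
    info = meta.lookup(name)
    # Step 2: fetch the content from the first replica that still has it.
    for node_id in info["locations"]:
        try:
            return osds[node_id].get(name)
        except KeyError:
            continue
    raise FileNotFoundError(name)


meta = MetadataServer()
osds = {"osd-1": OsdNode(), "osd-2": OsdNode(), "osd-3": OsdNode()}
write_file(meta, osds, "report.txt", b"hello", ["osd-1", "osd-3"])
print(meta.lookup("report.txt"), read_file(meta, osds, "report.txt"))
```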
In a later article we will choose a popular object storage system, analyze it in detail, and explain the three core concepts of distributed storage systems and the trade-offs among them (CAP): consistency, availability and partition tolerance.
Database storage
This article has already devoted a lot of space to summarizing the storage layer, so we will not give an overview of database storage technologies, familiar or unfamiliar, here.
In subsequent articles I will use MySQL to explain several common architecture solutions and performance optimization points, as well as how core storage engines such as InnoDB work in MySQL. These architecture solutions mainly address MySQL's core problems, such as the single-machine I/O bottleneck, disaster recovery within a data center, database stability, and disaster recovery across data centers.
I will also pick currently popular data caching systems and explain their working principles, core algorithms and architecture schemes, so that readers can design storage clusters to fit their own business conditions. There will also be an in-depth introduction to the non-relational databases Cassandra, HBase and MongoDB.
How do we evaluate whether the top-level design of a service system is good? Setting aside the well-worn criteria of scalability, stability, robustness and security, I have summed up several key evaluation points from practical work.
Construction cost
Any system architecture incurs a construction cost when it is deployed to production. Different companies and organizations accept different levels of cost (including design, asset procurement, operation and maintenance, and third-party service costs). How to build a system that meets business needs and fits the scale of access within a limited budget is therefore a complex problem. Under this requirement, the architect must also avoid overdesign.
Expansion / planning level
As the business develops, the whole system needs to be upgraded (upgrading the functions of existing modules, merging existing modules, adding new business modules, or raising data throughput while module functions stay the same). How to scale the system horizontally and vertically as quickly as possible and with the least work, while affecting the existing business as little as possible, is also a complex problem. A good system architecture can be upgraded without users noticing, or needs only a brief outage when certain key subsystems are upgraded.
Anti-attack level
An attack on a system will target the weakest link in the whole system, and it may come from outside (such as DoS/DDoS attacks) or from inside (such as password intrusion). A well-architected system is not one that is "absolutely unbreakable" but one that prevents and hides well. Prevention means guarding against possible attacks and simulating various attacks in stages; hiding means using various means to protect the key information of the whole system, such as root privileges, physical location, firewall parameters and user identities.
Disaster recovery level
A good architecture should consider disaster recovery at several levels. Cluster disaster recovery: if one service node in a cluster crashes, another host in the cluster takes over its work immediately and the failed node leaves the cluster. Distributed disaster recovery: a distributed system generally assumes that single-point or multi-point failures can occur anywhere in the system at any time. When they occur, the distributed system as a whole continues to provide service normally, the failed areas can be restored automatically or manually, and the distributed system accepts them back. Remote disaster recovery (data-center-level disaster recovery): when a data center suffers a physical disaster (physical network disruption, war damage, earthquake, etc.), a backup system at a distant site detects the disaster, takes over operation of the system on its own initiative, and notifies the operation and maintenance personnel (depending on the system's operating requirements, there may be several backup systems). The biggest challenge of remote disaster recovery is ensuring the integrity of the remote data.
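The cluster-level takeover described above boils down to "route work to the first node that is still healthy". A minimal sketch follows; the node names and the health-check callback are hypothetical, and real clusters rely on heartbeats, quorum and fencing that this leaves out.

```python
def pick_active(nodes, is_healthy):
    """Return the first healthy node in priority order, or None if all are down."""
    for node in nodes:
        if is_healthy(node):
            return node
    return None


# Hypothetical cluster state: the primary has crashed.
health = {"primary": False, "standby-1": True, "standby-2": True}
active = pick_active(["primary", "standby-1", "standby-2"], lambda node: health[node])
print(active)
```

When the failed node recovers and its health flag flips back, the same function returns it again, which corresponds to the cluster accepting the node back.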
Business adaptability level
In the final analysis, a system architecture exists to serve the business, and its design and selection must start from serving the current business. In the business communication layer mentioned above, whether to choose an SOA component or a message queue component, and which message queue to choose, is a business-driven decision. For example, business A is a Web front-end service that must return operation results to customers promptly, while business B is under very heavy load. When service A invokes service B, B cannot return a result within milliseconds. An AMQP-type message queue service fits this scenario. Two further points: first, the industry offers many different solutions for the same business scenario, and architects must understand the characteristics of each well in order to choose correctly; second, because there are already plenty of solutions available, architects must not "reinvent the wheel" when the business has no special requirements.
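The A-calls-B scenario above can be sketched with Python's standard queue and threading modules standing in for a message broker. This is an in-process stand-in under stated assumptions, not an AMQP client: the function names, the task format and the uppercase "processing" step are all invented for illustration.

```python
import queue
import threading


def start_worker(tasks, results):
    """Start a thread playing the role of the slow service B."""

    def worker():
        while True:
            task = tasks.get()
            if task is None:  # sentinel value: stop the worker
                tasks.task_done()
                break
            # Placeholder for B's slow business processing.
            results[task["id"]] = task["payload"].upper()
            tasks.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def submit(tasks, task_id, payload):
    """Service A enqueues the work and answers its client immediately."""
    tasks.put({"id": task_id, "payload": payload})
    return {"id": task_id, "status": "accepted"}


tasks = queue.Queue()
results = {}
worker = start_worker(tasks, results)
ack = submit(tasks, "order-1", "paid")  # returns without waiting for B
tasks.join()                            # only the demo waits for B to finish
tasks.put(None)
worker.join(timeout=2)
print(ack, results)
```

Service A never blocks on B here; it hands the message to the queue and reports "accepted", which is the decoupling a message queue provides in this business scenario.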
Degree of difficulty of maintenance
A service system needs continuous investment from the operation and maintenance team from the day it is built. The knowledge the team needs varies with the complexity of the system and the number of physical machines. When designing the top-level architecture, architects must also consider how difficult and costly the system will be to operate and maintain.
The detailed architecture schemes for the load layer, business layer, business communication layer and data storage layer will be explained in depth in a series of later articles, covering core algorithms, deployment principles and deployment cases.