Evolution History of Micro Service Architecture based on Spring Cloud

2025-05-01 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/02 Report--

Introduction

"Micro-service architecture" has been a buzzword for some time, and it is discussed widely on technical official accounts and at architecture sharing meetups. For most start-up Internet companies, a monolithic application is the most appropriate choice in the early days. Only when the business enters a period of rapid development, and system load, business complexity and headcount are all rising quickly, does the question arise of how to upgrade the entire software system to a micro-service architecture in a fast, safe and orderly manner. Meeting the needs of business growth and reshaping the technical organization are the main driving forces for adopting micro-services; without them there is little point in talking about micro-service architecture.

Once the decision is made to upgrade the entire application system to a micro-service architecture, the business systems, infrastructure, and operation and maintenance system all need to be upgraded in an organized and planned way. An awkward reality is that by the time a business reaches the point where micro-services make sense, it is usually growing very quickly. That growth puts great pressure on the whole technical team, because you constantly have to choose between a simple solution that supports the business quickly and a more appropriate long-term plan. Most of these choices are matters of technical detail, and the judgment of "how far to go" largely rests with individual engineers.

Ensuring that both the application system and the organizational structure make the leap to the micro-service era quickly and in good order is a great test of team capability and architecture management. Scoring 80 out of 100 is already excellent, because the process follows its own objective laws!

The author has personally experienced the whole journey of a fast-growing Internet company from a monolithic application to a micro-service architecture based on the Spring Cloud technology stack. This article mainly discusses, from a technical point of view, how to split a system into micro-services using Spring Cloud, along with some of the author's own thoughts from that process. My knowledge is limited, so please forgive any shortcomings!

Overview of system architecture evolution

In the company's start-up period, the main problem is how to turn an idea into working software, and the architecture of the whole system is not complicated. To iterate quickly, the system consists of "App + back-end service", and the back-end application is split only into JAR packages at the project level. The software system architecture is as follows:

At this stage the system's functionality is also relatively simple: only basic user, order and payment functions. Because the business processes are not complex, these functions are essentially coupled together. As the App became popular (the author's company happened to be in an Internet hot spot), App downloads soared in 2017, as did online registrations.

With the rapid growth of traffic, the load on the back-end service became very heavy. To withstand it, we could only add machines and scale out the back-end service nodes horizontally. The deployment architecture at this time is as follows:

In this way the system withstood the first wave of load, but incidents still occurred from time to time. In particular, a performance problem in a single API interface could make the whole service unavailable, because all the interfaces run in the same JVM process. Although multiple nodes were deployed, they all shared one database and one cache system, so the whole service could still go down.

On the other hand, as the business developed rapidly, functions that used to be relatively simple became more complex. Besides the functions users can see, there are many they cannot. It is like Baidu search: users see only a search box, but hundreds or thousands of services may sit behind it. Examples here include growth-related functions such as red packets and share-to-acquire-customers campaigns, as well as monetization functions related to advertising and recommendation.

In addition, growth in traffic and business also means rapid growth in team size. If everyone keeps developing their own business functions in the same service code base, it is hard to imagine what it would be like for a hundred or so people to pile features into one project. So how to draw business boundaries and organize teams sensibly also becomes an urgent matter!

To solve the above problems and adapt to the growth of the business and the team, the architecture team decided to split the system into micro-services. Implementing a micro-service architecture requires not only a reasonable demarcation of business module boundaries but also a complete set of technical solutions.

When choosing a technical solution, there are many frameworks for service splitting and governance, such as WebService in the early days and, more recently, various RPC frameworks (such as Dubbo, Thrift and gRPC). Spring Cloud is a complete set of micro-service solutions built on Spring Boot. Because its technology stack is relatively new and its component support is very comprehensive, Spring Cloud became our first choice.

After a series of refactorings and expansions, the system architecture finally took the form of a micro-service system centered on the App, with the following structure:

At this point the system had preliminarily completed its split into micro-services based on Spring Cloud. Core functions such as payment, order, user and advertising were separated into independent micro-services, and the databases behind each micro-service were also split along the service boundaries.

After the split, what used to be in-process code calls between pieces of functional logic become network calls between services, and each micro-service has to expose interfaces for the functions it owns. How a service is discovered and invoked by other services therefore becomes a key part of the whole micro-service system. Readers who have used the Dubbo framework will know that service registration and discovery in Dubbo relies on ZooKeeper; in Spring Cloud we do it through Consul. In addition, the Spring Cloud-based architecture provides a configuration center (ConfigServer) to help each micro-service manage its configuration files, while the original API service, as its functions were split out, gradually evolved into a front-end gateway service.

So far we have split the system into micro-services based on Spring Cloud, and in this architecture we have mentioned several key components: Consul, ConfigServer and the gateway service. How do these key components support such a large service system?

Key components of Spring Cloud

Consul

Consul is an open source registry service developed in the Go language. It has built-in service discovery and registration, an implementation of a distributed consensus protocol, health checks, key/value storage and multi-data-center support. Eureka can also be chosen as the registry in the Spring Cloud framework. The main reason for choosing Consul here is its support for heterogeneous services, such as gRPC services.

In fact, in the later evolution of the architecture, as some service modules were split further into sub-systems of their own, gRPC was used as the invocation mode between the services of a sub-system. For example, as the payment module kept expanding, the payment service was itself split into micro-services. Those payment micro-services call each other over gRPC, while service registration and discovery still rely on the same Consul cluster.

At this time, the system architecture evolves as follows:

Once a module service in the original micro-service architecture reaches a certain scale or complexity, it tends to grow into an independent system. This makes the call chain across the micro-services very long, while from Consul's point of view all services remain flat.

As the number of micro-services grows, Consul, as the core service component, occupies a critical position in the whole system. If Consul goes down, all services stop serving. So what kind of service is Consul, and how should its disaster recovery be designed?

To ensure high availability, Consul should without doubt run as a cluster in production (refer to online materials for installing and configuring a Consul cluster). A Consul cluster has two roles: Server and Client. These roles have nothing to do with the application services registered in the cluster; they are a division of roles at the Consul level. The Server nodes maintain the state of the entire Consul cluster. Just as with ZooKeeper as the registry in Dubbo, the Server nodes in a Consul cluster hold an election (using the gossip protocol and the Raft consensus algorithm, which are not covered in detail here and can be discussed in a later article) to choose a Leader node, which handles all queries and transactions and replicates the state to the other nodes.

The Client role, on the other hand, is relatively stateless and simply forwards RPC requests to Server nodes. Client nodes exist to share the load of the Server nodes and act as a buffer. The number of Server nodes should not be too large, because the more Server nodes there are, the slower consensus is reached and the higher the cost of synchronization between nodes. It is generally recommended to run 3 to 5 Server nodes, while there is no limit on the number of Client nodes; thousands or tens of thousands can be deployed as needed. That is only a general strategy, though. In a real production environment, most applications only need 3 or 5 Server nodes. The Consul cluster in production at the author's company has 5 Server nodes and no additional Client nodes.

Another concept in the Consul cluster is the Agent. Every Server or Client is in fact a Consul agent: a daemon running on each member of the Consul cluster. Its main job is to serve the DNS and HTTP interfaces, run health checks and keep service information synchronized. The nodes of a Consul cluster (Server or Client) are started through the consul agent command. For example:
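The trade-off behind the "3 or 5 Server nodes" advice can be made concrete with a little arithmetic. Raft needs a majority of servers to agree before a change is committed, so each added server increases the number of acknowledgements required. The following is only a minimal sketch of that majority arithmetic; the class and method names are made up for illustration and are not part of Consul.

```java
// Illustrative sketch only: majority (quorum) arithmetic for a Raft-style
// server group. RaftQuorum is a hypothetical helper, not a Consul API.
final class RaftQuorum {

    /** Smallest number of servers that forms a majority of the group. */
    static int quorum(int servers) {
        if (servers < 1) {
            throw new IllegalArgumentException("servers must be >= 1");
        }
        return servers / 2 + 1;
    }

    /** Number of servers that may fail while a majority is still available. */
    static int faultTolerance(int servers) {
        return servers - quorum(servers);
    }

    public static void main(String[] args) {
        // Print the quorum size and tolerated failures for small clusters.
        for (int n = 1; n <= 7; n++) {
            System.out.printf("servers=%d quorum=%d tolerated failures=%d%n",
                    n, quorum(n), faultTolerance(n));
        }
    }
}
```

Under this arithmetic an even-sized group tolerates no more failures than the odd-sized group one smaller, which is one reason odd sizes such as 3 and 5 are the usual choice.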

consul agent -server -bootstrap -syslog \
    -ui \
    -data-dir=/opt/consul/data \
    -dns-port=53 -recursor=10.211.55.3 -config-dir=/opt/consul/conf \
    -pid-file=/opt/consul/run/consul.pid \
    -client=10.211.55.4 \
    -bind=10.211.55.4 \
    -node=consul-server01 \
    -disable-host-node-id &

Taking the actual production environment as an example, the deployment structure of the Consul cluster is as follows:

In our production setup no Client nodes are deployed; a cluster of five Consul Server nodes serves registration and discovery for all applications in the production cluster. There is a detail worth knowing here. The five Consul Server nodes have different IP addresses, and an application service should connect to the IP of the Leader node when it registers with or queries the Consul cluster. The question is: if the Leader node dies, how do the application service nodes connect to the new Leader elected through Raft? Surely we cannot switch IPs by hand?

Switching IPs by hand is obviously not reliable. In production practice, each node of the Consul cluster also runs DNS on its Consul agent (see the -dns-port=53 startup parameter above). When an application service connects to the Consul cluster, it goes through the DNS address rather than a fixed node IP, and DNS resolves the address to the IP of the Leader node. If the Leader dies, the newly elected Leader notifies the DNS service of its own IP and DNS updates the mapping, all of which is transparent to the application services.

As the analysis above shows, Consul achieves stability and high availability through its cluster design, Raft leader election, the gossip protocol and related mechanisms. If a higher level of disaster recovery is needed, two Consul data centers can be built to form a geographically redundant Consul service, but the cost is higher, so it depends on whether it is really needed.

ConfigServer (configuration center)

The configuration center is a service that manages the configuration of micro-service applications, such as database settings and the addresses of external interfaces. ConfigServer is an independent service component in Spring Cloud. Like Consul, it is a key component of the whole micro-service architecture: every micro-service application calls it to obtain the configuration it needs.

As the number of micro-service applications grows, the access load on the ConfigServer nodes gradually increases, and the configurations of the various micro-services keep multiplying. How to manage these configuration files and their update strategy (so that careless changes to production configuration do not cause online failures), and how to build a highly available ConfigServer cluster, are also very important for the stability of the micro-service system.

In production practice, key components such as Consul and ConfigServer are clustered independently and deployed on physical machines rather than in containers. As described in the previous section, we built 5 Consul Server nodes independently. ConfigServer, by contrast, is essentially an HTTP service for reading configuration files; it involves no leader election or consistency synchronization, so its high availability is built in the traditional way. The specific structure is as follows:

We can manage application configuration files in Git. Normally ConfigServer pulls configuration directly from the Git repository over the network and serves it to applications, so that as soon as the Git repository is updated the configuration center picks up the change. The weakness of this approach is that Git is a code management tool for internal development; if live online services read from it directly, it is easy to bring the Git repository down. In actual operations, therefore, we version the configuration files in Git and distinguish the online branch (master) from feature development branches (feature). After a merge request (MR) is completed, the configuration on the new master branch is synchronized (manually, triggered from the release platform) to a local path on the host of each ConfigServer node. Each ConfigServer node then reads configuration files from its local directory instead of fetching them over the network every time.

As the number of micro-services grows, so does the number of configuration files in the Git repository. To keep configuration manageable, the configuration of different application types needs to be organized in a consistent structure. In the early days applications were not categorized at all, so the configuration files of hundreds of micro-services sat in a single repository directory. This raised the cost of managing them and also hurt ConfigServer's performance, because ConfigServer would load configuration that a given micro-service did not need.

The later practice is to organize configuration hierarchically. The company-wide global project configuration is abstracted to the top level and loaded by ConfigServer by default. All other micro-services are grouped by application type (by Git project space): applications of the same type are placed in one group, and a separate Git repository named Config is created under that group to hold the configuration files of the group's micro-services. The hierarchy is as follows:

With this layout, an application loads configuration in the order "local configuration -> common configuration -> group common configuration -> project configuration", with later sources overriding earlier ones. For example, suppose a service sets parameter A in its project's default configuration file ("bootstrap.yml/application.yml") and parameter B in its local project configuration "application-production.yml". In ConfigServer, the configuration file "application.yml/application-production.yml" in the common repository defines parameters C and D. The group named "pay" has its own default configuration file "application.yml/application-production.yml" with parameters E and F, and the specific project pay-api has a configuration file "pay-api-production.yml" that overrides the values of parameters C and D from the common repository. If the application then starts with "spring.profiles.active=production", the configuration parameters it obtains (through the link http://{spring.cloud.config.uri}/pay-api-production.yml) are A, B, C, D, E and F, where C and D take the overriding values from pay-api-production.yml.

The ConfigServer service itself also needs to match configuration to application types along these lines. Continuing the example above, suppose there is also a configuration repository for finance. Services in the pay group do not need the configuration files in the finance space when they access the configuration center, so ConfigServer does not have to load them. This requires some settings in the ConfigServer service's own configuration, as follows:
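The override order in this example can be illustrated with a small sketch. It is not how Spring Cloud Config resolves property sources internally; it only models the four layers as plain maps merged from lowest to highest priority. The class name, the method names and the placeholder values ("local", "common" and so on) are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: models the layered configuration from the text
// as maps merged in order, where a later layer overrides an earlier one.
final class ConfigPrecedence {

    /** Merges the given layers in order; later layers win on key conflicts. */
    static Map<String, String> merge(List<Map<String, String>> layersLowToHigh) {
        Map<String, String> merged = new TreeMap<>();
        for (Map<String, String> layer : layersLowToHigh) {
            merged.putAll(layer);
        }
        return merged;
    }

    /** Rebuilds the pay-api example; the values are made-up placeholders. */
    static Map<String, String> payApiExample() {
        Map<String, String> local = Map.of("A", "local", "B", "local");
        Map<String, String> common = Map.of("C", "common", "D", "common");
        Map<String, String> payGroup = Map.of("E", "pay-group", "F", "pay-group");
        Map<String, String> payApi = Map.of("C", "pay-api", "D", "pay-api");
        return merge(List.of(local, common, payGroup, payApi));
    }

    public static void main(String[] args) {
        // Each line shows a parameter and the layer its final value came from.
        payApiExample().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

In this model the service ends up with all six parameters, and C and D carry the values from the pay-api layer because that layer is merged last.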

spring:
  application:
    name: @project.artifactId@
    version: @project.version@
    build: @buildNumber@
    branch: @scmBranch@
  cloud:
    inetutils:
      - docker0
    config:
      server:
        health.enabled: false
        git:
          uri: /opt/repos/config
          searchPaths: 'common,{application}'
          cloneOnStart: true
          repos:
            pay:
              pattern: pay-*
              cloneOnStart: true
              uri: /opt/repos/example/config
              searchPaths: 'common,{application}'
            finance:
              pattern: finance-*
              cloneOnStart: true
              uri: /opt/repos/finance/config
              searchPaths: 'common,{application}'

This is achieved by setting the configuration search paths in the local application.yml of the ConfigServer service itself.
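To make the effect of the pattern entries easier to picture, here is a rough sketch of how an application name such as pay-api could be routed to a repository by a wildcard pattern, with a fallback to the default repository. It is a simplified illustration, not ConfigServer's real matching code (which also takes the profile into account); the RepoSelector class and its methods are hypothetical, and the paths are the ones from the configuration above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative sketch only: picks a configuration repository for an
// application name using '*' wildcard patterns, first match wins.
final class RepoSelector {
    private final String defaultUri;
    private final Map<String, String> uriByPattern = new LinkedHashMap<>();

    RepoSelector(String defaultUri) {
        this.defaultUri = defaultUri;
    }

    /** Registers a repository for application names matching the pattern. */
    RepoSelector addRepo(String pattern, String uri) {
        uriByPattern.put(pattern, uri);
        return this;
    }

    /** Returns true if the name matches the pattern, where '*' means any text. */
    static boolean matches(String pattern, String application) {
        StringBuilder regex = new StringBuilder();
        String[] parts = pattern.split("\\*", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            if (!parts[i].isEmpty()) {
                regex.append(Pattern.quote(parts[i]));
            }
        }
        return Pattern.matches(regex.toString(), application);
    }

    /** Returns the first matching repository, or the default repository. */
    String uriFor(String application) {
        for (Map.Entry<String, String> repo : uriByPattern.entrySet()) {
            if (matches(repo.getKey(), application)) {
                return repo.getValue();
            }
        }
        return defaultUri;
    }

    public static void main(String[] args) {
        RepoSelector selector = new RepoSelector("/opt/repos/config")
                .addRepo("pay-*", "/opt/repos/example/config")
                .addRepo("finance-*", "/opt/repos/finance/config");
        for (String app : new String[] {"pay-api", "finance-report", "user-api"}) {
            System.out.println(app + " -> " + selector.uriFor(app));
        }
    }
}
```

The point of the sketch is that a pay service never touches the finance repository, so that repository's files do not need to be loaded on its behalf.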

Gateway service, circuit breaking and monitoring

The previous two sections introduced, in some detail, two key service components of the Spring Cloud-based architecture. Many key problems remain to be solved in a micro-service architecture, however. For example, if an application service registers multiple nodes in Consul, how does the caller achieve load balancing?

In a traditional architecture this is handled by Nginx. The earlier introduction to Consul covered service registration and discovery, leader election and so on, but not how load balancing of service calls is achieved. Does this mean that in a Spring Cloud-based micro-service architecture each application service is effectively served by a single node even when multiple nodes are deployed? In fact, once the service consumer enables Feign with the @EnableFeignClients annotation and invokes a service through the @FeignClient("user") annotation, load balancing is already in place. Why? Because Ribbon is enabled by default. Ribbon is a component that implements client-side load balancing: it pulls the list of service nodes from Consul and distributes the client's calls across the server nodes in round-robin fashion. All of this happens in code inside the consumer's own process. Because this load balancing is hosted in the consumer application, it is somewhat intrusive to the consumer's code, which is one of the reasons the concept of Service Mesh emerged later. We will not go into it here and will discuss it another time.

Another key problem is implementing circuit breaking, rate limiting and similar mechanisms. Spring Cloud supports them by integrating Netflix's Hystrix framework which, like load balancing, runs on the consumer side. For reasons of space we will not expand on it here and will cover it in a later article.

In addition, the Zuul component implements the API gateway service, providing routing and filtering functions. Other auxiliary components include Sleuth for distributed tracing, Bus for the message bus, and Dashboard for monitoring dashboards. The Spring Cloud open source community is very active and new components are constantly being integrated, so interested readers can keep following it!
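The idea of client-side round-robin selection can be sketched in a few lines. This is not Ribbon's implementation; it is a minimal illustration that assumes the consumer already holds a list of instance addresses (in the real system that list would come from Consul). The class name and the addresses are made up for the example.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: a round-robin chooser that runs inside the
// consumer process, in the spirit of client-side load balancing.
final class RoundRobinBalancer {
    private final AtomicInteger counter = new AtomicInteger();

    /** Picks the next instance in turn from the currently known instances. */
    String choose(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalStateException("no instances available");
        }
        // floorMod keeps the index non-negative even if the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), instances.size());
        return instances.get(index);
    }

    public static void main(String[] args) {
        // Hypothetical addresses standing in for nodes of the "user" service.
        List<String> userService = List.of("10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080");
        RoundRobinBalancer balancer = new RoundRobinBalancer();
        for (int i = 0; i < 6; i++) {
            System.out.println("call " + i + " -> " + balancer.choose(userService));
        }
    }
}
```

Because the chooser lives in the caller's process, every consumer carries this logic with it, which is the code intrusiveness the paragraph above refers to.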
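For readers unfamiliar with the circuit-breaker pattern, the following is a deliberately small sketch of the idea: after a number of consecutive failures the caller stops invoking the remote service for a while and returns a fallback instead. It is not Hystrix and does not use its API; the class name, thresholds and fallback values are assumptions for illustration, and a production breaker would not hold a lock during the remote call.

```java
import java.util.function.Supplier;

// Illustrative sketch only: a minimal consumer-side circuit breaker.
final class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private int consecutiveFailures;
    private long openedAt = -1; // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    /** Runs the remote call, or the fallback while the circuit is open. */
    synchronized <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        long now = System.currentTimeMillis();
        if (openedAt >= 0 && now - openedAt < openMillis) {
            return fallback.get(); // short-circuit: do not touch the remote service
        }
        try {
            T result = remoteCall.get(); // closed, or a trial call after the open period
            consecutiveFailures = 0;
            openedAt = -1;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = now; // open (or re-open) the circuit
            }
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(2, 5_000);
        Supplier<String> failing = () -> {
            throw new RuntimeException("user service unavailable");
        };
        for (int i = 0; i < 4; i++) {
            System.out.println(breaker.call(failing, () -> "default user"));
        }
    }
}
```

As with load balancing, this logic sits in the consumer's process, so every calling service has to include it.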

Operation and maintenance of micro-services

Under a micro-service architecture, as the number of services grows massively, the workload of online deployment and maintenance becomes very large, and the original operation and maintenance model can hardly keep up. The operations team needs to adopt a DevOps strategy: build an automated release platform, connect the product, development, testing and operations processes, and pay attention to R&D efficiency.

We also need to push forward containerization (Docker/Docker Swarm/Kubernetes) so that service nodes can be scaled quickly, which is an inevitable requirement of a micro-service system.

The problem of micro-service proliferation

Another problem that deserves attention is how to manage and control micro-services from an engineering standpoint once the architecture is in place. Splitting micro-services blindly is not reasonable: it makes the service call chain impossible to fathom, makes problems hard to troubleshoot, and wastes online resources.

Refactoring problem

In the move from a monolithic architecture to micro-services, refactoring is a very good approach and an important means of keeping services standardized and the application architecture of the business system sound. However, the stage of rapid business growth usually also means rapid growth in team size, and giving the new teams meaningful work in a short time is a real test of management. If many people are hired and they end up competing excessively with one another, refactoring can become somewhat utilitarian: incomplete, and steering around the hard and important parts. The result is a micro-service architecture that looks very high-end on the surface while the business system underneath is actually in poor shape.

In addition, refactoring is an important decision to be made at a certain stage. It means not only re-splitting services but also reshaping the business system, so we must weigh the structure of the application software against the cost of implementing it. Don't do it blindly!

Postscript

A micro-service architecture based on Spring Cloud supports the whole system by integrating various open source components, but for load balancing, circuit breaking and flow control it has to intrude into the business process of the service consumer. Many people consider this undesirable, which is why the concept of Service Mesh emerged. The basic idea of Service Mesh is to decouple these concerns from the business process by deploying an independent proxy on each host. This proxy is responsible not only for service discovery and load balancing (a separate registry component such as Consul is no longer needed) but also for dynamic routing, fault tolerance, rate limiting, monitoring metrics and security logging.

At present, in terms of concrete components, the main effort is an open source Service Mesh project called Istio, supported and promoted by Google, IBM and other large vendors. More on Service Mesh will be shared in later articles. That is all for this article; follow along for more practical content later!
