Shulou (Shulou.com) 06/03 Report, from SLTechnology News & Howtos. Updated 2025-03-28.
This article explains how to understand high availability and high performance design under a microservice architecture. The explanation is kept simple and clear, so let's study it together.
Three Dimensions of High Availability and Their Interrelationships
The high availability of a business system actually covers three aspects: high reliability, high performance, and high scalability. Moreover, the three depend on and influence one another.
(Figure: the relationship among high reliability, high performance, and high scalability.)
For high reliability, a traditional HA architecture with redundant design meets the requirement, but that alone does not give the system high performance or high scalability. Conversely, when a system is designed for high scalability, redundancy and high reliability are usually taken into account at the same time, as in the cluster technology we often talk about.
For high performance and high scalability: high scalability is a necessary but not sufficient condition for high performance. High performance is not simply the ability to scale out; the software architecture of the business system itself, and every aspect of its code, must also meet high-performance design requirements.
High reliability and high performance, by contrast, constrain each other: sustaining high performance often poses a severe challenge to the system's reliability. This is exactly why we see measures such as rate limiting, circuit breaking, and SLA-based service degradation, all of which exist to control large concurrent access and invocations under abnormal conditions.
High availability of database
When I discussed microservice architecture earlier, I mentioned that under it the traditional monolithic application should be split: not only the application-layer components, but also the database itself.
If a traditional monolithic application is planned as 10 microservices, it may be split vertically into 10 separate databases. This reduces the load each individual database faces and improves the overall processing capacity of the database tier.
At the same time, although the split introduces cross-database queries and distributed transactions, many cross-database operations and data-combination computations are completed outside the database; the database increasingly just provides a simple CRUD interface, which is itself a key to improving database performance.
If you use a MySQL database, then to meet high reliability you can use a dual-master (active-active) architecture: two primary nodes are both alive, but only one provides the database interface while the other synchronizes the database log in real time as a standby. When the primary node fails, the standby is automatically promoted to primary.
In a simple dual-master architecture, an agent is installed on each of the two nodes, replication is done through the binlog, and at the upper layer something like HAProxy plus Keepalived provides a floating VIP and heartbeat monitoring.
We can see that the dual-master architecture mostly provides high reliability rather than high performance.
To meet high performance, a read-write-splitting cluster is often used: one master node handles reads and writes, while multiple slave nodes handle reads only, still synchronizing from the master through the binlog. When a data access request comes in, the front-end proxy can analyze whether it is a CUD (write) request or an R (read) request and route it accordingly.
For example, when we add an order, we need to refresh the order list immediately after the insert succeeds. That second request is a read, but it is tightly bound to the preceding write, so it is not suitable to read from a slave node that may lag. In that case you can explicitly specify, when issuing the SQL call, that the data should still come from the master node.
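To make the routing concrete, here is a minimal Python sketch of such a proxy's decision logic. It is an illustration only: `ReadWriteRouter`, the node names, and the SQL-verb check are hypothetical stand-ins, not the API of any real proxy product.

```python
import random

class ReadWriteRouter:
    """Send CUD statements to the master, balance reads across slaves,
    and allow pinning a read to the master (read-your-writes)."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves

    def route(self, sql, force_master=False):
        verb = sql.strip().split()[0].upper()
        if force_master or verb in ("INSERT", "UPDATE", "DELETE"):
            return self.master               # writes and pinned reads hit the master
        return random.choice(self.slaves)    # plain reads are load-balanced

router = ReadWriteRouter("master-db", ["slave-1", "slave-2"])
router.route("INSERT INTO orders VALUES (1)")            # -> "master-db"
router.route("SELECT * FROM orders", force_master=True)  # -> "master-db"
```

The `force_master` flag is the "explicitly read from the master" escape hatch described above; everything else falls through to a random slave.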
Of course, most of the time a combination of both is needed, providing both sufficient reliability and sufficient performance. So when building a MySQL cluster, you need not only the dual-master setup but also multiple slave nodes.
Under such a logical deployment architecture, high reliability and high performance can be met at the same time. But as the deployment also shows, the standby master and its slaves sit in hot standby and cannot actually serve traffic.
Is it possible to attach all the slave nodes to a single master instead?
If designed that way, then when the primary master fails, all the slaves must automatically drift to the new master. The overall implementation is more complex, and the reliability is worse than the architecture above.
Thoughts on Database Performance Scaling
First, let's look at some potential problems with the architecture above:
First, CUD operations are still served by a single node. For read operations, which account for most scenarios, the dual-master plus read-write-splitting cluster scales well; but if CUD operations are frequent, performance problems can still arise.
Second, database performance problems come at two levels: performance under a large number of concurrent requests, which cluster load balancing can address, and the performance of a single request, such as a fuzzy query against a very large table, which load balancing alone cannot solve.
In other words, with the design above there may still be obvious performance problems for high-concurrency CUD operations and for join or fuzzy queries on large tables.
How to solve this problem?
Put simply: turn writes from synchronous into asynchronous through message middleware, shaving the traffic peak at the front end; for queries, cache the content or build a secondary index to improve query efficiency.
Queries over structured data can be served from a cache such as Redis or Memcached, while unstructured data such as messages and logs can be indexed with Solr or Elasticsearch to build a secondary index and provide full-text retrieval.
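The Redis/Memcached usage described here is typically the cache-aside pattern: read from the cache first, fall back to the database on a miss, populate the cache, and invalidate on writes. A minimal Python sketch, with plain dicts standing in for both the cache and the database (all names are illustrative):

```python
class CacheAside:
    """Cache-aside read path: check the cache first, fall back to the
    database on a miss, and populate the cache for later reads."""

    def __init__(self, db):
        self.db = db          # stand-in for MySQL
        self.cache = {}       # stand-in for Redis/Memcached
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = self.db[key]          # the slow path
        self.cache[key] = value       # warm the cache for the next read
        return value

    def invalidate(self, key):
        # On CUD operations, drop the cached copy so the next read reloads it.
        self.cache.pop(key, None)

store = CacheAside({"order:1": "pending"})
store.get("order:1")   # miss: loads from the database and caches it
store.get("order:1")   # hit: served from the cache
```

Invalidation on write (rather than updating the cache in place) is the usual choice because it keeps the write path simple and avoids caching data nobody reads again.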
When facing a large volume of write operations, a single master node often cannot keep up, so message middleware such as RabbitMQ, RocketMQ, or Kafka is used for asynchronous peak shaving. This asynchrony actually works at two levels.
One is making naturally asynchronous interface services asynchronous, such as sending text messages, writing logs, or starting a workflow. The other is making long-running writes asynchronous: first acknowledge that the user's request has been received, then notify the user of the result later.
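The second kind of asynchrony can be sketched with the Python standard library's `queue` standing in for RabbitMQ/RocketMQ/Kafka: the front end acknowledges immediately, and a consumer drains the backlog at the database's own pace. Names and structure are illustrative only.

```python
import queue
import threading

write_queue = queue.Queue()   # stand-in for the message middleware
db = []                       # stand-in for the real datastore

def worker():
    # The consumer drains the queue at the pace the database can sustain.
    while True:
        item = write_queue.get()
        db.append(item)               # the actual slow write happens here
        write_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

def submit_order(order):
    """Enqueue the request and acknowledge receipt at once; the result
    is delivered to the user later (callback, polling, or a message)."""
    write_queue.put(order)
    return "accepted"

for i in range(100):          # a traffic burst is absorbed by the queue
    submit_order({"order_id": i})
write_queue.join()            # wait here only to make the example observable
```

The queue absorbs the burst, so the front end's latency is independent of how slow the write path is; durability, which an in-process queue lacks, is exactly what the real middleware adds.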
For query operations, the concurrent queries mentioned earlier can be load-balanced across a cluster.
But at large data scale, such as fuzzy queries over a table with hundreds of millions of records, a secondary index is mandatory. Even without concurrent queries, a table that size without a secondary index will have very slow query efficiency and response times.
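What Solr or Elasticsearch provide is, at its core, an inverted index: each token maps to the documents containing it, so a keyword lookup becomes a dictionary hit instead of a full table scan. A toy Python sketch of the idea (not how either engine is actually implemented internally):

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def search(index, *terms):
    # Intersect posting lists: documents containing all query terms.
    results = None
    for term in terms:
        postings = index.get(term.lower(), set())
        results = postings if results is None else results & postings
    return results or set()

docs = {1: "payment timeout on order service",
        2: "order created successfully",
        3: "payment succeeded"}
idx = build_inverted_index(docs)
search(idx, "payment", "order")   # -> {1}
```

Real engines add tokenization, relevance scoring, and compressed on-disk postings, but the reason a fuzzy keyword search stays fast at scale is this index structure.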
Enable distributed storage for semi-structured information
Semi-structured information such as logs and interface-call records comes in huge volume. Storing all of it in a structured database demands a great deal of storage space and is hard to scale, especially since the MySQL clustering scheme above uses local disks for storage.
Therefore, history logs should be cleared from the database and migrated to a distributed store such as HDFS or HBase, with secondary index capability then built on top of that distributed storage.
Build DaaS data tier for horizontal scaling
As mentioned earlier, the split into microservices is a vertical split. For example, an asset management system can be divided into 10 microservice modules such as asset creation, asset allocation, asset depreciation, and asset inventory.
Even after that split, the volume of asset data may still be very large. In a large group-wide centralization project, for instance, a single province's asset table can approach hundreds of millions of records, and centralizing all provinces' data in one database is unrealistic. It then becomes necessary to split further horizontally, by province or organizational domain.
After the horizontal split, a DaaS (Data as a Service) layer is built on top to provide unified external access.
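The essential job of such a DaaS layer is shard routing: queries carrying the shard key (here, province) touch one database, and queries without it scatter to all shards and merge the results. A minimal Python sketch with dicts standing in for the per-province databases (all names hypothetical):

```python
class DaaSRouter:
    """Route by province (the shard key) to the right database, and
    scatter-gather when a query carries no shard key."""

    def __init__(self, shards):
        self.shards = shards   # province -> rows (stand-in for per-province DBs)

    def query(self, province=None):
        if province is not None:
            # Shard key present: only one database is touched.
            return list(self.shards.get(province, []))
        # No shard key: fan out to every shard and merge the results.
        merged = []
        for rows in self.shards.values():
            merged.extend(rows)
        return merged

daas = DaaSRouter({"guangdong": ["asset-1"],
                   "zhejiang": ["asset-2", "asset-3"]})
daas.query("guangdong")   # -> ["asset-1"]
```

This is why choosing a shard key that appears in most queries matters: every query without it pays the scatter-gather cost across all provinces.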
Application cluster expansion
Scaling the application cluster is actually simpler than scaling the database layer: the application middleware tier can easily combine cluster management nodes, or independent load-balancing hardware or software, to scale out. Cluster expansion is the key way to improve overall performance, but a few issues in the process deserve further discussion.
The cluster should be completely stateless.
If the cluster is completely stateless, it can be combined with load-balancing devices or software to scale out: F5 and Radware are common hardware choices, HAProxy and Nginx common software ones.
How do we handle session information? A session is by nature stateful, so session data should be stored in a database or a Redis cache rather than on the application node.
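Moving sessions into a shared store can be sketched as follows; a Python dict stands in for Redis, and the TTL mimics Redis key expiry. The class and its API are illustrative, not any particular framework's session interface.

```python
import time
import uuid

class SessionStore:
    """Keep session state off the application node: store it in a shared
    KV store (a dict here standing in for Redis) with a TTL."""

    def __init__(self, ttl_seconds=1800):
        self.kv = {}
        self.ttl = ttl_seconds

    def create(self, user):
        sid = uuid.uuid4().hex
        self.kv[sid] = {"user": user, "expires": time.time() + self.ttl}
        return sid

    def get(self, sid):
        entry = self.kv.get(sid)
        if entry is None or entry["expires"] < time.time():
            self.kv.pop(sid, None)
            return None        # expired or unknown: force a re-login
        return entry["user"]

store = SessionStore()
sid = store.create("alice")    # any cluster node can now resolve this sid
```

Because every node resolves sessions from the same store, the load balancer is free to send each request to any node, which is exactly what statelessness buys.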
Cluster nodes often need to read global variables or configuration files at startup; if these exist only on local disks they are hard to manage centrally. The mainstream approach is therefore a global configuration center that manages configuration uniformly.
If the application uploads and stores files, keeping those files on a node's local disk is itself stateful, so they should be handled through a file service or a distributed object storage service.
Under a microservice architecture, the microservices interact and collaborate through interfaces, and the concrete addresses for interface calls are obtained from the service registry. These addresses can be cached locally, but there must be a real-time update mechanism when they change.
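A common form of that update mechanism is a local cache with a short TTL, so a changed address is picked up on the next refresh. A minimal Python sketch, with a dict standing in for the registry (Nacos, Eureka, Consul, or similar); the `now` parameter exists only to make the example deterministic:

```python
import time

class RegistryCache:
    """Cache service addresses locally and refresh from the registry
    after a short TTL, so a changed address eventually propagates."""

    def __init__(self, registry, ttl=5.0):
        self.registry = registry      # stand-in for the service registry
        self.ttl = ttl
        self.cache = {}               # service -> (address, fetched_at)

    def resolve(self, service, now=None):
        now = time.time() if now is None else now
        entry = self.cache.get(service)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]                  # fresh: no registry round trip
        address = self.registry[service]     # stale or missing: refresh
        self.cache[service] = (address, now)
        return address

registry = {"order-service": "10.0.0.5:8080"}
rc = RegistryCache(registry, ttl=5.0)
rc.resolve("order-service", now=0)            # fetched and cached
registry["order-service"] = "10.0.0.9:8080"   # the address changes
rc.resolve("order-service", now=3)            # within TTL: still the old address
rc.resolve("order-service", now=10)           # TTL expired: picks up the change
```

Real registries usually supplement the TTL with push notifications so changes propagate faster than the polling interval; the TTL then acts as a safety net.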
Layer-4 and Layer-7 Load Balancing
First, the simplest description of layer-4 and layer-7 load balancing:
Layer-4 load balancing works at OSI layer 4, the TCP layer, and balances based on IP + port. This kind of load balancer does not understand application protocols (such as HTTP, FTP, or the MySQL protocol).
Layer-7 load balancing works at the highest OSI layer, the application layer, and can balance based on the HTTP protocol and URL content. Here the load balancer understands the application protocol.
Hardware load balancers such as F5 and Array also support layer-7 balancing, and advanced features such as session persistence can be enabled even in layer-4 mode. The thing to understand is that the essence of layer-4 balancing is forwarding, while the essence of layer-7 balancing is content exchange and proxying.
In other words, when neither state retention nor content-based routing is needed, layer-4 load balancing gives better performance.
After the front end and back end of a microservice architecture are separated, the back-end microservice components expose stateless REST API services, while the front-end components directly facing end users must maintain session state. In this case a two-tier load balancing design works well: layer-7 load balancing for the front end and layer-4 load balancing for the back end.
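The difference between the two tiers can be illustrated with two toy selection functions: the layer-4 one sees only connection identity (IP + port), while the layer-7 one reads the URL and cookies. Both are hypothetical sketches of the decision logic, not how F5 or Nginx are implemented.

```python
def l4_pick(backends, client_ip, client_port):
    """Layer-4 style: choose a backend purely from connection identity
    (IP + port); the balancer never parses the application protocol."""
    return backends[hash((client_ip, client_port)) % len(backends)]

def l7_pick(routes, default, path, cookies=None):
    """Layer-7 style: the proxy reads the HTTP request, routes on URL
    content, and can pin a session via a cookie."""
    if cookies and "backend" in cookies:
        return cookies["backend"]              # session persistence
    for prefix, backend in routes.items():
        if path.startswith(prefix):
            return backend                     # content-based routing
    return default

l7_pick({"/api/": "rest-pool"}, "static-pool", "/api/orders")  # -> "rest-pool"
```

The layer-4 function can run at packet-forwarding speed because it needs nothing beyond the connection tuple; the layer-7 function must terminate and parse the request, which is the cost of its flexibility.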
Front-end cache
Front-end caching mainly divides into HTTP caching and browser caching. HTTP caching applies to HTTP request transmission and is mainly configured in server-side code, while browser caching is mainly set up by front-end developers in JavaScript. Caching is a simple and efficient performance optimization: a good caching strategy shortens the distance a page must travel to fetch resources and reduces latency, and because cached files are reused, it also reduces bandwidth and network load.
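On the server side, HTTP caching typically means emitting `Cache-Control` and `ETag` headers and answering conditional requests with 304. A minimal, framework-independent Python sketch of that revalidation handshake (function names are illustrative):

```python
import hashlib

def make_etag(body):
    # A strong ETag derived from the response body.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(body, if_none_match=None):
    """Conditional GET: if the client's cached ETag still matches,
    answer 304 with an empty body instead of resending the resource."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=300"}
    if if_none_match == etag:
        return 304, headers, b""      # the client's cached copy is valid
    return 200, headers, body

status, headers, _ = respond(b"<html>home</html>")                  # first visit
status2, _, body2 = respond(b"<html>home</html>", headers["ETag"])  # revalidation
```

Within `max-age` the browser serves the resource without any request at all; after that it sends `If-None-Match`, and the 304 path saves the body bytes even when a round trip is needed.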
For details, see: https://www.jianshu.com/p/256d0873c398
Analysis and diagnosis of Software performance problems
For business system performance diagnosis, from a static point of view we can classify problems at three levels:
Operating system and storage level
Middleware level (including database and application-server middleware)
Software level (including database SQL and stored procedures, logic layer, front-end presentation layer, etc.)
Dynamically, when a business function misbehaves, we can also follow an actual application request from the point of invocation through the code and hardware infrastructure, and locate the problem by dividing and conquering along the call path.
For example, if a query function is slow, the first thing to check is whether the SQL statement behind it runs slowly in the back end. If the SQL itself is slow, optimize the SQL. If the SQL is fast but the query is still slow, look at whether it is a front-end performance problem or a cluster problem.
Problems in the software code itself are a class of performance issues that cannot be ignored.
Faced with a performance problem, we often think first of expanding database hardware, such as adding CPU and memory, or growing the cluster. In practice, however, many application performance problems are caused not by hardware but by the software code. I have discussed common code-level performance problems in earlier posts, including these typical ones:
Initializing large objects or database connections inside a loop
Memory leaks caused by resources that are never released
Failing to use caching and similar techniques to moderately improve performance where the scenario calls for it
Long-running transactions that hold resources for too long
Not choosing the best data structure or algorithm for the business scenario
These are all common code-level performance problems, and they can only be found through code review. So if you want comprehensive performance optimization, troubleshooting the software code is essential.
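The first item on the list, initializing heavy objects inside a loop, is the easiest to show. A small Python sketch with a stub `Connection` standing in for an expensive resource such as a real database connection:

```python
class Connection:
    """Stub standing in for an expensive resource (TCP handshake, auth...)."""
    def lookup(self, row):
        return row * 2

def make_connection():
    return Connection()

# Anti-pattern: the heavy object is constructed on every iteration.
def slow(rows):
    results = []
    for row in rows:
        conn = make_connection()      # setup cost paid once per row
        results.append(conn.lookup(row))
    return results

# Fix: hoist the loop-invariant initialization out of the loop (or pool it).
def fast(rows):
    conn = make_connection()          # setup cost paid once
    return [conn.lookup(row) for row in rows]
```

Both functions return the same result, which is precisely why the defect survives functional testing and only shows up under load; a connection pool is the production-grade version of the same hoisting idea.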
The second point is that performance problems can be found through APM performance monitoring tools.
In the traditional mode, when CPU or memory is exhausted, it is often hard to find which application, which process, which business function, or which SQL statement is responsible. In practice we often have to do a great deal of log analysis and problem positioning before finally finding the problem point.
This problem can be well solved by APM.
For example, in a recent project, combining APM with service-chain monitoring let us quickly find which service invocation had a performance problem, or quickly locate which SQL statement did. This greatly speeds up performance analysis and diagnosis.
Resources carry the application; the application itself includes the database, the application middleware container, and the front end; and on top of the application sit the concrete business functions. A core job of APM is therefore to integrate, analyze, and correlate the chain of resources, application, and functions, and to find and solve runtime performance problems through it.
Thank you for reading. That concludes "how to understand high availability and high performance design under microservice architecture"; after studying this article, I believe you have a deeper understanding of the topic.