How to analyze data consistency in micro-service architecture

2025-07-04 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains how to analyze data consistency in a microservice architecture.

In microservices, one logical atomic operation can frequently span multiple microservices. Even a monolithic system may use multiple databases or messaging solutions. With several independent data storage solutions, we risk inconsistent data if one of the distributed process participants fails, such as charging a customer without placing the order, or not notifying the customer that the order succeeded. In this article, I'd like to share some of the techniques I've learned for making data between microservices eventually consistent.

Why is this goal so challenging? As long as data is stored in more than one place (not in a single database), consistency is not solved automatically, and engineers need to take care of it when designing the system. In my opinion, the industry does not yet have a widely known solution for atomically updating data in multiple different data sources, and we probably should not expect one soon.

One attempt to solve this problem in an automated and hassle-free manner is the XA protocol, which implements the two-phase commit (2PC) pattern. But in modern high-scale applications, especially in cloud environments, 2PC does not seem to perform well. To eliminate the disadvantages of 2PC, we have to trade ACID for BASE and handle consistency concerns ourselves in different ways, depending on the requirements.

Saga pattern

The best-known way to handle consistency across multiple microservices is the Saga pattern. You can think of a saga as application-level distributed coordination of multiple transactions. Depending on the use case and requirements, you can optimize your own saga implementation. In contrast, the XA protocol tries to cover all scenarios. The Saga pattern is not new either. It was known and used earlier in ESB and SOA architectures, and it has since carried over successfully to the microservice world. Each atomic business operation that spans multiple services may consist of multiple transactions at the technical level. The key idea of the Saga pattern is to be able to roll back any one of the individual transactions. As we know, a transaction that has already been committed cannot be rolled back out of the box. Instead, the rollback is achieved by invoking a compensation action, that is, by introducing a "cancel" operation.

In addition to canceling, you should also consider making your service idempotent so that you can retry or restart some operations in the event of a failure. Faults should be monitored and dealt with proactively.
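The compensation idea can be illustrated with a minimal in-process sketch. Everything here is hypothetical: `SagaStep`, `run_saga`, and the `charge`/`refund`/`place_order`/`cancel_order` functions are in-memory stand-ins for calls to real payment and order services, not part of any library.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]        # forward local transaction
    compensation: Callable[[], None]  # "cancel" operation that undoes it


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
            completed.append(step)
        except Exception:
            for done in reversed(completed):
                done.compensation()
            return False
    return True


# Hypothetical in-memory stand-ins for a payment service and an order service.
charges = set()
orders = set()


def charge(order_id: str) -> None:
    charges.add(order_id)  # idempotent: charging twice has no extra effect


def refund(order_id: str) -> None:
    charges.discard(order_id)  # idempotent: refunding twice is harmless


def place_order(order_id: str) -> None:
    raise RuntimeError("order service unavailable")  # simulated failure


def cancel_order(order_id: str) -> None:
    orders.discard(order_id)


ok = run_saga([
    SagaStep("charge", lambda: charge("o-1"), lambda: refund("o-1")),
    SagaStep("order", lambda: place_order("o-1"), lambda: cancel_order("o-1")),
])
```

In this run the order step fails, so the charge is compensated and the customer is not billed. The sketch assumes compensations themselves succeed; in a real system they would need retries, which is one reason idempotent operations matter.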

Reconciliation

What if the system responsible for invoking the compensation operation crashes or restarts in the middle of the process? In this case, the user may receive an error message and the compensation logic should be triggered, or, when processing an asynchronous user request, the execution logic should be resumed.

To find crashed transactions and resume the operation or apply compensation, we need to reconcile data from multiple services. Reconciliation is a technique familiar to engineers who have worked in the financial domain. Have you ever wondered how banks make sure your money transfer does not get lost, or how a transfer between two different banks works? The short answer is reconciliation.

In accounting, reconciliation is the process of ensuring that two sets of records (usually the balances of two accounts) are in agreement. Reconciliation is used to ensure that the money leaving an account matches the actual money spent. This is done by making sure the balances match at the end of a particular accounting period. - Jean Scheid, "understanding balance sheet account adjustments", Bright Hub, 8 April 2011

Coming back to microservices, we can use the same principle to reconcile data from multiple services on some action trigger. The action can be triggered on a schedule or by a monitoring system when a failure is detected. The simplest approach is a record-by-record comparison. The process can be optimized by comparing aggregated values. In this case, one of the systems is the source of truth for each record.
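As a rough sketch of both approaches, the following compares two hypothetical data sets keyed by order ID: `payments` (treated here as the source of truth) and `order_totals`. The function names and the data are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Tuple


def totals_match(source: Dict[str, int], replica: Dict[str, int]) -> bool:
    """Cheap first pass: compare aggregates (record count and sum of amounts)."""
    return (len(source), sum(source.values())) == (len(replica), sum(replica.values()))


def reconcile(
    source: Dict[str, int], replica: Dict[str, int]
) -> List[Tuple[str, Optional[int], Optional[int]]]:
    """Record-by-record pass: list (key, source_value, replica_value) mismatches."""
    diffs = []
    for key in sorted(source.keys() | replica.keys()):
        if source.get(key) != replica.get(key):
            diffs.append((key, source.get(key), replica.get(key)))
    return diffs


# Hypothetical data: amounts charged by a payment service vs. order totals.
payments = {"o-1": 100, "o-2": 250, "o-3": 75}
order_totals = {"o-1": 100, "o-2": 200}

needs_detail = not totals_match(payments, order_totals)
diffs = reconcile(payments, order_totals) if needs_detail else []
```

Matching aggregates do not strictly prove that every record matches (two errors can cancel out), so the aggregate check is best seen as an optimization to decide when the detailed comparison is worth running.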

Event log

Imagine a multi-step transaction. How do we determine during reconciliation which transactions may have failed, and at which step? One solution is to check the status of each transaction. In some cases, this is not possible (imagine a stateless mail service that sends e-mail or produces other kinds of messages). In other cases, you may want immediate visibility into the transaction state, especially in complex scenarios with many steps, for example a multi-step order that books flights, hotels, and connecting flights.

[Figure: Complex distributed process]

In these cases, an event log can help. Logging is a simple but powerful technique. Many distributed systems rely on logs. Write-ahead logging is how databases implement transactional behavior internally or maintain consistency between replicas. The same technique can be applied to microservice design. Before making an actual data change, the service writes a log entry about its intent to make the change. In practice, the event log can be a table or a collection in a database owned by the coordinating service.

Event logs can be used not only to resume transaction processing, but also to provide visibility to system users, customers, or the support team. However, in simple scenarios, a service log may be redundant, and status endpoints or status fields are sufficient.
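A minimal sketch of such a log, assuming it lives in a table owned by the coordinating service. SQLite stands in for that database here, and the table name `event_log`, the statuses, and the helper functions are all hypothetical.

```python
import sqlite3
from typing import Callable, List, Tuple


def open_log() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE event_log ("
        " tx_id TEXT, step TEXT, status TEXT,"
        " PRIMARY KEY (tx_id, step))"
    )
    return conn


def run_step(conn: sqlite3.Connection, tx_id: str, step: str,
             action: Callable[[], None]) -> None:
    # Record the intention first, so a crash during `action` leaves a trace.
    conn.execute("INSERT INTO event_log VALUES (?, ?, 'PENDING')", (tx_id, step))
    conn.commit()
    action()
    conn.execute(
        "UPDATE event_log SET status = 'DONE' WHERE tx_id = ? AND step = ?",
        (tx_id, step),
    )
    conn.commit()


def find_incomplete(conn: sqlite3.Connection) -> List[Tuple[str, str]]:
    """Steps that were started but never confirmed: candidates to resume or compensate."""
    rows = conn.execute(
        "SELECT tx_id, step FROM event_log WHERE status = 'PENDING' ORDER BY tx_id, step"
    )
    return rows.fetchall()


def crash() -> None:
    raise RuntimeError("simulated crash while booking the hotel")


log = open_log()
run_step(log, "tx-1", "book_flight", lambda: None)
try:
    run_step(log, "tx-1", "book_hotel", crash)
except RuntimeError:
    pass

incomplete = find_incomplete(log)
```

After the simulated crash, the log shows that the hotel step of `tx-1` was started but not finished. Whether to retry it or compensate the flight booking is a business decision that the log alone does not make.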

Orchestration vs. choreography

So far, you might think that sagas are only part of orchestration scenarios. But sagas can be used in choreography as well, where each microservice knows only a part of the process. A saga includes the knowledge of handling both the positive and the negative flow of a distributed transaction. In choreography, each participant in the distributed transaction has this kind of knowledge.
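To make the choreography variant concrete, here is a small in-memory sketch. The `EventBus` class, the event names, and the credit limit of 100 are all hypothetical; a real system would use a message broker and separate processes. The point is that no central coordinator exists: the order service and the payment service each react only to the events they care about.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List


class EventBus:
    """Tiny synchronous stand-in for a message broker."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        for handler in list(self._handlers[event_type]):
            handler(payload)


bus = EventBus()
order_status: Dict[str, str] = {}  # state owned by the order service


# Order service: knows how to start an order and how to confirm or cancel it.
def place_order(order_id: str, amount: int) -> None:
    order_status[order_id] = "PENDING"
    bus.publish("OrderPlaced", {"order_id": order_id, "amount": amount})


def on_payment_completed(event: dict) -> None:
    order_status[event["order_id"]] = "CONFIRMED"  # positive flow


def on_payment_failed(event: dict) -> None:
    order_status[event["order_id"]] = "CANCELLED"  # negative (compensating) flow


# Payment service: knows only how to react to a placed order.
def on_order_placed(event: dict) -> None:
    if event["amount"] > 100:  # hypothetical credit limit
        bus.publish("PaymentFailed", event)
    else:
        bus.publish("PaymentCompleted", event)


bus.subscribe("OrderPlaced", on_order_placed)
bus.subscribe("PaymentCompleted", on_payment_completed)
bus.subscribe("PaymentFailed", on_payment_failed)

place_order("o-1", 50)
place_order("o-2", 500)
```

Each handler encodes its own part of the positive and negative flow, which is the knowledge the paragraph above says every participant must carry.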

Single write with events

The consistency solutions described so far are not easy; they are genuinely complex. But there is a simpler way: modify a single data source at a time. Instead of changing the state of the service and emitting the event in one process, we can separate these two steps.

Change first

In the main business operation, we modify the state of our own service, while a separate process reliably captures the change and produces the event. This technique is known as change data capture (CDC). Technologies that implement this approach include Kafka Connect and Debezium.

[Figure: Change data capture using Debezium and Kafka Connect]

However, sometimes no specific framework is required. Some databases offer a friendly way to tail their operation log, such as the MongoDB oplog. If the database has no such functionality, changes can be polled by timestamp, or, for immutable records, queried by the last processed ID. The key to avoiding inconsistency is to make the data change notification a separate process. In this case, the database record is the single source of truth. A change is captured only if it actually happened in the first place.
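Polling by last processed ID might look like the sketch below. It uses an in-memory SQLite table as a stand-in for the service's database; the `orders` table, `place_order`, and `ChangePoller` are hypothetical names. It assumes rows are insert-only (immutable) and that IDs increase monotonically.

```python
import sqlite3
from typing import List, Tuple

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY AUTOINCREMENT, item TEXT)")


def place_order(item: str) -> None:
    """Main business operation: it only writes to the database."""
    conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
    conn.commit()


class ChangePoller:
    """Separate process (simulated here) that turns new rows into change events."""

    def __init__(self) -> None:
        self.last_seen_id = 0  # a real poller would persist this checkpoint

    def poll(self) -> List[Tuple[int, str]]:
        rows = conn.execute(
            "SELECT id, item FROM orders WHERE id > ? ORDER BY id",
            (self.last_seen_id,),
        ).fetchall()
        if rows:
            self.last_seen_id = rows[-1][0]
        return rows


poller = ChangePoller()
place_order("book")
place_order("pen")
first = poller.poll()   # the two rows inserted so far
place_order("lamp")
second = poller.poll()  # only the new row
third = poller.poll()   # nothing new
```

With concurrent writers, rows may become visible out of ID order in some databases, so a production poller would need extra care there. Log-based tools avoid that problem by reading the database's own change log.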

[Figure: Change data capture without specific tools]

The disadvantage of change data capture is the separation of business logic. The change capture process will most likely live in your codebase separately from the change logic itself, which is inconvenient. The best-known application of change data capture is domain-independent change replication, such as sharing data with a data warehouse. For domain events, a different mechanism, such as sending events explicitly, is a better fit.

Event first

Let's turn the single source of truth upside down. What if, instead of writing to the database first, we emit an event and share it with ourselves and with other services? In this case, the event becomes the single source of truth. This is a form of event sourcing, where the state of our own service effectively becomes a read model and each event is a write model.

[Figure: Event-first approach]

On the one hand, this is the command query responsibility segregation (CQRS) pattern, in which we separate the read and write models. But CQRS by itself does not focus on the most important part of the solution: consuming the events with multiple services.

In contrast, event-driven architecture focuses on events being consumed by multiple systems, but does not emphasize that events are the only atomic piece of a data update. So I'd like to introduce "event first" as a name for this approach: updating the internal state of microservices by emitting a single event, which is consumed both by our own service and by any other interested microservices.

The challenges of the event-first approach are also the challenges of CQRS itself. Imagine that we want to check item availability before placing an order. What if two instances receive an order for the same item at the same time? Both check the inventory in the read model and emit an order event concurrently. Without some kind of safeguard, we may run into trouble.

A common way to handle these cases is optimistic concurrency: put the read model version into the event, and ignore the event on the consumer side if the read model has already been updated there. Another solution is pessimistic concurrency control, such as creating a lock for an item while checking its availability.
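A minimal sketch of the optimistic variant, assuming a single consumer applies events to the read model. `StockReadModel`, `OrderPlaced`, and `apply_event` are hypothetical names; `expected_version` is the read model version the producer saw when it checked availability.

```python
from dataclasses import dataclass


@dataclass
class StockReadModel:
    quantity: int
    version: int = 0


@dataclass(frozen=True)
class OrderPlaced:
    item: str
    quantity: int
    expected_version: int  # read-model version the producer saw


def apply_event(stock: StockReadModel, event: OrderPlaced) -> bool:
    """Apply the event only if the read model has not moved on since it was emitted."""
    if event.expected_version != stock.version:
        return False  # stale event: ignore it on the consumer side
    if event.quantity > stock.quantity:
        return False  # not enough stock
    stock.quantity -= event.quantity
    stock.version += 1
    return True


stock = StockReadModel(quantity=1)

# Two service instances read the same version and both emit an order event.
first = OrderPlaced("book", 1, expected_version=stock.version)
second = OrderPlaced("book", 1, expected_version=stock.version)

results = [apply_event(stock, first), apply_event(stock, second)]
```

The second event is rejected because it was based on a version that no longer exists. How the rejected order is reported back to the customer is outside the scope of this sketch.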

Another challenge of the event-first approach is a challenge of any event-driven architecture: the order of events. Multiple concurrent consumers processing events in the wrong order can cause another kind of consistency problem, for example processing an order for a customer who has not been created yet.

Data streaming solutions such as Kafka or AWS Kinesis can guarantee that events related to a single entity are processed sequentially (for example, an order is created for a customer only after the user has been created). In Kafka, for instance, you can partition topics by user ID so that all events related to a single user are handled by the single consumer assigned to that partition, which allows them to be processed in order. In contrast, in message brokers, a message queue has an order, but with multiple concurrent consumers, processing messages in a given order is hard, if not impossible. In this case, you may run into concurrency issues.
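The partitioning idea can be sketched without a broker. The hash below is a stand-in chosen for illustration, not the hash that Kafka's own partitioner uses, and `partition_for`, the event names, and the partition count of 3 are hypothetical. What matters is that the same key always maps to the same partition.

```python
import hashlib
from collections import defaultdict
from typing import DefaultDict, List, Tuple


def partition_for(key: str, num_partitions: int) -> int:
    """Deterministically map an entity key to a partition (illustrative hash only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


NUM_PARTITIONS = 3  # hypothetical topic configuration

events = [
    ("user-1", "UserCreated"),
    ("user-2", "UserCreated"),
    ("user-1", "OrderCreated"),
    ("user-2", "OrderCreated"),
]

# Each partition is an ordered log that is read by exactly one consumer.
partitions: DefaultDict[int, List[Tuple[str, str]]] = defaultdict(list)
for user_id, event_type in events:
    partitions[partition_for(user_id, NUM_PARTITIONS)].append((user_id, event_type))


def events_for(user_id: str) -> List[str]:
    """Events of one user, in the order their partition's consumer would see them."""
    log = partitions[partition_for(user_id, NUM_PARTITIONS)]
    return [event_type for key, event_type in log if key == user_id]
```

Because all events for one user land in one ordered partition, `events_for("user-1")` yields the user creation before the order creation, regardless of how events for other users are interleaved. Ordering across different users is not guaranteed by this scheme.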

In practice, the event-first approach is hard to implement in scenarios that require linearizability or that have many data constraints, such as uniqueness checks. But it really shines in other scenarios. Because of its asynchronous nature, however, the challenges of concurrency and race conditions still need to be addressed.

Consistency by design

There are many ways to split a system into multiple services. We strive to match separate microservices with separate domains. But how granular should the domains be? Sometimes it is hard to tell a domain from a subdomain or an aggregate root. There is no simple rule for defining your microservice split.

I suggest being pragmatic and considering all the implications of the design options, rather than focusing only on domain-driven design. One of those implications is how well microservice isolation aligns with transaction boundaries. A system in which every transaction resides within a single microservice does not need any of the solutions above. We should definitely consider transaction boundaries when designing the system. In practice, it may be hard to design the whole system this way, but I think we should aim to minimize data consistency challenges.

Accept inconsistencies

While matching account balances is critical, there are many use cases in which consistency is less important. Imagine collecting data for analytical or statistical purposes. Even if we randomly lose 10% of the data from the system, it is likely that it will not affect the business value of the analysis.

[Figure: Sharing data with events]

Which solution do you choose?

An atomic update of data requires consensus between two different systems: an agreement on whether a single value is 0 or 1. When it comes to microservices, this boils down to consistency between two participants, and all practical solutions follow a single rule of thumb:

At any given moment, for each data record, you need to identify which data source your system trusts.

The source of truth can be an event, the database, or one of the services. Achieving consistency in a microservice system is the developers' responsibility. My approach is as follows:

Try to design a system that does not require distributed consistency. Unfortunately, for complex systems, this is almost impossible.

Try to reduce the number of inconsistencies by modifying one data source at a time.

Consider an event-driven architecture. In addition to loose coupling, a big strength of event-driven architecture is that it offers a natural way to achieve data consistency, either by using events as the single source of truth or by producing events as a result of change data capture.

More complex scenarios may still require synchronous calls between services, fault handling, and compensation. Know that sometimes you may need to reconcile later.

Design your service capabilities to be reversible, decide how you will handle failure scenarios, and address consistency early in the design phase.
