Many newcomers are not clear about how to handle service invocation in microservices. To help with that, this article explains the problem in detail; readers who need it are welcome to follow along, and hopefully you will get something out of it.
Microservices have become a research focus for more and more enterprise IT departments and are steadily becoming a hot topic. Earlier posts, such as "Record | What changes will enterprise-level adoption of microservices bring?", have elaborated on the benefits microservices can bring. As an emerging approach, however, they inevitably run into difficulties of one kind or another. Today, let's look at one of those difficulties: how to handle service invocation.
Microservices bring many difficulties, most of them rooted in distributed systems:
As we continue to explore these service-architecture concepts, some hard problems inevitably come up. The author hopes readers have already seen and digested the fallacies of distributed computing before exploring and implementing microservices, and also recommends Jeff Hodges' "Notes on Distributed Systems for Young Bloods".
For a long time the industry has attacked these problems in outdated ways, looking for an answer to "how can developers write business logic that delivers value while the distributed system is abstracted away?" The approach was to make service invocations look like local invocations by hiding the network behind native interfaces (CORBA, DCOM, EJB, etc.). That turned out not to be a good idea, so the industry switched to WSDL/SOAP and code generation to escape the brittleness of those protocols, while keeping the same practice (SOAP client code generation). Some people do make these methods work, but there are many shortcomings. Here are some of the problems encountered when simply invoking services:
Errors or delays
What happens when we send a message to a service? For the purposes of this discussion, the request is broken into small chunks and routed over the network.
Because of this "network," we have to deal with the fallacies of distributed computing. Applications communicate over asynchronous networks, which means there is no single, uniform notion of time: each service works with its own understanding of the "meaning of time," and that understanding may differ from other services'. More importantly, these asynchronous networks route packets based on path availability, congestion, hardware failures, and so on, with no guarantee that a message will reach its recipient within a bounded time. (Note that the same phenomenon occurs in "synchronous" networks that lack a uniform understanding of time.)
This is bad: we cannot tell whether something has failed or is merely slow. If a customer searches for a concert on a ticketing site, they do not want to wait forever hoping a response will eventually arrive; at some point the request simply has to be treated as failed. So we add timeouts to our service interactions, but it is not just a matter of bolting on a single timeout.
When handling requests that involve downstream services, we do not want to be dragged down by slow downstream network interactions, so there are a few timeouts worth thinking about (a minimal sketch follows the list):
How long it takes to establish a connection to a downstream service so that a request can be sent (connection timeout)
How long to wait for a response once the request has been sent (request timeout)
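As a rough illustration, here is a minimal Java sketch of setting those two timeouts separately with the standard java.net.http client. The downstream URL and the specific values (500 ms to connect, 3 s for a response) are made-up examples, not recommendations.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class TimeoutExample {
    public static void main(String[] args) throws Exception {
        // Connection timeout: how long we wait to establish the connection.
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofMillis(500))
                .build();

        // Request timeout: how long we wait for a response after sending.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://recommendations.internal/api/concerts")) // hypothetical downstream URL
                .timeout(Duration.ofSeconds(3))
                .GET()
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}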
Additional note: a great advantage of building systems as a service architecture is speed, including the speed of making changes to the system, with an emphasis on autonomy and frequent deployment. But when you actually do this, you quickly find that in all kinds of odd cases timeouts alone do not behave well.
Consider a client application that sets a 3s timeout when fetching results from the recommendation engine. The recommendation engine in turn consults the correlation engine and issues that call with a 2s timeout, which should be fine because the outer call waits for at most 3s. But what if the correlation engine also has to talk to a promotion service, and that timeout is set to 5s? Our tests of the correlation engine (unit, local, integration) all appear to pass, even under load, because with the timeout set to 5s the promotion service never takes that long, or the correlation engine ends the call appropriately once the timeout expires.
What you end up with is a messy, hard-to-debug state once many calls pile up (which can happen at any time, because the network is "asynchronous"), and getting timeouts right becomes an important concern.
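One hedged way to avoid a chain of conflicting hard-coded timeouts is to carry a single deadline for the whole request and derive each hop's timeout from whatever budget remains. The DeadlineBudget class below is a hypothetical illustration of that idea, not an existing library:

import java.time.Duration;
import java.time.Instant;

public class DeadlineBudget {
    private final Instant deadline;

    public DeadlineBudget(Duration totalBudget) {
        this.deadline = Instant.now().plus(totalBudget);
    }

    /** Remaining time before the overall deadline, never negative. */
    public Duration remaining() {
        Duration left = Duration.between(Instant.now(), deadline);
        return left.isNegative() ? Duration.ZERO : left;
    }

    public static void main(String[] args) {
        // The client gives the whole request chain a 3s budget.
        DeadlineBudget budget = new DeadlineBudget(Duration.ofSeconds(3));

        // Each downstream hop derives its timeout from what is left,
        // instead of hard-coding 2s here and 5s further down the chain.
        Duration correlationTimeout = budget.remaining();
        System.out.println("timeout for correlation-engine call: " + correlationTimeout.toMillis() + "ms");

        // After the correlation call returns, the next hop gets the remainder.
        Duration promotionTimeout = budget.remaining();
        System.out.println("timeout for promotion-service call: " + promotionTimeout.toMillis() + "ms");
    }
}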
retry
Since there really are no firm timing guarantees in distributed systems, we need to time out when a call takes too long. The question then becomes: what do we do after a timeout? Throw an ugly HTTP 5XX back at the caller? Embrace the advice about microservice resilience and promise theory and fall back to a default? Or should we retry?
If we retry, what happens when the call changes data in the downstream service? As the author has noted in other articles, one of the hardest parts of microservices is data.
More interesting still: what if the downstream service starts failing and every caller eventually retries all of its requests? What if a recommendation engine with 10 or 100 instances calls the correlation engine, and the correlation engine times out for all of them? We end up with a variant of the thundering herd problem, which effectively DDoSes the affected service at exactly the moment we are trying to remediate it and slowly bring it back.
Retry strategies need careful attention. Retrying with exponential backoff helps, but you may still run into the same problem.
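For illustration, here is a minimal Java sketch of a retry loop with exponential backoff and random jitter; the jitter spreads retries out so a fleet of callers does not hammer a recovering service in lockstep. The attempt count and delays are arbitrary example values.

import java.util.Random;
import java.util.concurrent.Callable;

public class RetryWithBackoff {
    private static final Random RANDOM = new Random();

    /** Retries the call with exponential backoff plus jitter; rethrows after maxAttempts. */
    public static <T> T call(Callable<T> action, int maxAttempts, long baseDelayMillis) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and surface the failure to the caller
                }
                // Exponential backoff: base * 2^(attempt-1), plus random jitter so
                // many callers do not retry in lockstep (the thundering herd problem).
                long backoff = baseDelayMillis * (1L << (attempt - 1));
                long jitter = RANDOM.nextInt((int) baseDelayMillis);
                Thread.sleep(backoff + jitter);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String result = call(() -> "response from correlation engine", 3, 100);
        System.out.println(result);
    }
}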
routing
Because services are deployed with resiliency in mind, ideally multiple instances run in different fault-tolerance zones, so that some zones can fail without reducing the availability of the service. When things begin to fail, we need a way to route around those failures, but there may be other reasons to route between fault-tolerance zones as well. Perhaps certain "zones" are implemented as geographically remote backup deployments; perhaps, from a latency perspective, it is too expensive for our traffic to hit those backup instances during normal operation.
Maybe you can route client traffic this way, but what about inter-service communication? Satisfying a single client request may require a whole conversation between services, so how are those routing decisions made?
Beyond routing across fault-tolerance zones, we may also want routing and load-balancing decisions that react to periodic service anomalies: it makes little sense to keep sending traffic to instances that cannot keep up with the rate of incoming service invocations.
Things get trickier when we consider how to deploy a new version of a service. As mentioned earlier, we want to preserve some degree of autonomy between services so that changes can be iterated on and pushed out quickly without breaking dependent services. If we can route a fraction of traffic to the new version and tie that to our build and release strategy (i.e. blue/green deployments, A/B testing, canary releases), it quickly gets complicated: how do we determine the routing? Maybe we test in a special "staging environment," maybe we do an unannounced dark launch, maybe we run A/B tests. If there is state involved, we also need to think about data-schema evolution, running multiple database versions, and so on.
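As a toy illustration of weighted (canary) routing, the sketch below sends a configurable fraction of requests to a new version. The instance addresses and the 5% weight are hypothetical, and a real deployment would normally push this decision into infrastructure rather than application code.

import java.util.Random;

public class CanaryRouter {
    private static final Random RANDOM = new Random();

    // Hypothetical addresses for the stable and canary versions of a service.
    private static final String STABLE_VERSION = "http://recommendations-v1.internal";
    private static final String CANARY_VERSION = "http://recommendations-v2.internal";

    /** Routes a configurable fraction of traffic to the new version. */
    public static String pickBackend(double canaryWeight) {
        return RANDOM.nextDouble() < canaryWeight ? CANARY_VERSION : STABLE_VERSION;
    }

    public static void main(String[] args) {
        // Send roughly 5% of requests to the canary release.
        for (int i = 0; i < 10; i++) {
            System.out.println("request " + i + " -> " + pickBackend(0.05));
        }
    }
}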
service discovery
Beyond the resiliency considerations discussed so far, how do we discover collaborating services in an environment where failure is expected? How do we know where they are and how to communicate with them? In a painfully static topology, applications are configured with the URLs/IPs of the services they depend on, and those dependencies are built to "never fail"; inevitably, though, they do fail (or partially fail) in unacceptable ways. In an elastic environment, downstream services can autoscale, move between fault-tolerance zones, or simply be removed and restarted by some automated system. Clients of these services need to be able to discover them at runtime and consume them regardless of the topology.
This is not a new problem to solve, but it gets harder in elastic environments, where something like plain DNS does not work well (unless you are running in Kubernetes).
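For example, in Kubernetes a headless Service exposes each ready pod as a DNS A record, so a client can re-resolve the name at call time instead of pinning an address at startup. The sketch below assumes such a setup; the service name is hypothetical.

import java.net.InetAddress;
import java.net.UnknownHostException;

public class DnsDiscovery {
    public static void main(String[] args) throws UnknownHostException {
        // Hypothetical Kubernetes headless-service name; each ready pod appears as an A record.
        String serviceName = "correlation-engine.default.svc.cluster.local";

        // Re-resolving at call time (rather than caching one IP at startup)
        // lets clients see instances appear and disappear as the service scales.
        InetAddress[] instances = InetAddress.getAllByName(serviceName);
        for (InetAddress instance : instances) {
            System.out.println("discovered instance: " + instance.getHostAddress());
        }
    }
}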
Trust your own service architecture
This article saves the most important consideration for last. Microservices let you change the system architecture quickly, but if you cannot trust your system you will hesitate to change it, and that slows down releases. Slower deployments mean longer cycle times, which tempts you to batch up larger changes, which in turn requires more coordination between teams, and before long the "this is how we have always done things" attitude is strangling the enterprise's IT capability. Does that sound familiar?
There may be a complex mix of infrastructure (physical, virtual, private cloud, public cloud, hybrid cloud, containers), plenty of middleware, and many services, frameworks, languages, and deployment models for each service. If a request is slow, where do you even start?
Such a system needs strong "observability": useful and effective logging, metrics, and tracing. Every part of the system must provide reliable observability if we want to iterate on and release services quickly, stay data-driven, understand the impact of changes, know when to roll back, or quickly ship a new version to counter a negative impact (logging/metrics/monitoring). The more you trust this data, the more you trust your service architecture, and the more confident you are about making improvements based on what you see.
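As a minimal sketch of the metrics side of observability, the helper below times each service call and records its outcome. In practice this data would be fed to a metrics or tracing system rather than printed, and the operation name is just an example.

import java.util.function.Supplier;

public class TimedCall {
    /** Wraps a call, recording latency and outcome so dashboards and alerts have data to work with. */
    public static <T> T timed(String operation, Supplier<T> call) {
        long start = System.nanoTime();
        boolean success = false;
        try {
            T result = call.get();
            success = true;
            return result;
        } finally {
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            // A real system would report this to a metrics backend; here we just log it.
            System.out.printf("op=%s success=%s latency_ms=%d%n", operation, success, elapsedMillis);
        }
    }

    public static void main(String[] args) {
        String answer = timed("recommendation-engine.call", () -> "top 10 concerts");
        System.out.println(answer);
    }
}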
recap
Recall the problems that need to be solved when services invoke other services:
Service discovery
Adaptive routing / client-side load balancing
Automatic retries
Timeout control
Rate limiting
Metrics/statistics collection
Monitoring
A/B testing
Service refactoring / request tracing
Deadline/timeout enforcement across service invocations
Security between services
Edge gateway/router
Forced service isolation / outlier detection
Internal builds / dark launches
solutions
If you look at how Google, Twitter, Amazon, and Netflix solved this problem, you see that they essentially solved it by brute force: "we are going to standardize on Java/C++/Python, and we are going to put a lot of engineering effort into building libraries that help developers deal with this." So Google created Stubby, Twitter created Finagle, Netflix created and open-sourced Netflix OSS, and so on.
But for everyone else, this approach has major problems:
You're not one of these giant companies.
It is unrealistic to invest that much human capital in solving these problems. Companies interested in microservices are really interested in value and speed of innovation; they do not have this kind of specialized domain expertise in-house.
Maybe they could reuse those companies' solutions? But that raises another problem: in the end, you get a very complicated, ad hoc, only partially implemented solution.
For example, many of the customers the author works with are Java shops; they think about solving this problem for their Java services and are naturally drawn to Netflix OSS or Spring Cloud. But what about NodeJS services? What about Python services? What about legacy apps? What about Perl scripts?
Each language needs its own implementation of these capabilities, and each is implemented to a different level of quality. It is not as simple as grabbing open-source libraries and dropping them into an application: each implementation must be tested and verified, because it is you, not the libraries, who is responsible for the service architecture. The proliferation of implementations, languages, and versions quickly becomes an insurmountable level of complexity.
What we are really doing is implementing lower-level networking functions at the application layer. The author likes the way Oliver Gould talks about these concerns (routing, retries, rate limiting, circuit breaking, etc.) as belonging at layer 5:
So why complicate applications with all of this? We have been trying to solve these problems by creating libraries (for service discovery, tracing, collecting different kinds of data, doing more complex routing, etc.) and pulling them into the application space (as dependencies, transitive dependencies, library calls, and so on). What happens if a service developer forgets to wire in part of the implementation (such as tracing)? It becomes each developer's responsibility to implement these features, pull in the right libraries, write the code, and so on.
Not to mention that some frameworks in Java environments try to mitigate this with annotation-based configuration, while the goals of trust, observability, debuggability, and so on end up being neglected by such approaches.
The author prefers a more elegant way of doing this:
IMHO, this is where the service proxy/sidecar pattern can help, if it can satisfy the following constraints (a minimal illustration follows the list):
Reduce any application-level awareness, if needed at all, to a trivial library
Implement all of these features in one place, rather than scattering them across dependency dumping grounds
Make observability a first-class design goal
Make it transparent to services, including legacy services
Very low overhead/resource impact
Work for any or all languages/frameworks
Push these considerations down the stack
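To make the idea concrete, here is a hedged sketch of what the application side looks like with a sidecar in place: the service only talks to a proxy on localhost, and everything else on the list above is the proxy's job. The port and path are hypothetical.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class SidecarClient {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // The application only ever talks to its local sidecar proxy (the port is hypothetical).
        // The sidecar, not the application, handles discovery, routing, retries, timeouts,
        // rate limiting, and metrics, so every language and legacy service gets them for free.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:15001/recommendations/concerts"))
                .timeout(Duration.ofSeconds(3))
                .GET()
                .build();

        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}

Because the proxy is deployed next to every service instance, these capabilities are implemented once and applied uniformly, which is exactly the "push it down the stack" idea in the list above.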
Did reading the above help you? If you would like to learn more about related topics or read more articles like this one, please follow the industry information channel. Thank you for your support.