How to troubleshoot a Dubbo service that takes two hours to start

Updated 2025-01-15. From: SLTechnology News&Howtos

This article walks through how a Dubbo service that took two hours to start was diagnosed and fixed.

Symptoms

One day a tester redeployed a Dubbo application in the test environment and found that it "wouldn't start".

After a few hours, however, it slowly recovered on its own and began serving Dubbo requests.

My later investigation showed that the application had not failed to start at all. Startup was just extremely slow, so the service only became available long after the process had been launched.

How slow? About two hours.

As a result, the tester no longer dared to verify anything in the test environment: every feature check or bug-fix verification meant a two-hour wait. Who could put up with that?

Repeated observation confirmed that each startup really did take about two hours.

Trying to solve the problem

Eventually the tester couldn't take it any more, and I, the resident "incident report writing expert", was asked to take a look.

When I first heard the symptoms, I didn't take them seriously at all:

No need to think hard: the main thread must be blocked. First check whether the database, Zookeeper, or some other dependency is unreachable and blocking initialization. That is what handling many incidents had taught me.

So I bounced the issue back to the tester and asked him to check with the operations staff first, so that it would not cut into my slacking-off time unless absolutely necessary.

Early the next morning I saw the tester's WeChat avatar flashing and was ready to receive another "you're the best, boss" message. Instead, his reply read: "The network is fine and nobody has changed anything. If this isn't fixed soon, we're going on strike."

All right, there was no dodging it this time.

The overall direction for troubleshooting this kind of problem was surely still right: the main thread is blocked. But I could no longer just guess at what was blocking it.

I restarted the application, ran jstack pid to print a thread snapshot to the terminal, and scrolled straight to the end to see what the main thread was doing.
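The article relied on the jstack command-line tool. As a rough in-process counterpart, the sketch below prints the stack of a thread selected by name using only the standard Thread.getAllStackTraces() call. The class name MainThreadDump and the idea of calling it from a separate watchdog thread are assumptions made for illustration, not something the original investigation used.

```java
import java.util.Map;

// Hypothetical helper (not part of Dubbo or the original article).
final class MainThreadDump {

    /** Returns the current stack of the first live thread with the given name, or "" if none is found. */
    static String dump(String threadName) {
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            if (!thread.getName().equals(threadName)) {
                continue;
            }
            StringBuilder sb = new StringBuilder();
            sb.append('"').append(thread.getName()).append("\" state=")
              .append(thread.getState()).append('\n');
            // One line per frame, innermost frame first, similar to jstack output.
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            return sb.toString();
        }
        return "";
    }

    public static void main(String[] args) {
        // When run directly, the thread executing main() is named "main".
        System.out.print(dump("main"));
    }
}
```

If successive dumps taken minutes apart show the same top frames, the thread is most likely stuck in that call rather than making progress.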

The first few snapshots looked normal:

Loading Spring -> connecting to Zookeeper -> connecting to Redis ran one after another without blocking.

After a while the application still was not up. I ran jstack again and got the following:

Digging into the source code

I waited more than ten minutes, and every snapshot from jstack looked the same.

As shown in the figure, the main thread is stuck at line 303 of ServiceConfig.java in Dubbo.

So I looked up the source code at that location:

Put simply, the logic here obtains the local machine's IP address and registers it with Zookeeper so that other services can call this one.

Following the stack further down, the thread is stuck in Inet4AddressImpl.getLocalHostName.

But this is a native method that our application has no control over, and what we observe is simply that this native call takes a very long time.

So the investigation seemed to dead-end here, with little we could do about it.
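To confirm that the delay comes from the hostname lookup rather than from Dubbo, the call can be timed in isolation. The sketch below assumes the stack reached Inet4AddressImpl.getLocalHostName through InetAddress.getLocalHost(), which is the standard JDK entry point for that lookup. The class and method names are made up for this example.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical stand-alone probe (not part of Dubbo or the original article).
final class LocalHostTiming {

    /** Returns how many milliseconds InetAddress.getLocalHost() took, whether or not it succeeded. */
    static long timeLocalHostLookupMillis() {
        long start = System.nanoTime();
        try {
            InetAddress local = InetAddress.getLocalHost();
            System.out.println("Resolved: " + local.getHostName() + " -> " + local.getHostAddress());
        } catch (UnknownHostException e) {
            // A failed lookup is still worth timing: it can be slow for the same reason.
            System.out.println("Lookup failed: " + e.getMessage());
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("getLocalHost() took " + timeLocalHostLookupMillis() + " ms");
    }
}
```

On a healthy machine this usually returns almost immediately. A result measured in seconds or tens of seconds points at name resolution on the server, not at the application.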

Final solution

Since this is a native method, the problem has nothing to do with the application itself (and indeed, it appeared all of a sudden).

Could the server itself be the problem? The native method fetches the local hostname, so could the hostname have something to do with it?

This was tested on my own Alibaba Cloud server; the real test environment does not use that hostname.

I got the server's hostname and then tried to ping it. Something strange happened:

The command initially hung for a while (tens of seconds) before printing the IP address that the hostname resolved to and the round-trip latency.

When I pinged the IP address directly, the output appeared immediately.

Finally, I added an entry mapping the server's IP address to its hostname in the /etc/hosts configuration file:

xx.xx.xx.xx hostname

After that, ping hostname responded as quickly as pinging the IP address directly.
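A quick way to check whether a server already has such a mapping is to scan the hosts file for the hostname. The following is a minimal sketch under a few assumptions: the file uses the usual "address name [aliases...]" layout with # comments, the hostname is passed as the first command-line argument, and HostsFileCheck is a hypothetical name chosen for this example.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical checker (not part of Dubbo or the original article).
final class HostsFileCheck {

    /** Returns true if any non-comment line maps an address to the given hostname or alias. */
    static boolean hasEntry(List<String> hostsLines, String hostname) {
        for (String line : hostsLines) {
            int hash = line.indexOf('#');
            String content = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (content.isEmpty()) {
                continue;
            }
            String[] fields = content.split("\\s+");
            // fields[0] is the address; every following field is a hostname or alias.
            for (int i = 1; i < fields.length; i++) {
                if (fields[i].equalsIgnoreCase(hostname)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java HostsFileCheck <hostname>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get("/etc/hosts"));
        System.out.println(hasEntry(lines, args[0])
                ? "/etc/hosts has an entry for " + args[0]
                : "/etc/hosts has no entry for " + args[0]);
    }
}
```

If no entry is found, the lookup falls through to whatever other name resolution the server has configured, which is where a long wait like the one seen with ping can come from.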
