How to troubleshoot problems caused by zuul version upgrade

2025-03-29 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article walks through how to troubleshoot a problem caused by a zuul version upgrade, using a real case: large file uploads that started failing at the gateway after the upgrade.

Cause

Some of our early services were running on very old versions, mostly Spring Boot 1.5.x, so we decided to upgrade them to 2.1.x and to move Spring Cloud to Greenwich. Naturally, the old version of zuul we were using had to be upgraded as well.

Unexpected Bug

Our gateway uses zuul as packaged by spring-cloud-netflix, and this upgrade bumped the related packages at the same time. But something unexpected happened: in the test environment, file uploads started failing. Specifically, when an uploaded file exceeded a certain size, the upload request disappeared as it passed through the zuul gateway and never reached the downstream services. This was strange enough that we started investigating immediately.

Bug troubleshooting

The first reaction to a problem like this is to check whether the file was never uploaded at all, in which case there would be nothing to forward to the next layer. That idea was quickly ruled out, so a serious investigation was needed.

First, we traced the route and the logs along it, narrowed the problem down to the zuul service, and ruled out problems in the upstream nginx and the downstream business services. However, the zuul service did not log any exception, which was unsettling. On inspection, the file did reach zuul, but then vanished without a trace.

We had anticipated large uploads and allocated 2 GB of memory to zuul, so how could a 500 MB file be a problem? Then it occurred to me: could this have something to do with garbage collection? Our files are very large, and the large objects created for them live on the Java heap. Objects this large skip the young generation and are allocated directly in the old generation, so perhaps our memory parameters left the old generation too small to hold them.

To test this, we adjusted the JVM parameters so that the old generation had at least 1 GB of space, and monitored the state of the Java heap at the same time. Disappointingly, it did not help. But this time we had a clue: the heap monitoring showed some anomalies, and it was reasonable to suspect that the heap was running out of memory and causing an OOM.

We increased the memory and ran the upload again, and it succeeded. So it was indeed an OOM caused by insufficient old-generation memory. But although the upload succeeded, the old generation was actually holding about 1.6 GB. Why would a 500 MB file take up that much memory?
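The heap monitoring mentioned above can be done with external tools, but the same numbers are also available from inside the JVM. The following is a minimal sketch using the standard java.lang.management API; the class name HeapPoolReport is hypothetical and this is not the monitoring setup used in the original investigation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports usage of each heap memory pool
// (for example the young and old generation pools).
public class HeapPoolReport {

    /** Returns one line per heap memory pool: name, used and max in MiB. */
    public static List<String> describeHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as metaspace or code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long usedMiB = usage.getUsed() >> 20;
            // getMax() returns -1 when the maximum is undefined
            String max = usage.getMax() < 0
                    ? "undefined"
                    : (usage.getMax() >> 20) + " MiB";
            lines.add(pool.getName() + ": used=" + usedMiB + " MiB, max=" + max);
        }
        return lines;
    }

    public static void main(String[] args) {
        describeHeapPools().forEach(System.out::println);
    }
}
```

Pool names depend on the garbage collector in use, so the old generation may appear under different names on different JVM configurations.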

Although we had found the cause, simply adding memory is obviously not a real fix, so we added -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/data to the startup parameters in order to get a heap dump of the OOM and analyze it.

Looking at the stack information, we can see that the overflow occurs on a copy of the byte array. We quickly locate the code and find the following code:

    public InputStream getRequestEntity() {
        if (requestEntity == null) {
            return null;
        }
        if (!retryable) {
            return requestEntity;
        }
        try {
            if (!(requestEntity instanceof ResettableServletInputStreamWrapper)) {
                requestEntity = new ResettableServletInputStreamWrapper(
                        StreamUtils.copyToByteArray(requestEntity));
            }
            requestEntity.reset();
        } finally {
            return requestEntity;
        }
    }

This code is in RibbonCommandContext, which is called when zuul forwards a request, and the OOM occurs in the call to StreamUtils.copyToByteArray(requestEntity). Following that method down, we eventually found the source of the overflow: the copy goes through a ByteArrayOutputStream, whose write method is as follows:

    public synchronized void write(byte b[], int off, int len) {
        if ((off < 0) || (off > b.length) || (len < 0) ||
                ((off + len) - b.length > 0)) {
            throw new IndexOutOfBoundsException();
        }
        ensureCapacity(count + len);
        System.arraycopy(b, off, buf, count, len);
        count += len;
    }

You can see that there is an ensureCapacity here. Check the source code:

    private void ensureCapacity(int minCapacity) {
        // overflow-conscious code
        if (minCapacity - buf.length > 0)
            grow(minCapacity);
    }

    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = buf.length;
        int newCapacity = oldCapacity << 1;
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        buf = Arrays.copyOf(buf, newCapacity);
    }

ensureCapacity does one thing: when the byte array is too small to hold the data being copied from the stream, it calls grow to expand it. Unlike ArrayList, which grows by a factor of 1.5, grow here doubles the capacity of the array each time.

At this point the cause of the overflow is clear. The 500 MB file occupies about 1.6 GB because the copy had just triggered an expansion: the doubled buffer takes up roughly twice the space of the file being copied, and together with the source data that comes to about three times the size of the file.
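The doubling behaviour can be replayed as plain arithmetic, without allocating any large arrays. The sketch below is illustrative only: the class GrowthSimulation is hypothetical, the 4096-byte initial capacity and chunk size are assumptions rather than values taken from the article, and the MAX_ARRAY_SIZE cap is ignored.

```java
// Hypothetical simulation of the capacity growth rule in grow():
// double the capacity, or jump to minCapacity if doubling is not enough.
public class GrowthSimulation {

    /**
     * Replays chunked writes of totalBytes into a buffer that starts at
     * initialCapacity and returns the final buffer capacity in bytes.
     */
    public static long finalCapacity(long totalBytes, int initialCapacity, int chunkSize) {
        long capacity = initialCapacity;
        long count = 0;
        while (count < totalBytes) {
            long len = Math.min(chunkSize, totalBytes - count);
            long minCapacity = count + len;
            if (minCapacity > capacity) {
                long newCapacity = capacity << 1; // double, as in grow()
                if (newCapacity < minCapacity) {
                    newCapacity = minCapacity;
                }
                capacity = newCapacity;
            }
            count += len;
        }
        return capacity;
    }

    public static void main(String[] args) {
        long fileSize = 500L * 1024 * 1024; // a 500 MiB upload
        long capacity = finalCapacity(fileSize, 4096, 4096);
        System.out.println("final buffer capacity (MiB): " + (capacity >> 20));
        // Arrays.copyOf allocates the new array while the old one is still
        // referenced, so both exist at the moment of the last expansion.
        System.out.println("old + new buffer during last grow (MiB): "
                + ((capacity / 2 + capacity) >> 20));
    }
}
```

This only models the ByteArrayOutputStream buffer itself. Any further copies made afterwards, or buffers held elsewhere in the request path, come on top of these figures.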

Solution

As for the fix, tuning the memory size or the proportion of the old generation is obviously not a reasonable solution. Looking back at the source code, we can see this part:

    if (!retryable) {
        return requestEntity;
    }

If retry is not enabled, the body is returned as-is and is not cached. We therefore decided to temporarily disable retry for the services involved in file uploads, and later to change the upload mechanism so that file uploads bypass zuul altogether.
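To see why returning the original stream avoids the memory problem, compare the cached copy with forwarding the stream in fixed-size chunks, where memory use stays at the size of the buffer regardless of the file size. This is a minimal sketch of that idea; ChunkedForwarder is a hypothetical class and does not reproduce zuul's or ribbon's actual forwarding code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: forwards a request body without buffering all of it.
public class ChunkedForwarder {

    /** Copies in to out through a fixed 8 KiB buffer; returns bytes copied. */
    public static long forward(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream body = new ByteArrayInputStream(new byte[100_000]);
        // Stand-in for the connection to the downstream service.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println("forwarded bytes: " + forward(body, sink));
    }
}
```

The trade-off is that a stream forwarded this way is consumed once and cannot be replayed, which is exactly why retry has to be given up for these routes.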

Getting to the bottom of it

We had found the cause and had a workaround, but we still did not know why the old version worked. So, in the spirit of getting to the bottom of things, we went looking for the source code of the old version of zuul.

In the new version the ribbon code lives in spring-cloud-netflix-ribbon, while in the old version it is part of spring-cloud-netflix-core, so it took a little time to find the corresponding code. Comparing the two, we found that the old version of getRequestEntity does no processing at all and simply returns requestEntity:

    public InputStream getRequestEntity() {
        return requestEntity;
    }

The copy mechanism was added in a later version, so we went to GitHub to find the commit that introduced it.

And then we followed the information given in commit to find the original issue.

Reading the issue, we found that this was a bug in the old version: POST requests lost their body on retry. The new version fixes it by caching the body of POST requests so that it can be resent when a retry happens.
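The bug described in that issue is easy to reproduce in isolation: once a one-shot stream has been read, a second attempt finds nothing left, whereas a cached copy can be reset and read again. The following sketch illustrates this with plain java.io classes; RetryBodyDemo and its methods are hypothetical and are not taken from the zuul or spring-cloud-netflix code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo: why a retry needs a cached copy of the request body.
public class RetryBodyDemo {

    /** A body that can only be read once, like a raw request stream. */
    public static InputStream oneShotBody(int size) {
        final InputStream source = new ByteArrayInputStream(new byte[size]);
        return new InputStream() {
            @Override
            public int read() throws IOException {
                return source.read(); // no mark/reset support
            }
        };
    }

    /** Reads the stream to the end and returns the number of bytes seen. */
    public static int drain(InputStream in) throws IOException {
        byte[] buffer = new byte[1024];
        int total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            total += read;
        }
        return total;
    }

    /** Buffers the whole body in memory so it can be replayed via reset(). */
    public static ByteArrayInputStream cache(InputStream in) throws IOException {
        ByteArrayOutputStream copy = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int read;
        while ((read = in.read(buffer)) != -1) {
            copy.write(buffer, 0, read);
        }
        return new ByteArrayInputStream(copy.toByteArray());
    }

    public static void main(String[] args) throws IOException {
        InputStream raw = oneShotBody(5000);
        System.out.println("first attempt:  " + drain(raw));
        System.out.println("retry (raw):    " + drain(raw));

        ByteArrayInputStream cached = cache(oneShotBody(5000));
        System.out.println("first attempt:  " + drain(cached));
        cached.reset(); // rewind to the start of the cached body
        System.out.println("retry (cached): " + drain(cached));
    }
}
```

The cached variant is what makes retries safe for POST requests, and it is also what costs memory proportional to the body size for large uploads.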
