How to solve the memory leak problem

2025-07-02 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article walks through how a real memory leak was tracked down and fixed. Many people run into this kind of problem in practice, so the steps below may help when you face a similar situation.

Problem investigation

First I determined when the memory leak started and found that two code changes had gone live at that point, one of which was mine. I checked both changes right away. My colleague's change could not have caused a memory problem because it only modified configuration, so the problem had to be in my own code.

Having identified the source, I quickly rolled back my change so that I could debug without pressure.

Debug

What is a memory leak?

Simply put, a memory leak is memory that the program requested but never released after use. Since I work in C++, a memory leak generally looks like this:

    obj* o = new obj();
    ...
    // o is never deleted after use

Somewhere, memory must have been requested with new without a matching call to delete.
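The pattern above can be made concrete with a small sketch. The names here (`Obj`, `live_objs`, `leaky`, `manual`, `scoped`) are hypothetical and not taken from the original code; the counter exists only to make the leak observable.

```cpp
#include <memory>

// Hypothetical counter of objects that were constructed but not yet destroyed.
int live_objs = 0;

// Hypothetical stand-in for the article's obj type.
struct Obj {
    Obj() { ++live_objs; }
    ~Obj() { --live_objs; }
};

// Leaks: the object is allocated with new but never deleted.
void leaky() {
    Obj* o = new Obj();
    (void)o;  // ... use o ...
}

// Correct: the new is paired with a delete.
void manual() {
    Obj* o = new Obj();
    delete o;
}

// Safer: std::unique_ptr deletes the object when it goes out of scope.
void scoped() {
    auto o = std::make_unique<Obj>();
}
```

After calling `leaky()`, `live_objs` stays one higher than before, while `manual()` and `scoped()` leave it unchanged. Smart pointers do not rule out every leak, but they remove the need to remember the matching delete.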

Here is some background on my change. The task was a refactoring: parallelizing a piece of code. The old logic ran serially in one thread, and I wanted to split it across two threads that run in parallel. This is one of the more troublesome kinds of work, because parallelizing existing code tends to introduce bugs.

Next, I went through every memory allocation and release in the code, including:

Memory allocated and freed with new/delete

Memory allocated and freed through memory pools

After combing through it carefully, I found nothing wrong: everything that should be released was released. At this point I was starting to doubt life :). Clearly there was still a problem I had not noticed. I knew it had to be in the changed code, but I could not tell where.

Having more or less given up on debugging by hand, I considered using a memory detection tool to help pin down the problem.

Common memory leak detection tools include valgrind and gperftools. The advantage of valgrind is that it can check memory without recompiling the code; the disadvantage is that it makes the program run very slowly, 20 to 30 times slower than normal according to the official documentation. gperftools requires rebuilding the executable. Both tools have to be downloaded and installed before testing, which also means applying for machine permissions, so I found them too much trouble. Besides, this was not a needle-in-a-haystack problem: it had to be in the parallelized code.

So I decided to approach the problem from another angle. Since the refactored code now ran in parallel, the leak was likely a multithreading problem, and the first thing to examine in a multithreading problem is the data shared between threads.

The Key to Multithreading Problems: Shared Data

We know that if threads share no data, there can be no thread safety problem. The locks, semaphores, and condition variables we use all exist to protect shared data. For example, a lock usually guards a critical section, and the code in the critical section operates on data shared between threads. A classic scenario for semaphores is the producer-consumer problem: producer threads and consumer threads operate on the same queue, and that queue is the shared data.
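As an illustration of protecting shared data, here is a minimal producer-consumer sketch. It is an assumption-laden example, not code from the project: `SharedQueue` and `produce_and_consume` are hypothetical names, and it uses a mutex with a condition variable rather than a semaphore so that it builds with C++11.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical shared queue: every access to items_ happens under mutex_.
class SharedQueue {
public:
    void push(int value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(value);
        }
        ready_.notify_one();  // wake a waiting consumer
    }

    // Blocks until an item is available, then removes and returns it.
    int pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty(); });
        int value = items_.front();
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<int> items_;
};

// One producer pushes 1..count; one consumer sums everything it pops.
long produce_and_consume(int count) {
    SharedQueue queue;
    long sum = 0;  // written only by the consumer, read after join
    std::thread producer([&] {
        for (int i = 1; i <= count; ++i) queue.push(i);
    });
    std::thread consumer([&] {
        for (int i = 0; i < count; ++i) sum += queue.pop();
    });
    producer.join();
    consumer.join();
    return sum;
}
```

The queue is the only data the two threads share, so it is the only thing that needs the lock. On some toolchains the program must be linked with `-pthread`.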

Following this line of thought, I looked for data used by both threads. Sure enough, I found this code in a corner:

auto* pb = global->mutable_obj();

This line obtains a protobuf object, allocating it if necessary. Protobuf (Protocol Buffers) is a data serialization format developed by Google. It serves a purpose similar to JSON and XML, so it is often used in network communication and data exchange scenarios such as RPC.

If you are not familiar with protobuf, it does not matter. What the code above does is essentially this:

    if (global->obj == NULL) {
        global->obj = new obj();
    }
    return global->obj;

Note that this code now runs in two threads, and that is where the problem comes from.

So how did the problem arise?

Assume there are two threads, A and B. When both execute this code at the same time, the following sequence can occur:

Thread A reads global->obj and finds it empty, so it decides to allocate memory for it. Unfortunately, a thread switch occurs at exactly this moment, and thread A is suspended before it allocates memory for global->obj, as follows:

    if (global->obj == NULL) {
        // thread A is suspended here, before the allocation
        global->obj = new obj();
    }
    return global->obj;

Thread B starts running after thread A is suspended and executes the same code. Thread B checks global->obj, also finds it empty, and allocates memory for it. After the allocation, another thread switch occurs and thread B is suspended, as follows:

    if (global->obj == NULL) {
        global->obj = new obj();
        // thread B is suspended here, after the allocation
    }
    return global->obj;

With thread B suspended, the scheduler resumes thread A, which continues from where it was interrupted. Remember where that was? Right before allocating memory for global->obj. So thread A executes global->obj = new obj() again, even though thread B has already allocated memory for global->obj.

Oops, a typical memory leak: global->obj now points to thread A's object, so the memory allocated by thread B can no longer be freed.
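The interleaving described above depends on timing, so it is hard to trigger on demand with real threads. The sketch below replays the same steps in a fixed order on a single thread to show the effect. `Obj`, `Global`, `live_objs`, and `replay_race` are hypothetical names for illustration only.

```cpp
// Hypothetical counter of objects that were constructed but not yet destroyed.
int live_objs = 0;

struct Obj {
    Obj() { ++live_objs; }
    ~Obj() { --live_objs; }
};

// Hypothetical stand-in for the shared global object.
struct Global {
    Obj* obj = nullptr;
};

// Replays the interleaving step by step; returns the number of leaked objects.
int replay_race() {
    int before = live_objs;
    Global global;

    // Thread A: checks the pointer, sees null, then is suspended.
    bool a_saw_null = (global.obj == nullptr);

    // Thread B: checks, also sees null, allocates, then is suspended.
    bool b_saw_null = (global.obj == nullptr);
    if (b_saw_null) global.obj = new Obj();

    // Thread A resumes after its check and allocates as well,
    // overwriting the only pointer to thread B's object.
    if (a_saw_null) global.obj = new Obj();

    // Cleanup can only reach the object that global.obj points to now.
    delete global.obj;
    global.obj = nullptr;

    return live_objs - before;  // allocated objects that were never freed
}
```

Calling `replay_race()` returns 1: two objects were allocated but only one pointer survived, so one object can never be deleted.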

We have now found the cause. The culprit is shared data, and the key point is to realize that your thread can be interrupted at any moment, because the CPU can switch to another thread at any time.

The fix was simple: I added another variable so that the two threads no longer use shared data. That solved the problem. From discovering the problem to completing the fix took about 4 hours.
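The article does not show the actual fix, so the following is only a sketch of the idea of giving each thread its own variable. All names (`Obj`, `worker`, `run_two_workers`, `live_objs`) are hypothetical.

```cpp
#include <atomic>
#include <functional>
#include <memory>
#include <thread>

// Hypothetical counter of live objects; atomic because two threads update it.
std::atomic<int> live_objs{0};

struct Obj {
    Obj() { ++live_objs; }
    ~Obj() { --live_objs; }
    int value = 0;
};

// Each worker lazily creates and uses only the object it was given.
void worker(std::unique_ptr<Obj>& own, int value) {
    if (!own) own = std::make_unique<Obj>();
    own->value = value;
}

// Runs two workers in parallel; returns how many objects are still alive after.
int run_two_workers() {
    int before = live_objs;
    {
        std::unique_ptr<Obj> obj_a;  // touched only by thread A
        std::unique_ptr<Obj> obj_b;  // touched only by thread B
        std::thread a(worker, std::ref(obj_a), 1);
        std::thread b(worker, std::ref(obj_b), 2);
        a.join();
        b.join();
    }  // both objects are destroyed here
    return live_objs - before;
}
```

Because neither thread reads or writes the other's pointer, the check-then-allocate sequence can no longer interleave, and `run_two_workers()` returns 0. If the object truly had to be shared, the lazy initialization would instead need protection such as a mutex or `std::call_once`.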

Lessons

Parallelizing existing code is a difficult task that easily introduces thread safety problems. When solving a thread safety problem, the first question is not whether to add a lock but whether the threads really need to share data at all. Threads that operate only on their own private data have no thread safety problems.

When a thread safety problem occurs, start by investigating the data shared between threads.

Memory Leak Detection Tools

Although I did not use a detection tool this time and relied entirely on manual debugging, that was only because the scope of the problem was small. If you do not know which code change introduced the problem, detection tools become very important. Here is a brief introduction to using valgrind; please refer to the official documentation for details.

Suppose there is a problematic program like this:

    #include <stdlib.h>

    void f(void)
    {
        int* x = malloc(10 * sizeof(int));
        x[10] = 0;  // Problem 1: out-of-bounds write
    }               // Problem 2: memory leak, x not freed

    int main(void)
    {
        f();
        return 0;
    }

There are two problems with this code: an out-of-bounds write and a memory leak. Compile the program as myprog, preferably with the -g option so that valgrind can report line numbers.

Next, check the program with valgrind using the following command:

valgrind --leak-check=yes myprog

When the program finishes, valgrind prints a report. For the out-of-bounds write, the output looks like this:

    ==19182== Invalid write of size 4
    ==19182==    at 0x804838F: f (example.c:6)
    ==19182==    by 0x80483AB: main (example.c:11)
    ==19182==  Address 0x1BA45050 is 0 bytes after a block of size 40 alloc'd
    ==19182==    at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)
    ==19182==    by 0x8048385: f (example.c:5)
    ==19182==    by 0x80483AB: main (example.c:11)

The first line says that the code contains an invalid write, and the lines below it give the location of the problem.

For the memory leak, valgrind gives this output:

    ==19182== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
    ==19182==    at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)
    ==19182==    by 0x8048385: f (example.c:5)
    ==19182==    by 0x80483AB: main (example.c:11)

Here the first line reports that the memory is "definitely lost", meaning that there is certainly a memory leak, and the lines below it give the location of the problem.
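Both reports point at f(). One possible corrected version is sketched below, written in C++ to match the rest of the article. Returning the stored value is an assumption added here so that the result can be checked; the original f() returns nothing.

```cpp
#include <cstdlib>

// Corrected sketch of f(): stays inside the 10-element block and frees it.
int f() {
    int* x = static_cast<int*>(std::malloc(10 * sizeof(int)));
    if (x == nullptr) return -1;  // allocation failed
    x[9] = 0;                     // the last valid index is 9, not 10
    int last = x[9];
    std::free(x);                 // release the block before returning
    return last;
}
```

With these two changes, the block is never written past its end and is released before the function returns, which addresses both problems that valgrind reported.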

In addition to "definitely lost", valgrind also reports "probably lost". The two mean:

"definitely lost": your program definitely has a memory leak. Fix it.

"probably lost": your program appears to have a memory leak, but you may be doing something unusual with pointers, such as making them point into the middle of a heap block, so it is not certain that there is a problem. Memcheck's own output labels this category "possibly lost".
