The method of dynamically allocating and releasing memory in PHP

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article explains how PHP dynamically allocates and releases memory. The material is straightforward; follow along step by step to see how memory management works inside PHP.

I. Memory

In PHP, populating a string variable is fairly simple, requiring only a single assignment statement, and the string can then be freely modified, copied, and moved. In C, although you can write a simple static string such as char *str = "hello world";, you cannot modify that string because it lives in program space. To create a string that can be manipulated, you must allocate a block of memory and copy the contents into it with a function such as strdup().

    {
        char *str;

        str = strdup("hello world");
        if (!str) {
            fprintf(stderr, "Unable to allocate memory!");
        }
    }

For various reasons that we will examine later, the traditional memory management functions (malloc(), free(), strdup(), realloc(), calloc(), and so on) are almost never used directly in the PHP source code.

II. Releasing memory

On almost all platforms, memory management follows a request-and-release model. First, an application asks the layer below it (usually the operating system): "I want to use some memory." If space is available, the operating system hands it to the program and marks it so that this memory will not be allocated to any other program.

When the application has finished with this memory, it should return it to the OS so that it can be allocated elsewhere. If the program does not return the memory, the OS has no way of knowing that it is no longer in use and cannot allocate it to another process. If a block of memory is not freed and the owning application loses track of it, we say that the application has "leaked" it, because the memory is simply no longer available to other programs.

In a typical client application, small and infrequent memory leaks can sometimes be tolerated because the leaked memory is implicitly returned to the OS when the process ends. This works because the OS knows which program it gave the memory to, and it can be certain that the memory is no longer needed once that program terminates.

Long-running server daemons, including web servers like Apache and the PHP modules loaded into them, are designed to run for a long time. Because the OS cannot reclaim memory until the process exits, any leak, no matter how small, adds up over repeated operations and eventually exhausts all system resources.

Now consider the userspace stristr() function. To find a string using a case-insensitive search, it actually creates a lowercase copy of each of the two strings and then performs a more traditional case-sensitive search to find the relative offset. Once the offset has been located, it has no further use for the lowercase copies. If it did not free them, every script that uses stristr() would leak some memory each time it was called. Eventually, the web server process would own all the system memory but be unable to use it.
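The pattern described above can be sketched in plain C. This is not PHP's actual implementation of stristr(): the helper names lower_copy() and stristr_sketch() are hypothetical, and the standard malloc()/free() stand in for the engine's allocators. The point is only the two temporary copies and the free() calls that have to follow them.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated lowercase copy of s. */
static char *lower_copy(const char *s)
{
    size_t len = strlen(s);
    char *copy = malloc(len + 1);
    size_t i;

    if (!copy) {
        return NULL;
    }
    for (i = 0; i < len; i++) {
        copy[i] = (char)tolower((unsigned char)s[i]);
    }
    copy[len] = '\0';
    return copy;
}

/* Case-insensitive search: returns the offset of needle in haystack,
 * or -1 if it is not found or memory could not be allocated. */
static long stristr_sketch(const char *haystack, const char *needle)
{
    char *lc_haystack = lower_copy(haystack);
    char *lc_needle = lower_copy(needle);
    long offset = -1;

    if (lc_haystack && lc_needle) {
        char *hit = strstr(lc_haystack, lc_needle);
        if (hit) {
            offset = (long)(hit - lc_haystack);
        }
    }

    /* Without these two calls, every invocation would leak both copies. */
    free(lc_haystack);
    free(lc_needle);
    return offset;
}
```

Calling stristr_sketch("Hello World", "WORLD") locates the match at offset 6, and both temporary buffers are released before the function returns.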

You might reasonably say that the ideal solution is to write good, clean, consistent code. That is certainly true; however, in an environment like the PHP interpreter, it is only half the solution.

III. Error handling

To let userspace scripts and the extension functions they depend on abandon an active request, there needs to be a way to jump out of ("bail out" of) that request entirely. The Zend engine handles this by setting a bailout address at the beginning of a request; on any die() or exit() call, or when any critical error (E_ERROR) is encountered, it performs a longjmp() to that bailout address.
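The bailout mechanism can be illustrated with the standard setjmp()/longjmp() pair. The sketch below is not engine code: handle_request(), run_script(), and fatal_error() are hypothetical names, and a simple counter stands in for real allocations so that the skipped cleanup is easy to observe.

```c
#include <setjmp.h>

static jmp_buf bailout;          /* bailout address for the current request */
static int blocks_outstanding;   /* counter standing in for allocated blocks */

/* Stand-in for a critical error: jump straight back to the bailout address. */
static void fatal_error(void)
{
    longjmp(bailout, 1);
}

static void run_script(int trigger_error)
{
    blocks_outstanding++;        /* pretend a buffer was allocated here */

    if (trigger_error) {
        fatal_error();           /* never returns */
    }

    blocks_outstanding--;        /* cleanup, skipped when fatal_error() runs */
}

/* Returns 0 if the request completed normally, 1 if it bailed out. */
static int handle_request(int trigger_error)
{
    blocks_outstanding = 0;
    if (setjmp(bailout) == 0) {
        run_script(trigger_error);
        return 0;
    }
    return 1;                    /* arrived here through longjmp() */
}
```

After handle_request(1), blocks_outstanding is still 1: the jump left run_script() before its cleanup line could execute.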

Although this bailout process simplifies program flow, it almost always means that resource cleanup code (such as free() calls) is skipped, which results in memory leaks. Now consider the following simplified version of the engine code that handles function calls:

    void call_function(const char *fname, int fname_len TSRMLS_DC)
    {
        zend_function *fe;
        char *lcase_fname;

        /* PHP function names are case-insensitive.
         * To simplify locating them in the function table,
         * all function names are implicitly translated to lowercase.
         */
        lcase_fname = estrndup(fname, fname_len);
        zend_str_tolower(lcase_fname, fname_len);

        if (zend_hash_find(EG(function_table), lcase_fname, fname_len + 1,
                           (void **)&fe) == SUCCESS) {
            zend_execute(fe->op_array TSRMLS_CC);
        } else {
            php_error_docref(NULL TSRMLS_CC, E_ERROR,
                             "Call to undefined function: %s()", fname);
        }
        efree(lcase_fname);
    }

When the php_error_docref() line is executed, the internal error handler sees that the error level is critical and calls longjmp() to interrupt the current program flow and leave call_function() without ever reaching the efree(lcase_fname) line. You might want to move the efree() line above the php_error_docref() line, but what about the code that called call_function() in the first place? fname itself is most likely an allocated string, and you cannot free it before it has been used by the error message handling.

Note that php_error_docref() is the internals equivalent of the userspace trigger_error() function. Its first parameter is an optional documentation reference that is appended to the docref root. The third parameter can be any of the familiar E_* family of constants and indicates the severity of the error. The fourth and remaining parameters follow printf()-style formatting and variable argument lists.

IV. Zend Memory Manager

One way to solve the memory leaks caused by request bailouts is the Zend memory management (ZendMM) layer. This part of the engine plays much the same role that the operating system normally plays: it allocates memory to calling code. The difference is that it sits low enough in the process space to be "request-aware", so that when a request ends, it can do what the OS does when a process terminates. That is, it implicitly frees all the memory owned by that request. Figure 1 shows the relationship between ZendMM, the OS, and the PHP process.

Figure 1. The Zend memory manager replaces system calls for per-request memory allocation.
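To make the "request-aware" idea concrete, here is a minimal sketch of an allocator that tracks every block it hands out and releases whatever is left when the request ends. It is only an illustration under simplifying assumptions, not ZendMM's real design: the names req_alloc(), req_free(), and req_shutdown() and the header layout are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every per-request block (hypothetical layout).
 * Sketch only: assumes the header size keeps the payload suitably aligned. */
typedef struct block_header {
    struct block_header *prev;
    struct block_header *next;
} block_header;

static block_header *request_blocks;   /* blocks owned by the current request */

static void *req_alloc(size_t size)
{
    block_header *h = malloc(sizeof(block_header) + size);

    if (!h) {
        return NULL;
    }
    h->prev = NULL;
    h->next = request_blocks;
    if (request_blocks) {
        request_blocks->prev = h;
    }
    request_blocks = h;
    return h + 1;                       /* caller sees the memory after the header */
}

static void req_free(void *ptr)
{
    block_header *h;

    if (!ptr) {
        return;
    }
    h = (block_header *)ptr - 1;
    if (h->prev) {
        h->prev->next = h->next;
    } else {
        request_blocks = h->next;
    }
    if (h->next) {
        h->next->prev = h->prev;
    }
    free(h);
}

/* End of request: release every block that was never freed explicitly.
 * Returns the number of blocks that had to be cleaned up. */
static size_t req_shutdown(void)
{
    size_t released = 0;

    while (request_blocks) {
        block_header *next = request_blocks->next;
        free(request_blocks);
        request_blocks = next;
        released++;
    }
    return released;
}
```

If a request allocates three blocks and frees only one before bailing out, req_shutdown() releases the remaining two, which is the behavior a bailout needs.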

In addition to providing implicit memory cleanup, ZendMM also controls each request's memory usage according to the memory_limit setting in php.ini. If a script attempts to request more memory than is available on the system, or more than a single process is allowed to request at one time, ZendMM automatically issues an E_ERROR message and begins the bailout process. An added benefit is that the return value of most memory allocation calls does not need to be checked, because failure results in an immediate jump to the shutdown part of the engine.
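The following sketch shows why callers of such an allocator can skip the usual NULL check: the allocator either returns usable memory or jumps to the bailout address. The names limited_alloc() and run_request() and the 1024-byte limit are assumptions made for illustration, and the accounting is deliberately simplified (usage is reset per request rather than reduced on each free).

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

static jmp_buf bailout;
static size_t memory_limit = 1024;   /* hypothetical per-request limit in bytes */
static size_t memory_used;           /* bytes handed out during this request */

/* Either returns usable memory or does not return at all. */
static void *limited_alloc(size_t size)
{
    void *ptr;

    if (size > memory_limit - memory_used) {
        longjmp(bailout, 1);          /* over the limit: bail out */
    }
    ptr = malloc(size ? size : 1);
    if (!ptr) {
        longjmp(bailout, 1);          /* the system itself is out of memory */
    }
    memory_used += size;
    return ptr;
}

/* Returns 0 on success, 1 if the request was aborted. */
static int run_request(size_t bytes_wanted)
{
    memory_used = 0;
    if (setjmp(bailout) != 0) {
        return 1;
    }
    {
        char *buf = limited_alloc(bytes_wanted);   /* no NULL check needed */
        buf[0] = 'x';
        free(buf);
    }
    return 0;
}
```

A request for 100 bytes succeeds, while a request for 4096 bytes exceeds the assumed limit and ends in the bailout branch instead of returning a NULL pointer to the caller.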

Hooking PHP's internal code up to the OS's actual memory management layer is not complicated: all memory allocated internally is requested through a specific set of alternative functions. For example, rather than allocating a 16-byte block with malloc(16), PHP code uses emalloc(16). In addition to performing the actual allocation, ZendMM flags the block with information about the request it is bound to, so that when a request bails out, ZendMM can implicitly free it.

Often, memory needs to be allocated for longer than the duration of a single request. This type of allocation, called a "persistent allocation" because it persists beyond the end of a request, can be made with the traditional memory allocators, because they do not add the per-request information that ZendMM uses. Sometimes, however, it is not known until runtime whether a particular allocation needs to be persistent, so ZendMM exports a set of helper macros that behave like the other allocator functions but take an extra final parameter indicating persistence.

If you really do want a persistent allocation, set this parameter to 1; the request is then passed through to the traditional malloc() family of allocators. If the runtime logic determines that the block does not need to be persistent, set this parameter to 0, and the call is routed to the per-request allocator functions.

For example, pemalloc(buffer_len, 1) maps to malloc(buffer_len), while pemalloc(buffer_len, 0) maps to emalloc(buffer_len), through the following #define in Zend/zend_alloc.h:

    #define pemalloc(size, persistent) ((persistent) ? malloc(size) : emalloc(size))

Each allocator function provided by ZendMM has a more traditional counterpart. Table 1 lists each traditional allocator alongside its e/pe counterparts.

Table 1. Traditional allocators and their PHP-specific counterparts.

    Traditional allocator                     e/pe counterparts
    ----------------------------------------  ----------------------------------------------------------
    void *malloc(size_t count);               void *emalloc(size_t count);
                                              void *pemalloc(size_t count, char persistent);
    void *calloc(size_t nmemb, size_t size);  void *ecalloc(size_t nmemb, size_t size);
                                              void *pecalloc(size_t nmemb, size_t size, char persistent);
    void *realloc(void *ptr, size_t count);   void *erealloc(void *ptr, size_t count);
                                              void *perealloc(void *ptr, size_t count, char persistent);
    char *strdup(const char *s);              char *estrdup(const char *s);
                                              char *pestrdup(const char *s, char persistent);
    void free(void *ptr);                     void efree(void *ptr);
                                              void pefree(void *ptr, char persistent);

You may notice that even pefree() requires the persistent flag. This is because at the time pefree() is called, it has no way of knowing whether ptr was a persistent allocation. Calling free() on a non-persistent allocation can lead to a messy double free, while calling efree() on a persistent allocation will most likely cause a segmentation fault as the memory manager looks for management information that does not exist. Your code is therefore expected to remember whether the data structure it allocated is persistent.
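One common way to "remember" persistence is to store the flag next to the pointer it describes. The sketch below assumes nothing about PHP's own data structures: the buffer type, the sketch_pemalloc()/sketch_pefree() macros, and the counting stand-ins for the two allocator families are all hypothetical and exist only to show which family handles each call.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins for the two allocator families. In PHP these would be
 * emalloc()/efree() and malloc()/free(); the counters only make the
 * dispatch visible. */
static int request_blocks;
static int persistent_blocks;

static void *request_alloc(size_t n)
{
    void *p = malloc(n);
    if (p) request_blocks++;
    return p;
}

static void request_free(void *p)
{
    if (p) request_blocks--;
    free(p);
}

static void *persistent_alloc(size_t n)
{
    void *p = malloc(n);
    if (p) persistent_blocks++;
    return p;
}

static void persistent_free(void *p)
{
    if (p) persistent_blocks--;
    free(p);
}

#define sketch_pemalloc(size, persistent) \
    ((persistent) ? persistent_alloc(size) : request_alloc(size))
#define sketch_pefree(ptr, persistent) \
    ((persistent) ? persistent_free(ptr) : request_free(ptr))

/* The structure carries its own persistence flag. */
typedef struct {
    char *data;
    size_t len;
    int persistent;
} buffer;

static int buffer_init(buffer *b, size_t len, int persistent)
{
    b->data = sketch_pemalloc(len, persistent);
    b->len = len;
    b->persistent = persistent;
    return b->data ? 0 : -1;
}

static void buffer_destroy(buffer *b)
{
    /* The stored flag guarantees the matching deallocator is used. */
    sketch_pefree(b->data, b->persistent);
    b->data = NULL;
}
```

Because buffer_destroy() reads the flag from the structure itself, callers cannot accidentally release a persistent block through the per-request family or the other way around.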

In addition to the core allocator functions, there are a few other very convenient ZendMM-specific functions, such as:

    void *estrndup(void *ptr, int len);

This function allocates len+1 bytes of memory and copies len bytes from ptr to the newly allocated block. The behavior of estrndup() can be roughly described as follows:

    void *estrndup(void *ptr, int len)
    {
        char *dst = emalloc(len + 1);

        memcpy(dst, ptr, len);
        dst[len] = 0;
        return dst;
    }

Here, the NUL byte implicitly placed at the end of the buffer ensures that any code using estrndup() to duplicate a string does not have to worry about passing the resulting buffer to a function such as printf() that expects a NUL terminator. When you use estrndup() to copy non-string data, that last byte is essentially wasted, but the convenience generally outweighs the cost.

    void *safe_emalloc(size_t size, size_t count, size_t addtl);
    void *safe_pemalloc(size_t size, size_t count, size_t addtl, char persistent);

The amount of memory allocated by these functions is ((size * count) + addtl). You might ask, "Why provide extra functions? Why not just use emalloc()/pemalloc() and do the math yourself?" The reason is in the name: safety. Although the circumstances are highly unlikely, the result of such a calculation could overflow the integer limits of the host platform. That could lead to an allocation of a negative number of bytes or, worse, a number of bytes much smaller than the caller believes it requested. safe_emalloc() avoids this trap by checking for integer overflow and explicitly failing when such an overflow occurs.
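The overflow check itself can be sketched in portable C. This is an illustration of the idea, not the engine's implementation: checked_total() and safe_alloc_sketch() are hypothetical names, and where the real safe_emalloc() fails explicitly, this stand-in simply returns NULL.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Computes size * count + addtl into *total.
 * Returns 1 on success, 0 if the result would not fit in a size_t. */
static int checked_total(size_t size, size_t count, size_t addtl, size_t *total)
{
    if (size != 0 && count > SIZE_MAX / size) {
        return 0;                       /* size * count would overflow */
    }
    if (addtl > SIZE_MAX - size * count) {
        return 0;                       /* adding addtl would overflow */
    }
    *total = size * count + addtl;
    return 1;
}

/* Hypothetical malloc()-based counterpart of safe_emalloc():
 * refuses to allocate rather than returning a block that is too small. */
static void *safe_alloc_sketch(size_t size, size_t count, size_t addtl)
{
    size_t total;

    if (!checked_total(size, count, addtl, &total)) {
        return NULL;
    }
    return malloc(total);
}
```

Without the check, a request such as size = SIZE_MAX / 2 + 1 and count = 2 would wrap around to 0 bytes, and the caller would then write far past the end of the block it received.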

Note that not every memory allocation routine has a p* counterpart. For example, pestrndup() does not exist, and safe_pemalloc() does not exist before PHP 5.1.

V. Reference counting

Careful memory allocation and release is vital to the long-term performance of a multi-request process like PHP, but it is only half the picture. For a server that handles thousands of hits per second to run efficiently, each request needs to use as little memory as possible and avoid unnecessary data copying. Consider the following PHP code snippet:

    <?php
    $a = 'Hello World';
    $b = $a;
    unset($a);
    ?>

After the first line, one variable has been created, and a 12-byte block of memory has been assigned to it to hold the string "Hello World" along with its trailing NUL byte. Now look at the next two lines: $b is set to the same value as $a, and then $a is unset (freed).

If PHP copied the variable's contents on every assignment, an additional 12 bytes would have to be allocated for the duplicate string in the example above, and extra processor time would be spent on the copy. This seems wasteful at first glance, since by the third line the original variable is unset, making the duplication completely unnecessary. Take it a step further and imagine what happens when the contents of a 10 MB file are loaded into two variables: that takes up 20 MB when 10 MB would have been enough. Would the engine really waste so much time and memory on such a pointless effort?

You can be sure that the designers of PHP were well aware of this.

Remember that in the engine, variable names and their values are two separate concepts. The value itself is a nameless zval* (in this case holding a string), which was assigned to the variable $a through zend_hash_add(). What happens if both variable names point to the same value?

    {
        zval *helloval;

        MAKE_STD_ZVAL(helloval);
        ZVAL_STRING(helloval, "Hello World", 1);
        zend_hash_add(EG(active_symbol_table), "a", sizeof("a"),
                      &helloval, sizeof(zval*), NULL);
        zend_hash_add(EG(active_symbol_table), "b", sizeof("b"),
                      &helloval, sizeof(zval*), NULL);
    }

At this point, you could inspect either $a or $b and see that they both contain the string "Hello World". Unfortunately, you then come to the third line: unset($a);. In this situation, unset() does not know that the data pointed to by $a is also in use by another variable, so it blindly frees the memory. Any subsequent access to $b would then read memory that has already been freed and cause the engine to crash.

This problem is solved by refcount, one of the four members of a zval. When a variable is first created and assigned, its refcount is initialized to 1 because it is assumed to be used only by the variable it was created for. When your code snippet assigns helloval to $b, it needs to increase refcount to 2, because the value is now referenced by two variables:

    {
        zval *helloval;

        MAKE_STD_ZVAL(helloval);
        ZVAL_STRING(helloval, "Hello World", 1);
        zend_hash_add(EG(active_symbol_table), "a", sizeof("a"),
                      &helloval, sizeof(zval*), NULL);
        ZVAL_ADDREF(helloval);
        zend_hash_add(EG(active_symbol_table), "b", sizeof("b"),
                      &helloval, sizeof(zval*), NULL);
    }

Now, when unset() deletes the $a copy of the variable, it can see from refcount that someone else is still interested in the data; it therefore just decrements refcount and otherwise leaves the value alone.
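The same bookkeeping can be shown outside the engine with a deliberately tiny structure. The value type and the value_new(), value_addref(), and value_delref() helpers below are hypothetical and far simpler than a real zval; the values_destroyed counter exists only so the effect can be observed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string zval. */
typedef struct {
    char *str;
    unsigned int refcount;
} value;

static int values_destroyed;     /* incremented each time a value is freed */

static value *value_new(const char *s)
{
    size_t len = strlen(s);
    value *v = malloc(sizeof(*v));

    if (!v) {
        return NULL;
    }
    v->str = malloc(len + 1);
    if (!v->str) {
        free(v);
        return NULL;
    }
    memcpy(v->str, s, len + 1);
    v->refcount = 1;             /* owned by the variable it was created for */
    return v;
}

/* A second variable now points at the same value. */
static void value_addref(value *v)
{
    v->refcount++;
}

/* A variable lets go of the value; free it only when nobody else uses it. */
static void value_delref(value *v)
{
    if (--v->refcount == 0) {
        free(v->str);
        free(v);
        values_destroyed++;
    }
}
```

With two owners, the first value_delref() only lowers the count to 1; the string stays readable through the other owner until the second value_delref() finally frees it.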

VI. Copy on write

It's a good idea to save memory through refcounting, but what happens when you want to change the value of only one of the variables? To do this, consider the following code snippet:

    <?php
    $a = 1;
    $b = $a;
    $b += 5;
    ?>

Following the logic so far, you would expect $a to still equal 1 and $b to end up as 6. You also know that Zend tries to save memory by making $a and $b refer to the same zval (see the second line). So what happens when execution reaches the third line and $b must be changed?

The answer is that Zend looks at refcount and separates the value if it is greater than 1. In the Zend engine, separation is the process of breaking up a reference pair, which is the opposite of the process you just saw:

    zval *get_var_and_separate(char *varname, int varname_len TSRMLS_DC)
    {
        zval **varval, *varcopy;

        if (zend_hash_find(EG(active_symbol_table), varname, varname_len + 1,
                           (void**)&varval) == FAILURE) {
            /* The variable does not exist at all: fail out */
            return NULL;
        }
        if ((*varval)->refcount < 2) {
            /* varname is the only actual reference,
             * so no separation is needed
             */
            return *varval;
        }
        /* Otherwise, make a copy of the zval* value */
        MAKE_STD_ZVAL(varcopy);
        *varcopy = **varval;
        /* Duplicate any allocated structures within the zval* */
        zval_copy_ctor(varcopy);
        /* Remove the old version of varname;
         * this decrements the refcount of varval in the process
         */
        zend_hash_del(EG(active_symbol_table), varname, varname_len + 1);
        /* Initialize the reference count of the newly created value
         * and attach it to the varname variable
         */
        varcopy->refcount = 1;
        varcopy->is_ref = 0;
        zend_hash_add(EG(active_symbol_table), varname, varname_len + 1,
                      &varcopy, sizeof(zval*), NULL);
        /* Return the new zval* */
        return varcopy;
    }

Now that the engine has a zval* that it knows is owned only by the variable $b, it can convert this value to a long and add 5 to it as the script requested.
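For readers without the engine headers at hand, the separation step can be reproduced with ordinary C. The sketch assumes a hypothetical value type holding only a long, and it models each variable as a pointer slot; separate() is an illustrative name, not an engine function.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an integer zval. */
typedef struct {
    long lval;
    unsigned int refcount;
    int is_ref;
} value;

/* Returns a value that the variable stored in *slot may safely modify.
 * If the current value is shared, the slot is switched to a private copy. */
static value *separate(value **slot)
{
    value *old = *slot;
    value *copy;

    if (old->refcount < 2) {
        return old;               /* sole owner: nothing to separate */
    }
    copy = malloc(sizeof(*copy));
    if (!copy) {
        return NULL;
    }
    *copy = *old;                 /* duplicate the contents */
    copy->refcount = 1;           /* the copy has exactly one owner */
    copy->is_ref = 0;
    old->refcount--;              /* this variable no longer uses the original */
    *slot = copy;
    return copy;
}
```

Applied to the script above, separating the slot for $b yields a private copy that can be incremented to 6 while the value seen through $a stays at 1.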

VII. Change on write

Reference counting also opens up a new possibility for data manipulation, one that userspace script writers know as "references". Consider the following userspace code snippet:

    <?php
    $a = 1;
    $b = &$a;
    $b += 5;
    ?>

In the PHP code above, you can see that the value of $a is now 6, even though it started out as 1 and was never (directly) changed. This happens because when the engine goes to increment $b by 5, it notices that $b is a reference to $a and says, "It's OK for me to change the value without separating it, because I want all reference variables to see the change."

But how does the engine know? Quite simply, it looks at the fourth and last element of the zval structure, is_ref. This is a simple on/off flag that defines whether the value is part of a userspace-style reference set. In the previous code snippet, when the first line is executed, the value created for $a gets a refcount of 1 and an is_ref value of 0, because it is owned by only one variable ($a) and no other variable has a change-on-write reference to it. In the second line, the refcount of this value is increased to 2 as before, except that this time is_ref is also set to 1 (because the script used an ampersand to request a full reference).

Finally, in the third line, the engine once again fetches the value associated with $b and checks whether separation is necessary. This time the value is not separated, because of a check that was not shown earlier. Here is the refcount check from get_var_and_separate() again, with that check included:

    if ((*varval)->is_ref || (*varval)->refcount < 2) {
        /* varname is the only actual reference,
         * or it is a full reference to other variables;
         * either way: no separation
         */
        return *varval;
    }

This time, even though the refcount is 2, no separation takes place because the value is a full reference. The engine is free to modify it without worrying that other variables will appear to change unexpectedly.
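The effect of the is_ref flag can be sketched with the same kind of simplified structure. Everything here is hypothetical scaffolding rather than engine code: bind_reference() models $b = &$a for a value that is not already shared copy-on-write, and needs_separation() mirrors the combined check shown above.

```c
#include <stddef.h>

/* Hypothetical stand-in for an integer zval. */
typedef struct {
    long lval;
    unsigned int refcount;
    int is_ref;
} value;

/* Models "$b = &$a": both slots share one value, flagged as a reference set.
 * Assumes the value in *a_slot is not currently shared copy-on-write. */
static void bind_reference(value **a_slot, value **b_slot)
{
    (*a_slot)->is_ref = 1;
    (*a_slot)->refcount++;
    *b_slot = *a_slot;
}

/* A write must separate only when the value is shared and NOT a full reference. */
static int needs_separation(const value *v)
{
    return !v->is_ref && v->refcount > 1;
}
```

After bind_reference(), needs_separation() reports 0 even though refcount is 2, so a write through either slot is visible through both.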

VIII. The problem of separation

Even with the copying and referencing techniques discussed above, some combinations cannot be handled by is_ref and refcount alone. Consider the following block of PHP code:

    <?php
    $a = 1;
    $b = $a;
    $c = &$a;
    ?>

Here, you have a single value that needs to be associated with three different variables. Two of them ($a and $c) form a change-on-write full reference pair, while the third ($b) is in a separable copy-on-write context. Using only is_ref and refcount, what values would describe this relationship?

The answer is: none. In this case, the value must be duplicated into two discrete zval*s, even though both contain exactly the same data (see Figure 2).

Figure 2. Forced separation on reference

Similarly, the following code block will cause the same conflict and force the value to separate out a copy (see figure 3).

Figure 3. Forced separation on copy

    <?php
    $a = 1;
    $b = &$a;
    $c = $a;
    ?>

Note that in both cases, $b stays associated with the original zval object, because at the time the separation occurs the engine does not know the name of the third variable involved in the operation.
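The two conflicts can be summarized as a small decision rule. The sketch below is a simplification under stated assumptions, not the engine's logic: the value type and can_share() are hypothetical, and the rule only covers the two assignment forms discussed in this section.

```c
/* Hypothetical stand-in for a zval's bookkeeping fields. */
typedef struct {
    unsigned int refcount;
    int is_ref;
} value;

/* Can a new variable simply share v, or must the value be duplicated first?
 * by_reference is 1 for "$c = &$a" and 0 for "$c = $a". */
static int can_share(const value *v, int by_reference)
{
    if (by_reference) {
        /* Joining a reference set is fine if v already is one, or if it has
         * a single owner; a copy-on-write set must be split first. */
        return v->is_ref || v->refcount < 2;
    }
    /* A plain copy may join a copy-on-write set, but never a reference set. */
    return !v->is_ref;
}
```

In the first snippet, the value shared by $a and $b has refcount 2 and is_ref 0, so the by-reference assignment to $c cannot share it. In the second, the value has is_ref 1, so the plain copy into $c cannot share it either.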
