What are the performance optimizations of PHP7

Updated: 2025-01-19. Source: Shulou (Shulou.com)

This article introduces the performance optimizations in PHP7. It first covers the new language features and changes, then the engine-level work behind the performance gains.

I. New features and changes

1. Scalar type and return type declarations (Scalar Type Declarations & Return Type Declarations)

A defining feature of PHP is weak typing, which makes programs easy to write and lets beginners get started quickly, but it has also attracted some controversy. Support for type declarations is a significant change: PHP7 supports them in an optional way. In addition, it introduces a switch directive, declare(strict_types=1); once enabled, it forces the code in the current file to follow strict function argument types and return types.

For example, an add function with type declarations can be written as function add(int $a, int $b): int, with a body that returns $a + $b.

Combined with the strict typing directive, the same file starts with declare(strict_types=1); before the function definition.

If strict_types is not enabled, PHP tries to convert values to the required type. Once it is enabled, PHP no longer performs these conversions and throws an error when the types do not match. For developers who prefer strongly typed languages, this is a welcome change.

More detailed introduction: PHP7 scalar type declaration RFC [translation]

2. More Errors become catchable Exceptions

PHP7 introduces a global Throwable interface, implemented by both the original Exception and the new Error, so the exception hierarchy is defined through an interface. As a result, more errors in PHP7 are delivered to the developer as catchable exceptions: left uncaught they are still errors, but once caught they can be handled inside the program like any other exception. The errors that can be caught this way are usually ones that do no fatal harm to the program, for example calling a function that does not exist. By default such an error terminates the program. PHP7's ability to catch and handle it lets the program continue and gives developers more control.

For example, to call a function that may not exist, the PHP5-compatible approach is to check with function_exists before the call, while PHP7 also lets you catch the resulting Error.

The example in the following figure (screenshot is from within PPT):

3. AST (Abstract Syntax Tree)

The AST serves as an intermediate layer in PHP's compilation process. It replaces the original approach of emitting opcodes directly from the parser and decouples the parser from the compiler, which removes some hack code and makes the implementation easier to understand and maintain.

PHP5:

PHP7:

More AST information: https://wiki.php.net/rfc/abstract_syntax_tree

4. Native TLS (native thread-local storage)

In multithreaded mode (for example, the worker and event modes of the Apache web server), PHP has to solve the thread-safety problem (TS, Thread Safe). Because threads share the memory space of the process, each thread needs some private space for its own data so that it does not interfere with other threads. PHP5 maintains a large global array, allocates a separate storage area for each thread, and lets each thread reach its own global data through a key.

In PHP5 this unique key has to be passed to every function that uses global variables. The PHP7 developers considered this way of passing unfriendly and problematic, so PHP7 tries to hold the key in a global thread-specific variable instead.
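The difference between the two styles can be sketched in C. This is only an illustration under stated assumptions, not PHP's actual implementation: the thread_globals structure and both function names are hypothetical, and the C11 _Thread_local keyword stands in for whatever thread-specific storage the engine really uses.

```c
/* Hypothetical per-thread state; the single field is illustrative only. */
typedef struct {
    int request_count;
} thread_globals;

/* PHP5-like style: every function receives the per-thread context
 * explicitly, so the extra argument spreads through the whole call chain. */
static int bump_explicit(thread_globals *ctx) {
    return ++ctx->request_count;
}

/* Native-TLS-like style: each thread gets its own copy of this variable
 * (C11 _Thread_local), so no extra argument has to be passed around. */
static _Thread_local thread_globals tls_globals;

static int bump_tls(void) {
    return ++tls_globals.request_count;
}
```

Both functions do the same work. The second simply removes the context parameter from every signature.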

More on Native TLS: https://wiki.php.net/rfc/native-tls

5. Other new features

PHP7 has many other new features and changes, and we will not discuss all of them in detail here.

Int64 support: integer length is unified across platforms, and strings and file uploads can be larger than 2 GB.

Uniform variable syntax.

Consistent foreach behavior.

New operators: <=> and ??.

Unicode codepoint escape syntax (\u{xxxxx}).

Anonymous class support (Anonymous Class)

... ...

II. A leap in performance: full speed ahead

1. JIT and performance

JIT (just-in-time compilation) is an optimization technique that compiles bytecode into machine code at run time. Intuitively, machine code that the processor executes directly should be faster than having the Zend engine read opcodes one by one. HHVM (HipHop Virtual Machine, Facebook's open source PHP virtual machine) used JIT to improve its PHP benchmark results by an order of magnitude, and the striking numbers it published led many of us to see JIT as a technology that turns stone into gold.

In fact, in 2013 Bird (Laruence) and Dmitry (one of the PHP core developers) built an experimental JIT on PHP5.5 (it was never released). The original PHP5.5 execution flow compiles PHP code through lexical and syntactic analysis into opcode bytecode (the format looks a bit like assembly), and the Zend engine then reads these opcode instructions and interprets them one by one.
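The "read and interpret one by one" step can be pictured as a dispatch loop. The following is a toy stack machine written for illustration only; the opcode names, the instruction layout, and the run function are assumptions and do not correspond to the Zend engine's real opcodes or handlers.

```c
#include <stddef.h>

/* Toy instruction set, invented for this sketch. */
typedef enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT } opcode;

typedef struct {
    opcode op;
    long operand;   /* used by OP_PUSH only */
} instruction;

/* Read instructions one at a time and dispatch on the opcode.
 * No bounds checking: the sketch assumes a well-formed program
 * whose stack depth never exceeds 16. */
static long run(const instruction *code) {
    long stack[16];
    size_t sp = 0;
    for (size_t pc = 0; ; pc++) {
        switch (code[pc].op) {
        case OP_PUSH: stack[sp++] = code[pc].operand; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: return sp ? stack[sp - 1] : 0;
        }
    }
}
```

Every instruction pays for the switch dispatch in addition to the work it performs. That dispatch cost is the VM overhead a JIT tries to remove.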

They added type inference (TypeInf) after the opcode stage, generated ByteCodes through the JIT, and then executed them.

The benchmark results were exciting: performance with JIT was 8 times higher than that of PHP5.5. However, when they applied the optimization to a real project, WordPress (an open source blogging platform), they saw almost no improvement, which was a puzzling result.

So they used a profiling tool on Linux to analyze where CPU time went during execution.

Distribution of CPU consumption for performing WordPress 100 times (screenshot from PPT):

Notes:

21% of CPU time is spent on memory management.

12% of CPU time is spent on hash table operations, mainly adding, deleting, modifying and querying PHP arrays.

30% of CPU time is spent on built-in functions, such as strlen.

25% of CPU time is spent in the VM (Zend engine).

After analysis, two conclusions are drawn:

(1) If the ByteCodes generated by JIT are too large, the CPU cache hit rate drops (CPU cache misses).

PHP5.5 code carries no explicit type declarations, so types can only be obtained by inference. The idea is to determine as many variable types as possible through inference, remove the branch code for types that cannot occur, and generate directly executable machine code. However, type inference cannot infer every type. In WordPress, less than 30% of the type information could be inferred, so only a limited amount of branch code could be removed. As a result, the machine code generated by the JIT (the ByteCodes) was too large, which significantly lowered the CPU cache hit rate.
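To make "removing the branch code that is not of this type" concrete, here is a minimal sketch in C. The value type and both functions are hypothetical. They only contrast a generic addition that must check type tags with a specialized one that is valid when inference has proved both operands are integers.

```c
typedef enum { T_LONG, T_DOUBLE } value_type;

/* Hypothetical tagged value, much simpler than a real Zval. */
typedef struct {
    value_type type;
    union { long l; double d; } v;
} value;

/* Generic path: the types are unknown, so every call checks the tags
 * and carries code for each combination. */
static value add_generic(value a, value b) {
    value r;
    if (a.type == T_LONG && b.type == T_LONG) {
        r.type = T_LONG;
        r.v.l = a.v.l + b.v.l;
    } else {
        double x = a.type == T_LONG ? (double)a.v.l : a.v.d;
        double y = b.type == T_LONG ? (double)b.v.l : b.v.d;
        r.type = T_DOUBLE;
        r.v.d = x + y;
    }
    return r;
}

/* Specialized path: usable only where inference proved both operands
 * are integers, so all tag checks and the double branch disappear. */
static long add_long(long a, long b) {
    return a + b;
}
```

Where inference fails, the generated code has to keep the generic form with all of its branches, which is one way the output grows large.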

A CPU cache hit works as follows: when the CPU reads and executes instructions, if the data it needs is not in the level-1 cache (L1), it has to keep looking in the level-2 cache (L2) and the level-3 cache (L3), and finally in main memory. The gap in read time between main memory and the CPU cache can reach 100 times. So if the ByteCodes are too large and too many instructions are executed, the multi-level cache cannot hold all of them, and some instructions have to stay in main memory.
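A rough way to see why the hit rate matters is to compute the expected cost of one access. The function below is a simplified single-level model written for illustration. Its name and the numbers in the note are assumptions, apart from the 100-fold gap mentioned above.

```c
/* Expected cost of one memory access, in arbitrary time units:
 * hits pay hit_cost, misses pay miss_cost. */
static double expected_access_cost(double hit_rate,
                                   double hit_cost,
                                   double miss_cost) {
    return hit_rate * hit_cost + (1.0 - hit_rate) * miss_cost;
}
```

With a hit costing 1 unit and a miss costing 100, a hit rate of 99% gives about 1.99 units per access, while 90% gives about 10.9 units. In this model, a modest drop in hit rate makes the average access several times slower.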

The cache size at all levels of CPU is also limited. The following figure shows the configuration information of Intel i7 920:

As a result, a lower CPU cache hit rate can increase execution time significantly, which cancels out the performance gained from JIT.

JIT can reduce VM overhead, and instruction optimization can indirectly reduce memory management overhead because fewer memory allocations are needed. However, in a real WordPress project only 25% of CPU time is spent in the VM, so the VM is not the main bottleneck. In the end, the JIT optimization was not included in this version of PHP7. It is likely to be implemented in a later version, which is worth looking forward to.

(2) How much JIT improves performance depends on the project's actual bottleneck.

JIT improved the benchmark greatly because the benchmark's code is relatively small, the generated ByteCodes are also small, and the main overhead is in the VM. The real WordPress project saw no significant improvement because its code base is much larger than the benchmark's. Although JIT reduced VM overhead, the oversized ByteCodes lowered the CPU cache hit rate and added memory overhead, so the net result was no improvement.

Different kinds of projects spend their CPU time in different proportions and will get different results. Performance tests that are detached from real projects are not very representative.

2. Changes in Zval

The real storage carrier behind every PHP variable, whatever its type, is the Zval, whose defining trait is that it can hold anything. In essence it is a struct implemented in C. PHP developers can roughly think of it as something like an array.

PHP5's Zval, which occupies 24 bytes of memory (screenshot from PPT):

PHP7's Zval, which occupies 16 bytes of memory (screenshot from PPT):

Why did the Zval drop from 24 bytes to 16? A little C background helps readers who are not familiar with the language. A struct differs from a union: each member of a struct occupies its own piece of memory, while the members of a union share one piece of memory (so writing one member overwrites the shared space, and the values of the other members are lost). As a result, although the PHP7 Zval appears to have many more members, it actually occupies less memory.
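The size difference between a struct and a union is easy to check in C. The two types below are illustrative only and use the same three members. The sizes mentioned in the note assume a typical 64-bit platform.

```c
#include <stdint.h>

/* Each member has its own storage, so the sizes add up. */
struct as_struct {
    int64_t lval;
    double  dval;
    void   *ptr;
};

/* All members overlap, so the size is that of the largest member. */
union as_union {
    int64_t lval;
    double  dval;
    void   *ptr;
};
```

On a typical 64-bit platform sizeof(struct as_struct) is 24 and sizeof(union as_union) is 8. Only one member of the union holds a meaningful value at any time.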

In addition, there is another significant change: some simple types no longer use references.

Zval structure diagram (from PPT):

The Zval in the figure consists of two 64-bit words (1 byte = 8 bits). If the variable is a long or a boolean, whose length does not exceed 64 bits, it is stored directly in value and no reference to another structure is needed. If the variable is an array, object, string, or another type that exceeds 64 bits, value holds a pointer to the address of the real storage structure.

For simple variable types, Zval storage becomes very simple and efficient.

Types that do not require references: NULL, Boolean, Long, Double

Types to be referenced: String, Array, Object, Resource, Reference
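The layout described above can be sketched as a simplified 16-byte tagged value. This is not the real Zval definition: every name below (my_zval, my_type, make_long, make_ptr, is_inline) is hypothetical, and the second 64-bit word is simply split into a type tag and a spare field.

```c
#include <stdint.h>

/* Hypothetical type tags; the inline types come first. */
typedef enum { MY_NULL, MY_BOOL, MY_LONG, MY_DOUBLE, MY_STRING } my_type;

/* First 64-bit word: the value itself, or a pointer to the real storage. */
typedef union {
    int64_t lval;
    double  dval;
    void   *ptr;
} my_value;

/* Second 64-bit word: a type tag plus spare bits. */
typedef struct {
    my_value value;
    uint32_t type;
    uint32_t extra;
} my_zval;

/* Simple types are stored directly in value. */
static my_zval make_long(int64_t n) {
    my_zval z;
    z.value.lval = n;
    z.type = MY_LONG;
    z.extra = 0;
    return z;
}

/* Larger types store only a pointer to a separate structure. */
static my_zval make_ptr(my_type t, void *p) {
    my_zval z;
    z.value.ptr = p;
    z.type = (uint32_t)t;
    z.extra = 0;
    return z;
}

/* True when the value lives inside the 16 bytes themselves. */
static int is_inline(const my_zval *z) {
    return z->type <= MY_DOUBLE;
}
```

Reading a long from such a value touches only the 16 bytes of the structure, while reading a string follows one pointer to the separate storage.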

3. Internal type zend_string

zend_string is the structure that actually stores a string. The content is stored in val (of type char), and val is declared as a char array of length 1 (a placeholder that makes the member convenient to extend).

The last member of the structure is a char array rather than a char*. This is a small optimization that reduces CPU cache misses.

With a char array, when malloc allocates memory for the structure, the header and the characters are allocated in the same region, usually with a length of sizeof(_zend_string) plus the actual character storage. If a char* were used instead, that position would hold only a pointer, and the real characters would live in a separate region of memory.
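The two allocation strategies can be compared with a small C sketch. It is a simplified illustration, not the zend_string definition: the type and function names are hypothetical, and the inline variant uses a C99 flexible array member (char val[]) as a stand-in for the char array of length 1 described above, since both place the characters directly after the header.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Inline storage: header and characters live in one allocation. */
typedef struct {
    size_t len;
    char   val[];   /* characters follow the header in the same block */
} inline_str;

/* Pointer storage: the characters live in a second allocation. */
typedef struct {
    size_t len;
    char  *val;
} pointer_str;

/* One malloc covers the header, the characters and the terminator. */
static inline_str *inline_str_new(const char *src) {
    size_t len = strlen(src);
    inline_str *s = malloc(sizeof(inline_str) + len + 1);
    if (s == NULL) return NULL;
    s->len = len;
    memcpy(s->val, src, len + 1);
    return s;
}

/* Two mallocs: the blocks may end up far apart in memory. */
static pointer_str *pointer_str_new(const char *src) {
    size_t len = strlen(src);
    pointer_str *s = malloc(sizeof(pointer_str));
    if (s == NULL) return NULL;
    s->val = malloc(len + 1);
    if (s->val == NULL) { free(s); return NULL; }
    s->len = len;
    memcpy(s->val, src, len + 1);
    return s;
}
```

An inline_str is released with a single free. A pointer_str needs its val freed first and then the structure itself.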

Memory allocation comparison using char [1] and char*:

Logically there is little difference between the two, and they behave much the same. But they look very different once the memory is loaded by the CPU. The former is a single contiguously allocated block, so the CPU can usually fetch it together (it will be in the same cache level). The latter is two separate blocks: when the CPU reads the first block, the second block is quite likely not in the same cache level, so the CPU has to look in L2 (the second-level cache) or below, or even in main memory. This causes a CPU cache miss, and the time cost of the two can differ by a factor of 100.

In addition, when a string is copied, zend_string can use reference assignment and so avoid a memory copy.

4. Changes in the PHP array (HashTable and Zend Array)

When writing PHP programs, the most frequently used type is the array, and PHP5 implements arrays with HashTable. Roughly summarized, it is a HashTable backed by a doubly linked list: it supports looking up elements by key through hash mapping, and it supports foreach traversal by walking the doubly linked list.

HashTable of PHP5 (screenshot from PPT):

The diagram looks very complex, with pointers jumping in all directions. Accessing one element by key can take up to three pointer jumps to reach the content. Most importantly, the array elements are stored in different regions of memory. As before, when the CPU reads them they are probably not in the same cache level, so the CPU has to look in lower cache levels or even main memory, which lowers the CPU cache hit rate and increases execution time.

Zend Array of PHP7 (screenshot from PPT):

The new array structure is very simple and clear. Its biggest characteristic is that the block of array elements and the hash mapping table are joined together and allocated in one block of memory. Traversing an array of simple integers is very efficient, because the array elements (Buckets) are allocated contiguously in the same block of memory and the zval of each element stores the integer inline. There is no chain of pointers to follow, and all the data sits in the current memory region. Most importantly, this avoids CPU cache misses (a drop in the CPU cache hit rate).

Changes in Zend Array:

The value of the array defaults to zval.

The size of HashTable decreased from 72 to 56 bytes, a reduction of 22%.

The size of a Bucket decreased from 72 to 32 bytes, a reduction of about 56%.

The memory space of the Buckets of the array elements is allocated together.

The key (Bucket.key) of the array element points to zend_string.

The value of the array elements is embedded in the Bucket.

Fewer CPU cache misses.
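The effect of embedding values in contiguously allocated Buckets can be sketched as follows. The cell and bucket types and the sum_longs function are hypothetical simplifications rather than the real Zend Array definitions. The field sizes are chosen so that a bucket is 32 bytes on a typical 64-bit platform, matching the figure listed above.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified 16-byte value, standing in for a zval. */
typedef struct {
    union { int64_t lval; double dval; void *ptr; } value;
    uint32_t type;
    uint32_t extra;
} cell;

/* Simplified bucket with the value embedded rather than pointed to. */
typedef struct {
    cell     val;   /* value stored inside the bucket */
    uint64_t h;     /* hash, or the integer index */
    void    *key;   /* string key, NULL for integer keys */
} bucket;

/* Sum an array of integer elements. The buckets are contiguous and the
 * integers are stored inline, so the loop walks memory sequentially
 * without following any pointer. */
static int64_t sum_longs(const bucket *buckets, size_t count) {
    int64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += buckets[i].val.value.lval;
    }
    return total;
}
```

A sequential walk like this is friendly to the CPU cache, in contrast with a layout where each element is a separately allocated node reached through a pointer.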

5. Function call mechanism (Function Calling Convention)

PHP7 improves the function call mechanism: by optimizing how arguments are passed, it removes some instructions and improves execution efficiency.

PHP5's function call mechanism (screenshot from PPT):

In the figure, the send_val and recv instructions on the VM stack handle the same arguments. PHP7 optimizes the underlying function call mechanism by removing this duplication.

PHP7's function call mechanism (screenshot from PPT):

6. Using macros and inline functions to let the compiler do part of the work ahead of time

C macros are expanded in the preprocessing stage, before compilation proper, so part of the work is done ahead of time. A macro can provide function-like behavior without the stack overhead of a function call, which makes it more efficient. Inline functions are similar: at compile time, the call in the program is replaced with the function body, so the running program executes the body directly and pays no function call overhead.
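A minimal sketch of both techniques in C follows. The names SQUARE_MACRO and square_inline are made up for this example and have nothing to do with the macros PHP7 actually defines.

```c
/* Expanded textually by the preprocessor. The arguments are parenthesized
 * so that an expression such as 3 + 1 keeps its intended precedence. */
#define SQUARE_MACRO(x) ((x) * (x))

/* A real function with type checking. The inline keyword is a hint:
 * the compiler may substitute the body at the call site. */
static inline long square_inline(long x) {
    return x * x;
}
```

SQUARE_MACRO(3 + 1) and square_inline(3 + 1) both evaluate to 16. The macro evaluates its argument twice, so arguments with side effects are safer with the inline function.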

PHP7 does a lot of optimization of this kind, moving work that used to be done at run time into the compilation phase. One example is argument type checking (Parameters Parsing): the type specification involved is a fixed character constant, so it can be resolved at compile time, which improves the efficiency of later execution.

For example, in the figure below the way argument types are passed is optimized from the form on the left to the macros on the right.
