What are the GCC parameters?

2025-02-25 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/03

What do the various GCC parameters do, and how do they differ? Many less experienced users are unsure about this, so this article summarizes the most useful options and how to set them.

Most programs and libraries are compiled with a default optimization level of 2 (the gcc option "-O2") and, on Intel/AMD platforms, target the i386 processor by default.

If the compiled program only needs to run on one specific platform, you can use more aggressive compiler optimization options to produce code that runs only on that platform.

One way is to edit the Makefile in each source package: find the CFLAGS and CXXFLAGS variables (the compilation options for the C and C++ compilers) and change their values.

Some source packages, such as binutils, gcc, and glibc, have a Makefile in every subdirectory, so editing them all by hand is too tedious!

A simpler way is to set the CFLAGS and CXXFLAGS environment variables. Most configure scripts use these two environment variables in place of the values in the Makefile.

A few configure scripts do not, however, and their Makefiles have to be edited by hand.

To set the CFLAGS and CXXFLAGS environment variables, run the following command in bash (you can also put it in .bashrc to make it the default):

export CFLAGS="-O3 -march=<cpu-type>" && export CXXFLAGS="$CFLAGS"

This is a minimum setting that ensures that it works on almost all platforms.
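As a minimal sketch of the environment-variable approach, the commands below export both variables and then start a child shell to confirm that it inherits them, which is how a configure script would see them. The CPU type pentium4 is only an assumed example value; substitute the type that matches your own machine.

```shell
# Assumed example values: pick the -march type that matches your CPU.
export CFLAGS="-O3 -march=pentium4"
export CXXFLAGS="$CFLAGS"

# A child process (for example ./configure) inherits exported variables.
sh -c 'echo "CFLAGS=$CFLAGS"; echo "CXXFLAGS=$CXXFLAGS"'
```

To apply the flags to a single build only, the same assignment can be written in front of the command, as in CFLAGS="-O2" ./configure.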

The "-march" option compiles binaries for a specific CPU type (they cannot run on lower-level CPUs).

For Intel the value is usually one of: pentium2, pentium3, pentium3m, pentium4, pentium4m, pentium-m, prescott, nocona

Note: pentium3m/pentium4m are the mobile (notebook) versions of the P3/P4; pentium-m is the CPU of first- and second-generation Centrino notebooks.

prescott is the P4 with SSE3 (famous for running hot enough to fry eggs); nocona is the newest P4 with EM64T (64-bit) support (it can also fry eggs).

For AMD the value is usually one of: k6, k6-2, k6-3, athlon, athlon-tbird, athlon-xp, athlon-mp, opteron, athlon64, athlon-fx

AMD users are usually DIY enthusiasts, so no explanation is needed.

If the compiler does not complain about "segmentation fault, core dumped" at compile time, the "-O" optimization level you chose is generally fine.

Otherwise, lower the optimization level step by step ("-O3" -> "-O2" -> "-O1" -> none).
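The step-down procedure can be sketched as a small shell function. The names try_levels and build_demo, and the file demo.c, are hypothetical; the function simply calls whatever build command you hand it with each level in turn.

```shell
# try_levels BUILD_CMD: call BUILD_CMD with -O3, -O2, -O1 and finally no
# optimization flag, stopping at the first level that builds successfully.
# BUILD_CMD is a hypothetical hook supplied by the caller.
try_levels() {
    for level in -O3 -O2 -O1 ""; do
        if "$1" "$level"; then
            echo "built with: ${level:-no optimization}"
            return 0
        fi
    done
    echo "build failed at every level" >&2
    return 1
}

# Example hook: compile one file. $1 is left unquoted on purpose so that
# the empty "no flag" case expands to nothing.
build_demo() { gcc $1 -c demo.c -o demo.o; }
```

Running try_levels build_demo in a directory that contains demo.c reports the highest level at which the file compiles.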

Personal opinion: servers can use "-O2", the safest optimization level; desktops can use "-O3".

Piling on many custom optimization options is discouraged; in practice they make no obvious difference in speed (sometimes "-O3" is even slower).

Compiling is very demanding on hardware, especially at high optimization levels; the slightest memory error can cause a fatal failure.

So do not overclock your computer while compiling (I always lower the clock frequency when compiling critical programs).

Note: the order of options matters; if two options conflict, the later one wins.

For example, "-O3" turns on -finline-functions, but with "-O3 -fno-inline-functions" you keep everything else in -O3 while turning function inlining off.
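One way to check how such a pair resolves is to ask the compiler which optimizations are active. The sketch below assumes a newer GCC release that supports the -Q --help=optimizers query (as far as I know it is not available in the 3.4 and 4.0 series discussed here); inline_state is a hypothetical helper name.

```shell
# inline_state FLAGS...: print whether -finline-functions ends up enabled
# for the given flags. Hypothetical helper; needs a GCC that understands
# -Q --help=optimizers.
inline_state() {
    gcc -Q --help=optimizers "$@" |
        awk '$1 == "-finline-functions" { print $2 }'
}

inline_state -O3                          # -O3 on its own
inline_state -O3 -fno-inline-functions    # explicit option added after -O3
```

The second call should report the option as disabled even though -O3 is still in effect for everything else.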

For more optimization options, see:

http://gcc.gnu.org/onlinedocs/gcc-3.4.4/gcc/Optimize-Options.html

http://gcc.gnu.org/onlinedocs/gcc-3.4.4/gcc/i386-and-x86_002d64-Options.html

http://gcc.gnu.org/onlinedocs/gcc-4.0.2/gcc/Optimize-Options.html

http://gcc.gnu.org/onlinedocs/gcc-4.0.2/gcc/i386-and-x86_002d64-Options.html

For a complete list of all GCC options, see:

http://gcc.gnu.org/onlinedocs/gcc-3.4.4/gcc/Option-Summary.html

http://gcc.gnu.org/onlinedocs/gcc-4.0.2/gcc/Option-Summary.html

Two more pages are worth consulting:

(for gentoo-1.4) safer optimization options

http://www.freehackers.org/gentoo/gccflags/flag_gcc3.html

(for gentoo-1.4) advanced optimization options

http://www.freehackers.org/gentoo/gccflags/flag_gcc3opt.html

*

Oh, I forgot to mention: "-O2" already enables most of the safe optimization options, so you need not worry about that long list.

First, here are the options that "-O3" adds on top of "-O2". You can add them individually as needed (they are safe):

[gcc-3.4.4]

-finline-functions lets the compiler choose some simple functions and expand them inline at the call site

-fweb allocates a separate pseudo-register for each web

-frename-registers tries to remove false dependencies in the code; this is most effective on machines with many registers.

[gcc-4.0.2]

-finline-functions same as above

-funswitch-loops moves branches whose conditions do not change inside the loop out of the loop

-fgcse-after-reload **I do not fully understand what this one does** [if any expert knows, please explain it to me; thanks in advance]

With "-O3" covered, let us turn to "-Os", an option commonly used on embedded systems. It is actually quite important: it optimizes for the size of the generated binary while still turning on all the "-O2" options that do not increase code size, so the common assumption that "-Os" produces slow code is wrong! The main difference from "-O2" is that it removes all the padding inserted for alignment, that is, it disables every option in the "-falign-*" family. Whether this necessarily slows the code down depends on the program. Reportedly, in some cases "-Os" is 14% faster than "-O3"! Try it out in practice and see for yourself.
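A quick way to start experimenting is to build the same source at several levels and compare the results. The sketch below only compares object-file sizes; demo.c is a hypothetical stand-in, and a toy file like this says little about real programs, so substitute your own code and add timing runs for a meaningful comparison.

```shell
workdir=$(mktemp -d) && cd "$workdir"

# Hypothetical test program; replace it with real code.
cat > demo.c <<'EOF'
#include <stdio.h>

int main(void)
{
    long sum = 0;
    long i;
    for (i = 0; i < 1000; i++)
        sum += i * i;
    printf("%ld\n", sum);
    return 0;
}
EOF

# Build the same source at three levels and print each object-file size.
for level in O2 O3 Os; do
    gcc -"$level" -c demo.c -o "demo-$level.o"
    printf '%s: %s bytes\n' "-$level" "$(wc -c < "demo-$level.o")"
done
```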

-

Below is a brief introduction to the [gcc-3.4.4] options that I consider most important. The complete list of GCC options is too long, and my energy is limited!

[Note] The options listed here are not on by default; just add the ones you need.

-w disables the output of warning messages

-Werror converts all warnings to errors

-Wall enables the most commonly useful warning messages (despite the name, not literally every warning)
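The three warning options can be compared on one small file. This is a sketch with a hypothetical source file warn.c that contains an unused variable, which is one of the things -Wall reports.

```shell
workdir=$(mktemp -d) && cd "$workdir"

# Hypothetical source file with an unused variable.
cat > warn.c <<'EOF'
int main(void)
{
    int unused;
    return 0;
}
EOF

gcc -w -c warn.c -o warn.o && echo "-w: built, warnings suppressed"
gcc -Wall -c warn.c -o warn.o && echo "-Wall: built, warning printed"
gcc -Wall -Werror -c warn.c -o warn.o || echo "-Wall -Werror: build rejected"
```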

-v displays the current version number of the compiler

-V specifies the version of gcc to run. It only works on machines with multiple versions of gcc installed.

-ansi compiles the program according to the ANSI standard, but does not disable GNU extensions that do not conflict with the standard (this option is rarely used)

-pedantic to require that your code conform strictly to the ISO standard, enable this option in addition to "-ansi" (rarely used)

-std= specifies the C language standard (for example c89 or gnu89); choosing a strict ISO standard such as c89 disables the GNU C extension keywords asm, typeof, and inline (rarely used)

-static the linker ignores dynamic libraries and resolves all references by including static object files directly in the output file.

-shared the linker generates a shared object, which can be dynamically linked with a program at run time to form a complete executable.

When you use the gcc command to create a shared library as its output, this option keeps the linker from treating the missing main() function as an error.

To work correctly, all the object modules that make up one shared library should be compiled consistently with the "-fpic" option and the same target platform options.
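As a sketch of that workflow on Linux, the commands below build a one-function shared library and a program that uses it. The file names greet.c, main.c, and libgreet.so are hypothetical, and LD_LIBRARY_PATH is the Linux convention for telling the dynamic loader where to look.

```shell
workdir=$(mktemp -d) && cd "$workdir"

# Hypothetical one-function library and a caller.
cat > greet.c <<'EOF'
#include <stdio.h>
void greet(void) { puts("hello from libgreet"); }
EOF
cat > main.c <<'EOF'
void greet(void);
int main(void) { greet(); return 0; }
EOF

# Compile every module of the library with the same -fpic setting,
# then link the objects into a shared object with -shared.
gcc -O2 -fpic -c greet.c -o greet.o
gcc -shared greet.o -o libgreet.so

# Link the program against the library and run it.
gcc main.c -L. -lgreet -o main
LD_LIBRARY_PATH=. ./main
```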

-shared-libgcc uses the shared version of libgcc; it has no effect on machines that do not have a shared version of libgcc.

-specs= the gcc driver reads the given file to determine which options should be passed to which subprocesses.

This option overrides the default configuration: the specified file is processed after the default specs file has been read, so that it can modify the defaults.

-pipe speeds up compilation by using pipes instead of temporary files to pass output from one stage to the next. Recommended.

-o specifies the output file and applies to every kind of output. Because only one file can be named, do not use this option when several output files are produced.

--help displays a list of gcc command-line options; combined with "-v", it also displays the options accepted by each subprocess that gcc invokes.

--target-help displays a list of command-line options specific to the target machine

-b specifies the target machine to compile for; by default, code is generated for the machine the compiler itself runs on.

The target machine is determined by naming the directory that contains the compiler, usually /usr/local/lib/gcc-lib//

-B specifies where to find the compiler's own executables, libraries, and data files; when a subprogram (such as cpp, as, ld) has to be run, it is looked up under this prefix.

The prefix can consist of several paths separated by colons, and the environment variable GCC_EXEC_PREFIX has the same effect as this option.

-I adds a directory to the search path for header files; the option can be repeated to specify several directories.

-dumpmachine displays the name of the target machine and does nothing else.

-dumpspecs displays the compiler's built-in specs, including the options it uses when compiling, assembling, and linking, and does nothing else.

-dumpversion displays the version number of the compiler itself and does nothing else
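The query options are harmless to try, since none of them compiles anything. A short sketch:

```shell
gcc -dumpmachine   # name of the target machine the compiler generates code for
gcc -dumpversion   # compiler version number only
gcc -v             # version plus configuration details (written to stderr)
```

The first two are convenient in scripts because each prints a single line on standard output.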

-falign-functions=N aligns the start address of every function to an N-byte boundary (N = 1, 2, 4, 8, 16, ...); the default is the machine's own default value, and N = 1 disables alignment.

-falign-jumps=N aligns branch targets to an N-byte boundary (N = 1, 2, 4, 8, 16, ...); the default is the machine's own default value, and N = 1 disables alignment.

-fno-align-labels recommended, to make sure it does not conflict with -falign-jumps (which "-O2" enables by default)

-fno-align-loops recommended, to make sure no superfluous no-op instructions are inserted in front of branch targets.

-fbranch-probabilities after a program has been compiled with "-fprofile-arcs" and run, producing a file that records how many times each block of code was executed, it can be compiled again with this option.

The information in that file is used to optimize frequently taken branches. Without it, gcc guesses which branches are likely to be taken often and optimizes for those.

This optimization information is stored in a file named after the source file with the suffix ".da".

-fno-guess-branch-probability by default gcc uses a randomized model to guess which branches are more likely to be taken often and optimizes accordingly; this option turns that off.

-fprofile-arcs after a program has been compiled with this option and run, producing a file that records the execution count of each code block, it can be compiled again with "-fbranch-probabilities".

The information in that file is used to optimize branches that are taken often. Without it, gcc guesses which branches will run frequently and optimizes for those.

This optimization information is stored in a file named after the source file with the suffix ".da".
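The two options describe a three-step workflow: build with instrumentation, run, rebuild with the collected data. The sketch below walks through it with a hypothetical program demo.c. Note that recent GCC releases write the counts to a file ending in ".gcda" rather than ".da", so check which suffix your version produces.

```shell
workdir=$(mktemp -d) && cd "$workdir"

# Hypothetical program with a branch that is almost always taken.
cat > demo.c <<'EOF'
#include <stdio.h>

int main(void)
{
    long hits = 0;
    long i;
    for (i = 0; i < 1000000; i++) {
        if (i % 1000 != 0)
            hits++;
    }
    printf("%ld\n", hits);
    return 0;
}
EOF

# Step 1: build with instrumentation that counts how often each arc runs.
gcc -O2 -fprofile-arcs demo.c -o demo

# Step 2: run the program on typical input; the counts are written to a
# profile data file when it exits.
./demo

# Step 3: rebuild, using the recorded counts to guide branch optimization.
gcc -O2 -fbranch-probabilities demo.c -o demo
```

The profile is only as good as the run in step 2, so the input used there should resemble real workloads.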

-fforce-addr requires addresses to be copied into registers before they are operated on. Since a needed address has often been loaded into a register earlier, this option can improve the code.

-fforce-mem requires values to be copied into registers before they are operated on. Since a needed value has often been loaded into a register earlier, this option can improve the code.

-ffreestanding the compiled program can run in a freestanding environment, which may lack the standard library and need not start at the main() function.

This option implies "-fno-builtin" and is equivalent to "-fno-hosted".

-fhosted the compiled program runs in a hosted environment, which provides the complete standard library and in which main() returns int.

-fno-builtin does not recognize any built-in function unless it is referenced with the "__builtin_" prefix.

-fmerge-all-constants tries to merge all constants and constant arrays across compilation units into a single copy. However, standard C/C++ requires every variable to have a distinct storage location, so this option can produce non-conforming behavior.

-fmove-all-movables moves all loop-invariant expressions out of the loop body; how much this helps depends on the loop structure in the source code.

-fnon-call-exceptions generates code that allows trapping instructions (such as invalid floating-point operations and invalid memory accesses) to throw exceptions; this requires platform-specific runtime support and is not available everywhere.

-fomit-frame-pointer does not keep the frame pointer in a register for functions that do not need one, which saves the instructions that store and restore it and frees the register for general use.

Every "-O" level turns this option on, but only on machines where the debugger can work without a frame pointer. It is recommended to set it explicitly when you do not need debugging.

-fno-optional-diags suppresses diagnostic messages that the C++ standard does not require.

-fpermissive reports code that does not conform to the standard as warnings rather than errors.

-fpic generates position-independent code (PIC) suitable for shared libraries; all memory addressing goes through the global offset table (GOT). This option is not valid on all machines.

To resolve an address, the memory location of the code itself has to be entered into the table as an item. This option produces object modules that can be stored in a shared library and loaded from it.

-fprefetch-loop-arrays generates array prefetch instructions, which can speed up programs that work on large arrays, such as large database-related software.

-freg-struct-return generates code that returns short structures in registers; if a structure does not fit in registers, memory is used instead.

-fstack-check performs the checks needed to keep the program stack from overflowing; this is usually only needed in a multithreaded environment.

-ftime-report displays compile-time statistics after compilation finishes

-funroll-loops if the number of iterations can be determined at compile time to be small and the loop body has few instructions, this option unrolls the loop, removing the loop itself by duplicating its instructions.

-finline-limit= the compiler will not inline functions larger than the given number of pseudo-instructions. The default is 600.

--param name=value gcc has internal limits on how far it optimizes code; adjusting these limits tunes the overall optimization behavior. The parameter names and their explanations are listed below:

Name    Explanation

max-delay-slot-insn-search    a larger value produces more optimized code but slows compilation; the default is 100

max-delay-slot-live-search    a larger value produces more optimized code but slows compilation; the default is 333

max-gcse-memory    the maximum amount of memory used for GCSE optimization; if it is too small, the optimization cannot be performed. The default is 50M.

max-gcse-passes    the maximum number of iterations of GCSE optimization. The default is 1.

*

With the command-line options covered, let us look at the settings related to hardware architecture (mainly the CPU) [for i386/x86_64 only]

The best known, "-march", was already covered above, so here are some others (only the practical ones):

-mfpmath=sse uses SSE instructions for floating-point math; it needs a CPU with full SSE support, that is, the P3 or later on Intel and the generation after athlon-tbird (athlon-xp) or later on AMD

-masm= outputs assembly instructions in the specified dialect, either "intel" or "att"; the default is "att"

-mieee-fp makes the compiler use IEEE floating-point comparisons, which correctly handle the case where the result of a comparison is unordered.

-malign-double aligns double, long double, and long long on two-word boundaries; this helps generate faster code, but the program uses more memory.

-m128bit-long-double makes long double 128 bits wide. CPUs newer than the Pentium prefer this layout.

-mregparm=N specifies the number of registers used to pass integer arguments (by default no registers are used); N can be at most 3.
