
Self-cultivation of embedded C language 09: strong and weak symbols in the process of linking

Updated 2025-04-18. From: SLTechnology News&Howtos, Shulou (Shulou.com), 06/02.

Attribute declaration: weak

GNU C provides the weak attribute, written as __attribute__((weak)), which converts a strong symbol into a weak symbol.

The usage is as follows.

    void __attribute__((weak)) func(void);
    int num __attribute__((weak));

When compiling a source program, the compiler treats every variable name and function name as just a symbol that represents an address. The compiler collects these symbols and stores them in a section called the symbol table.

A software project may contain many source files developed by different engineers. Suppose engineer A defines a global variable num in the source file A.c, and engineer B defines a global variable with the same name, num, in the source file B.c. When the program prints the value of num, which value is printed?

To answer this question, we need to understand how compiling and linking work. The basic process is quite simple and can be divided into three stages.

Compilation phase: the compiler compiles each source file, one at a time, into an object file with the .o suffix. Each object file consists of a code segment, a data segment, a symbol table, and so on.

Link phase: the linker combines the object files into one large file. The code segments from each object file are merged into one large code segment, the data segments are merged into one large data segment, and the symbol tables are merged into one large symbol table. Finally, the merged code segment, data segment, symbol table, and so on are combined into a single large file.

Relocation: because the object files have been merged, the addresses of the variables and functions in each object file have changed, so every reference to them must be updated. This process is called relocation. After relocation, an executable program that can run on the machine is generated.

In the project described above, a problem can arise during the link phase: a variable named num is defined in both A.c and B.c, so which one should the linker use?

At this time, it is necessary to introduce the concepts of strong symbols and weak symbols.

9.2 Strong and weak symbols

In a program, whether it is a variable name or a function name, it is just a symbol in the eyes of the compiler. Symbols can be divided into strong symbols and weak symbols.

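Strong symbol: a function name or an initialized global variable name.

Weak symbol: an uninitialized global variable name. (With GCC, uninitialized globals are treated this way only when compiling with -fcommon, which was the default before GCC 10.)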

In an engineering project, for the same global variable name and function name, we can generally boil down to the following three scenarios.

Strong symbol + strong symbol

Strong symbol + weak symbol

Weak symbol + weak symbol

Strong and weak symbols are how conflicts between variables and functions with the same name are resolved during linking. The linker generally follows three rules.

There is no room for two tigers on one mountain.

The strong and the weak can live together.

Among the weak, the biggest wins.

For convenience, I made this up as a jingle. It means the following. A project cannot contain two strong symbols with the same name: if you define two functions with the same name, or two initialized global variables with the same name, in a multi-file project, the linker reports a redefinition error. A strong symbol and a weak symbol with the same name, however, may coexist. For example, you can define an initialized global variable in one file and an uninitialized global variable with the same name in another, and the project still builds. When resolving such a conflict, the linker chooses the strong symbol and discards the weak one. Finally, if all symbols with the same name are weak, which one does the linker choose? The largest one, that is, the one that occupies the most storage space in memory.

Let's write a simple program to verify the above theory. Define two source files: main.c and func.c.

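    // func.c
    #include <stdio.h>
    int a = 1;   // strong symbol
    int b;       // weak symbol
    void func(void)
    {
        printf("func: a = %d\n", a);
        printf("func: b = %d\n", b);
    }

    // main.c
    #include <stdio.h>
    int a;       // weak symbol
    int b = 2;   // strong symbol
    void func(void);
    int main(void)
    {
        printf("main: a = %d\n", a);
        printf("main: b = %d\n", b);
        func();
        return 0;
    }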

Compile and run the program to see the result.

    $ gcc -o a.out main.c func.c
    $ ./a.out
    main: a = 1
    main: b = 2
    func: a = 1
    func: b = 2

We define two global variables, a and b, with the same names in both main.c and func.c, but in each pair one is a strong symbol and the other is a weak symbol. When the linker sees conflicting symbols with the same name, it chooses the strong symbol, so both the main function and the func function print the values of the strong symbols.

Generally speaking, it is not recommended to define multiple weak symbols of different types with the same name in a project, because this can cause a variety of problems during the build. I will not give an example here. Two strong symbols with the same name, that is, initialized global variables or functions, cannot be defined in one project, otherwise a redefinition error is reported. However, we can use the weak attribute extended by GNU C to convert a strong symbol into a weak symbol.

    // func.c
    #include <stdio.h>
    int a __attribute__((weak)) = 1;
    void func(void)
    {
        printf("func: a = %d\n", a);
    }

    // main.c
    #include <stdio.h>
    int a = 4;
    void func(void);
    int main(void)
    {
        printf("main: a = %d\n", a);
        func();
        return 0;
    }

Compile and run the program to see the result.

    $ gcc -o a.out main.c func.c
    $ ./a.out
    main: a = 4
    func: a = 4

We use the weak attribute to convert the global variable a in func.c into a weak symbol, and then define a global variable a in main.c and initialize it to 4. The linker selects the strong symbol in main.c when linking, so the value of a printed in both files is 4.

9.3 Strong and weak symbols of functions

The linker follows the rules above when handling conflicts between variables with the same name, and it follows the same rules for functions with the same name. A function name is a strong symbol by default. If you define two functions with the same name in a project, a redefinition error is reported at link time. However, we can convert one of the functions into a weak symbol with the weak attribute.

    // func.c
    #include <stdio.h>
    int a __attribute__((weak)) = 1;
    void __attribute__((weak)) func(void)
    {
        printf("func: a = %d\n", a);
    }

    // main.c
    #include <stdio.h>
    int a = 4;
    void func(void)
    {
        printf("func: I am a strong symbol!\n");
    }
    int main(void)
    {
        printf("main: a = %d\n", a);
        func();
        return 0;
    }

Compile and run the program to see the result.

    $ gcc -o a.out main.c func.c
    $ ./a.out
    main: a = 4
    func: I am a strong symbol!

In this program example, we redefine a func function with the same name in main.c, and then convert the func () function in the func.c file to a weak symbol through the weak attribute declaration. The linker selects the strong symbol in main.c when linking, so when we call func () in the main function, we actually call the func () function in the main.c file.

9.4 The use of weak symbols

When we reference a variable or function in a source file, the file generally compiles as long as the symbol is declared, even if it is not defined there. This is because compilation works one file at a time: the compiler first turns each source file into a .o object file. As long as the compiler can see a declaration of the function or variable, it assumes the definition may be in another file and does not report an error. For a function call with no declaration in scope, older compilers give a warning at most, while newer ones treat it as an error. The link phase is different: if the linker cannot find the definition of the variable or function in any object file or library, it reports an undefined reference error.

When a function is declared as a weak symbol, something special happens: if the linker cannot find a definition of the function, it does not report an error. Instead, the weak symbol is resolved to 0 or a special value. An error occurs only at run time, when the program calls the function and jumps to address 0 or to the special address.

    // func.c
    int a __attribute__((weak)) = 1;

    // main.c
    #include <stdio.h>
    int a = 4;
    void __attribute__((weak)) func(void);
    int main(void)
    {
        printf("main: a = %d\n", a);
        func();
        return 0;
    }

Compile and run the program to see the result.

    $ gcc -o a.out main.c func.c
    $ ./a.out
    main: a = 4
    Segmentation fault (core dumped)

In this sample program, we do not define the func() function. We only declare it in main.c, as a weak symbol. The project compiles and links without complaint, and the error appears only when the program runs.

To prevent this run-time error, we can check before calling the function whether its address is 0, and only call it when it is not. In this way, the segmentation fault is avoided. The sample code is as follows.

    // func.c
    int a __attribute__((weak)) = 1;

    // main.c
    #include <stdio.h>
    int a = 4;
    void __attribute__((weak)) func(void);
    int main(void)
    {
        printf("main: a = %d\n", a);
        if (func)   // call func() only if a definition was found
            func();
        return 0;
    }

Compile and run the program to see the result.

    $ gcc -o a.out main.c func.c
    $ ./a.out
    main: a = 4

The essence of a function name is an address. Before calling func, we first determine whether it is 0. If it is 0, we will not call it. Just skip it. You will find that with this design, even if the func () function is not defined, our whole project can be compiled, linked, and run normally!

This feature of weak symbols is widely used in libraries. For example, suppose you are developing a library in which the basic functions are implemented but some advanced functions are not yet. You can declare the advanced functions with the weak attribute so that they become weak symbols. Then, even though the functions are not defined, the application only needs to check that the function address is non-zero before calling, and the program runs normally. When you later release a new version of the library that implements the advanced functions, the application needs no modification and calls them directly.

Another advantage of weak symbols is that if we are not satisfied with the implementation of a library function, we can define our own function with the same name to provide better behavior. For example, the gets() function defined in the C standard library has a vulnerability and is often the target of stack overflow attacks.

    #include <stdio.h>
    int main(void)
    {
        char a[10];
        gets(a);
        puts(a);
        return 0;
    }

The library function gets() defined by the C standard is mainly used to read a string. Its flaw is that it relies only on the newline to determine the end of user input and never checks the size of the buffer. Such a design can easily cause stack overflows. For example, in the program above we define an array of 10 characters to store the string entered by the user. When we enter a string of 10 or more characters, it no longer fits in the array together with its terminating null character, and a memory error occurs.

Then we define a function with the same name as gets () and call it directly in the main function. The code is as follows.

    #include <stdio.h>
    char *gets(char *str)
    {
        printf("hello world!\n");
        return (char *)0;
    }
    int main(void)
    {
        char a[10];
        gets(a);
        return 0;
    }

The running result of the program is as follows.

    hello world!

From the running results, we can see that although we have defined a gets () function with the same name as the C standard library function, the compilation can be passed. When the program calls the gets () function at runtime, it jumps to our custom gets () function to run.

Attribute declaration: alias

GNU C extends an alias attribute, which is simple and is mainly used to define an alias for the function.

    #include <stdio.h>
    void __f(void)
    {
        printf("__f\n");
    }
    void f(void) __attribute__((alias("__f")));
    int main(void)
    {
        f();
        return 0;
    }

The running result of the program is as follows.

    __f

Through the alias attribute, we define an alias f() for the __f() function. From then on, whenever we want to call __f(), we can call it through f().

In the Linux kernel, you will find that alias is sometimes used with the weak attribute. For example, when some functions are upgraded with the kernel version, the function interface changes. We can use the alias attribute to encapsulate the old interface name and name a new interface.

    // f.c
    #include <stdio.h>
    void __f(void)
    {
        printf("__f()\n");
    }
    void f(void) __attribute__((weak, alias("__f")));

    // main.c
    #include <stdio.h>
    void __attribute__((weak)) f(void);
    void f(void)
    {
        printf("f()\n");
    }
    int main(void)
    {
        f();
        return 0;
    }

If we define a new f() function in main.c, then calling f() in the main function calls that newly defined function. If f() is not newly defined, the call goes to the __f() function instead.

This tutorial is adapted from the C Language Embedded Linux Advanced Programming video tutorial, No. 05. The electronic version of the book can be downloaded by joining QQ group 475504428. For more embedded video tutorials, you can follow:

WeChat official account: Otaku tribe (armlinuxfun)

51CTO College-Mr. Wang Litao: http://edu.51cto.com/sd/d344f
