2025-04-04 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
In this issue, the editor walks you through the structure of the PE file. The article is rich in content and analyzes the topic from a professional point of view; I hope you get something out of it.
1. The structure of PE file
1. What is an executable file?
An executable file is a file that can be loaded and executed by the operating system.
Format of executable file:
-Windows platform: PE (Portable Executable) file structure
-Linux platform: ELF (Executable and Linking Format) file structure
PE and ELF are very similar because both derive from the same executable file format, COFF:
-COFF is a format specification first proposed and used by Unix System V Release 3
-Microsoft developed the PE format standard based on COFF and used it in the Windows NT system of that time
-System V Release 4 introduced the ELF format, also based on COFF
In fact, on the Windows platform the object files generated by the Visual C++ compiler are still in COFF format, while the executable files are in PE format.
Microsoft's PE file structure on the 64-bit Windows platform is called PE32+; it widens some of the original 32-bit fields (for example ImageBase) to 64 bits.
2. Characteristics of PE files
To identify whether a file is a PE file, do not rely on the file extension alone; check the PE fingerprint as well.
Open an exe file with a hex editor such as UE: the first two bytes of the file are "MZ", and an address is stored at offset 0x3C. Go to that address and you will find "PE" stored there, so you can basically conclude that the file is a PE file.
These two pieces of information ("MZ" and "PE"), which verify that a file is a PE file, are called the PE fingerprint.
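The fingerprint check can be sketched in a few lines of C. This is only an illustration that works on a byte buffer already read from disk; the helper names read_u32_le and has_pe_fingerprint are made up for this example and are not part of any Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return 1 if buf carries both fingerprints: "MZ" at offset 0 and
 * "PE\0\0" at the offset stored at 0x3C. Return 0 otherwise. */
static int has_pe_fingerprint(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return 0;

    uint32_t pe_offset = read_u32_le(buf + 0x3C);
    if (pe_offset > len || len - pe_offset < 4)
        return 0;                       /* offset points outside the file */

    return memcmp(buf + pe_offset, "PE\0\0", 4) == 0;
}
```

The bounds checks matter in practice: the value at 0x3C comes from the file itself, so it has to be validated before it is used as an offset.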
3. The overall structure of the PE file
Here the main body of a PE file is divided into four parts. A rough idea is enough for now; each part is explained in detail later.
"Section" and "block" mean the same thing, and both terms are used interchangeably below.
Let's first look at the structure of a PE file as a whole at the binary level.
4. PE file to memory mapping
The structure of PE files when stored on disk is different from that when loaded into memory.
When the PE file is loaded into memory through the Windows loader, the in-memory version is called a Module.
The starting address of the mapping file is called the module handle (hModule), also known as the base address (ImageBase).
(Is the module handle different from other handles?)
File data is generally aligned to 512 bytes (1 sector), although 4K is now common as well; data in 32-bit memory is generally aligned to 4K (1 page). 512D = 200H, 4096D = 1000H.
The size of a block in the file is an integer multiple of 200H, and the size of a block in memory is an integer multiple of 1000H. The actual data keeps its size after mapping, and the excess space is filled with zeros.
There is no gap between the PE file header (DOS header + PE header) and the block table, but there is a gap between them and the block, and the size depends on the alignment parameters.
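The rounding implied by these alignment rules can be written as a one-line helper. The sketch below assumes the alignment is a power of two, which holds for the 200H and 1000H values used here; align_up is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment.
 * Assumes alignment is a non-zero power of two (e.g. 0x200 or 0x1000). */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}
```

For example, a block holding 31C80H bytes of data occupies align_up(0x31C80, 0x200) = 0x31E00 bytes in the file and align_up(0x31C80, 0x1000) = 0x32000 bytes in memory.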
With the VC compiler's default settings, the base address of an exe file is 0x400000 and the base address of a DLL file is 0x10000000.
VA: virtual memory address
RVA: relative virtual address, that is, offset address from base address
FOA: file offset address
5. DOS part
The DOS MZ header is actually an IMAGE_DOS_HEADER, which occupies 64 bytes
typedef struct _IMAGE_DOS_HEADER {      // DOS .EXE header
    WORD   e_magic;                     // Magic number
    WORD   e_cblp;                      // Bytes on last page of file
    WORD   e_cp;                        // Pages in file
    WORD   e_crlc;                      // Relocations
    WORD   e_cparhdr;                   // Size of header in paragraphs
    WORD   e_minalloc;                  // Minimum extra paragraphs needed
    WORD   e_maxalloc;                  // Maximum extra paragraphs needed
    WORD   e_ss;                        // Initial (relative) SS value
    WORD   e_sp;                        // Initial SP value
    WORD   e_csum;                      // Checksum
    WORD   e_ip;                        // Initial IP value
    WORD   e_cs;                        // Initial (relative) CS value
    WORD   e_lfarlc;                    // File address of relocation table
    WORD   e_ovno;                      // Overlay number
    WORD   e_res[4];                    // Reserved words
    WORD   e_oemid;                     // OEM identifier (for e_oeminfo)
    WORD   e_oeminfo;                   // OEM information; e_oemid specific
    WORD   e_res2[10];                  // Reserved words
    LONG   e_lfanew;                    // File address of new exe header
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;
The DOS header was used by 16-bit systems and is redundant data on 32-bit systems, but two of its members are still important: e_magic (offset 0x0) and e_lfanew (offset 0x3C).
e_magic stores the characters "MZ", and e_lfanew stores the address of the PE file header, which is used to find the PE header and read the PE file identifier "PE".
e_magic and e_lfanew are the fields that matter for verifying the PE fingerprint; the other fields are basically unused (they can be filled with arbitrary data).
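The size and offsets quoted above can be checked at compile time. The sketch below restates the DOS header with fixed-width types so that it builds without windows.h; DosHeader and pe_header_offset are hypothetical names, and the field order follows IMAGE_DOS_HEADER in winnt.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as IMAGE_DOS_HEADER, spelled with fixed-width types. */
typedef struct {
    uint16_t e_magic;      /* "MZ" */
    uint16_t e_cblp;
    uint16_t e_cp;
    uint16_t e_crlc;
    uint16_t e_cparhdr;
    uint16_t e_minalloc;
    uint16_t e_maxalloc;
    uint16_t e_ss;
    uint16_t e_sp;
    uint16_t e_csum;
    uint16_t e_ip;
    uint16_t e_cs;
    uint16_t e_lfarlc;
    uint16_t e_ovno;
    uint16_t e_res[4];
    uint16_t e_oemid;
    uint16_t e_oeminfo;
    uint16_t e_res2[10];
    int32_t  e_lfanew;     /* file offset of the PE header */
} DosHeader;

/* Compile-time checks of the numbers quoted in the text. */
_Static_assert(sizeof(DosHeader) == 64, "DOS header must be 64 bytes");
_Static_assert(offsetof(DosHeader, e_magic) == 0x00, "e_magic at 0x00");
_Static_assert(offsetof(DosHeader, e_lfanew) == 0x3C, "e_lfanew at 0x3C");

/* Return the PE header offset stored in a DOS header. */
static int32_t pe_header_offset(const DosHeader *dos)
{
    assert(dos != NULL);
    return dos->e_lfanew;
}
```

If a compiler ever inserted padding into the structure, the static assertions would fail the build instead of letting a wrong offset slip through.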
The data in the "DOS Stub" area is filled in by the linker (you can also fill in whatever data you like). It is a small piece of code that can run under DOS.
The only function of this code (which e_cs and e_ip point to) is to print one line to the terminal: "This program cannot be run in DOS mode."
It then exits, indicating that the program cannot be run under DOS.
6. PE header (PE Header)
The PE header is an IMAGE_NT_HEADERS32, which contains two other structures and occupies 4B + 20B + 224B = 248 bytes
typedef struct _IMAGE_NT_HEADERS {
    DWORD                   Signature;      // PE file identifier, 4 bytes
    IMAGE_FILE_HEADER       FileHeader;     // 20 bytes
    IMAGE_OPTIONAL_HEADER32 OptionalHeader; // 224 bytes for a PE32 executable; PE32+ is not discussed here
} IMAGE_NT_HEADERS32, *PIMAGE_NT_HEADERS32;
The Signature field is set to 0x00004550, which is the ASCII string "PE\0\0". It marks the beginning of the PE header, and this signature must not be corrupted.
1. IMAGE_FILE_HEADER structure
The IMAGE_FILE_HEADER structure (image file header, or standard PE header) contains some basic information about the PE file. Microsoft's official documentation calls it the standard Common Object File Format (COFF) header.
typedef struct _IMAGE_FILE_HEADER {
    WORD  Machine;              // which CPU the file can run on. 0: any; Intel 386 and later: 0x014C; x64: 0x8664
    WORD  NumberOfSections;     // number of blocks (sections) in the file
    DWORD TimeDateStamp;        // file creation time, in seconds since January 1, 1970 (GMT); filled by the compiler, unimportant
    DWORD PointerToSymbolTable; // pointer to the symbol table (for debugging)
    DWORD NumberOfSymbols;      // number of symbols in the symbol table (for debugging)
    WORD  SizeOfOptionalHeader; // size of the IMAGE_OPTIONAL_HEADER structure: E0H for 32-bit, F0H for 64-bit
    WORD  Characteristics;      // file attributes
} IMAGE_FILE_HEADER, *PIMAGE_FILE_HEADER;
Important fields: NumberOfSections, SizeOfOptionalHeader
The corresponding structure is the part marked with a purple line in the following figure.
0x014C indicates that the file runs on an x86 CPU; 0x0007 indicates that the current exe has 7 sections.
0x00E0 indicates that IMAGE_OPTIONAL_HEADER32 is 224 bytes.
0x030F (0000 0011 0000 1111) is the file attribute value; each bit set to 1 corresponds to one attribute flag.
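Decoding such a value is a matter of testing each bit against a flag table. The sketch below lists a subset of the IMAGE_FILE_* flag values from winnt.h; the FileFlag type and the decode_file_flags function are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t    mask;
    const char *name;
} FileFlag;

/* A subset of the Characteristics flags defined in winnt.h. */
static const FileFlag FILE_FLAGS[] = {
    {0x0001, "IMAGE_FILE_RELOCS_STRIPPED"},
    {0x0002, "IMAGE_FILE_EXECUTABLE_IMAGE"},
    {0x0004, "IMAGE_FILE_LINE_NUMS_STRIPPED"},
    {0x0008, "IMAGE_FILE_LOCAL_SYMS_STRIPPED"},
    {0x0020, "IMAGE_FILE_LARGE_ADDRESS_AWARE"},
    {0x0100, "IMAGE_FILE_32BIT_MACHINE"},
    {0x0200, "IMAGE_FILE_DEBUG_STRIPPED"},
    {0x1000, "IMAGE_FILE_SYSTEM"},
    {0x2000, "IMAGE_FILE_DLL"},
};

/* Store the names of all flags set in value into out (at most cap
 * entries) and return how many names were stored. */
static size_t decode_file_flags(uint16_t value, const char *out[], size_t cap)
{
    assert(out != NULL);
    size_t n = 0;
    for (size_t i = 0; i < sizeof FILE_FLAGS / sizeof FILE_FLAGS[0]; i++) {
        if ((value & FILE_FLAGS[i].mask) != 0 && n < cap)
            out[n++] = FILE_FLAGS[i].name;
    }
    return n;
}
```

For 0x030F this yields six names: the four "stripped/executable" flags in the low nibble plus IMAGE_FILE_32BIT_MACHINE and IMAGE_FILE_DEBUG_STRIPPED.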
2. IMAGE_OPTIONAL_HEADER structure
IMAGE_OPTIONAL_HEADER (optional image header or extended PE header) is an optional structure and is an extension of the IMAGE_FILE_HEADER structure
The size is recorded by the SizeOfOptionalHeader field of the IMAGE_FILE_HEADER structure (may not be accurate)
typedef struct _IMAGE_OPTIONAL_HEADER {
    //
    // Standard fields.
    //
    WORD  Magic;                       // file type. PE32: 10BH, PE32+: 20BH, ROM image file: 107H
    BYTE  MajorLinkerVersion;          // linker major version number
    BYTE  MinorLinkerVersion;          // linker minor version number
    DWORD SizeOfCode;                  // sum of all code sections (based on file alignment); filled by the compiler, not used
    DWORD SizeOfInitializedData;       // total size of all initialized data sections; filled by the compiler, not used
    DWORD SizeOfUninitializedData;     // total size of sections containing uninitialized data; filled by the compiler, not used
    DWORD AddressOfEntryPoint;         // program entry RVA. In most executables this address does not point directly to
                                       // main, WinMain, or DllMain, but to runtime library code that calls those functions
    DWORD BaseOfCode;                  // code start RVA; filled by the compiler, not used
    DWORD BaseOfData;                  // data start RVA; filled by the compiler, not used
    //
    // NT additional fields.
    //
    DWORD ImageBase;                   // memory image base address; can be set at link time
    DWORD SectionAlignment;            // memory alignment, generally one page (4K)
    DWORD FileAlignment;               // file alignment, generally one sector (512 bytes); 4K is also seen now
    WORD  MajorOperatingSystemVersion; // operating system major version number
    WORD  MinorOperatingSystemVersion; // operating system minor version number
    WORD  MajorImageVersion;           // major version number of the PE file itself
    WORD  MinorImageVersion;           // minor version number of the PE file itself
    WORD  MajorSubsystemVersion;       // major version number of the subsystem required to run
    WORD  MinorSubsystemVersion;       // minor version number of the subsystem required to run
    DWORD Win32VersionValue;           // must be 0
    DWORD SizeOfImage;                 // size of the whole PE file once mapped into memory; may be larger than the
                                       // actual value, and must be an integer multiple of SectionAlignment
    DWORD SizeOfHeaders;               // size of all headers plus the section table, rounded to the file alignment;
                                       // otherwise loading fails
    DWORD CheckSum;                    // checksum, required for some system files; used to detect whether the file was modified
    WORD  Subsystem;                   // subsystem: driver (1), graphical interface (2), console, DLL (3)
    WORD  DllCharacteristics;          // file characteristics; not specific to DLL files
    DWORD SizeOfStackReserve;          // stack size reserved at initialization
    DWORD SizeOfStackCommit;           // stack size actually committed at initialization
    DWORD SizeOfHeapReserve;           // heap size reserved at initialization
    DWORD SizeOfHeapCommit;            // heap size actually committed at initialization
    DWORD LoaderFlags;
    DWORD NumberOfRvaAndSizes;         // number of data directory entries
    IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES]; // data directory table
} IMAGE_OPTIONAL_HEADER32, *PIMAGE_OPTIONAL_HEADER32;
Important fields:
AddressOfEntryPoint: program entry address (RVA), 32C40H in the figure below
ImageBase: memory image base address, 400000H in the figure below
FileAlignment: file alignment, 200H in the figure below
SectionAlignment: memory alignment, 1000H in the figure below
DataDirectory[16]: data directory table, consisting of 16 identical IMAGE_DATA_DIRECTORY structures
that point to the export table, import table, resource block, relocation table, and so on (covered later)
typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD VirtualAddress;   // starting RVA of the corresponding table
    DWORD Size;             // length of the corresponding table
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
ImageBase + AddressOfEntryPoint = the address at which the program actually starts executing (provided the actual load address equals ImageBase)
0x400000 + 0x32C40 = 0x432C40 (run the program in OD and you will see that execution starts at this address)
Application: add code to a blank area of the PE file so that the program executes the added code first and then jumps to the original program entry.
Approach:
① Construct a piece of code in a blank area of the PE file (call -> E8)
② Change the entry address to point to the new code (IMAGE_OPTIONAL_HEADER.AddressOfEntryPoint)
③ After the new code has executed, jump back to the original entry address (jmp -> E9)
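Both E8 (call) and E9 (jmp) take a 32-bit displacement that is relative to the address of the next instruction, so the operand is target - (instruction address + 5). The following is a minimal sketch of that encoding for 32-bit code; encode_rel32 is a hypothetical helper, and the address 0x433000 used in the note after it is an invented location for the added code.

```c
#include <assert.h>
#include <stdint.h>

/* Encode a 5-byte near call (E8 rel32) or near jump (E9 rel32) that is
 * located at address `from` and targets address `to`. */
static void encode_rel32(uint8_t opcode, uint32_t from, uint32_t to,
                         uint8_t out[5])
{
    assert(opcode == 0xE8 || opcode == 0xE9);
    assert(out != 0);

    /* Displacement is measured from the end of the 5-byte instruction;
     * unsigned arithmetic wraps modulo 2^32, which gives the right
     * two's-complement value for backward jumps. */
    uint32_t rel = to - (from + 5);

    out[0] = opcode;
    out[1] = (uint8_t)(rel & 0xFF);
    out[2] = (uint8_t)((rel >> 8) & 0xFF);
    out[3] = (uint8_t)((rel >> 16) & 0xFF);
    out[4] = (uint8_t)((rel >> 24) & 0xFF);
}
```

If the jump back to the original entry 0x432C40 were placed at 0x433000, the displacement would be 0x432C40 - 0x433005 = 0xFFFFFC3B, and the instruction bytes would be E9 3B FC FF FF.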
7. Block table
The block table is an array of IMAGE_SECTION_HEADER structures with 40 bytes per IMAGE_SECTION_HEADER structure.
Each IMAGE_SECTION_HEADER structure contains information about the block it is associated with, such as location, length, and attributes.
#define IMAGE_SIZEOF_SHORT_NAME 8
typedef struct _IMAGE_SECTION_HEADER {
    BYTE  Name[IMAGE_SIZEOF_SHORT_NAME]; // block name. Most block names start with "." (e.g. ".text"),
                                         // but the "." is not required
    union {
        DWORD PhysicalAddress;
        DWORD VirtualSize;               // the second member is the one commonly used: actual size of the block
                                         // when loaded into memory (before alignment). Why can it differ? Sometimes
                                         // uninitialized global variables are not placed in a bss section but
                                         // extend this block instead
    } Misc;
    DWORD VirtualAddress;                // RVA where the block is loaded into memory (after memory alignment,
                                         // always an integer multiple of SectionAlignment)
    DWORD SizeOfRawData;                 // space the block occupies in the file (after file alignment). VirtualSize
                                         // may be larger than SizeOfRawData, e.g. for a BSS section (SizeOfRawData
                                         // is 0) or a data section (depending on where uninitialized variables go)
    DWORD PointerToRawData;              // offset of the block in the file (FOA)
    /* debugging related, can be ignored */
    DWORD PointerToRelocations;          // used in ".obj" files: pointer to the relocation table
    DWORD PointerToLinenumbers;
    WORD  NumberOfRelocations;           // number of relocation entries (used in OBJ files)
    WORD  NumberOfLinenumbers;
    DWORD Characteristics;               // block attributes: a set of flags (code / data, readable / writable, etc.)
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;
Important fields: Name[8], VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData, Characteristics
The NumberOfSections field of IMAGE_FILE_HEADER records the number of sections in the current file.
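Because the block table directly follows the optional header, its file offset can be computed as e_lfanew + 4 (Signature) + 20 (IMAGE_FILE_HEADER) + SizeOfOptionalHeader. The sketch below does this on a raw byte buffer and assumes the PE fingerprint has already been verified; rd16, rd32, and locate_section_table are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Find the block table in a PE image held in buf.
 * On success, store its file offset and entry count and return 1.
 * Return 0 if the headers or the table do not fit inside the buffer. */
static int locate_section_table(const uint8_t *buf, size_t len,
                                uint32_t *table_offset, uint16_t *count)
{
    assert(buf != NULL && table_offset != NULL && count != NULL);
    if (len < 0x40)
        return 0;

    uint32_t nt = rd32(buf + 0x3C);             /* e_lfanew */
    if (nt > len || len - nt < 24)              /* Signature + file header */
        return 0;

    uint16_t sections = rd16(buf + nt + 4 + 2);   /* NumberOfSections */
    uint16_t opt_size = rd16(buf + nt + 4 + 16);  /* SizeOfOptionalHeader */

    uint64_t start = (uint64_t)nt + 24 + opt_size;
    uint64_t end = start + (uint64_t)sections * 40; /* 40 bytes per entry */
    if (end > len)
        return 0;

    *table_offset = (uint32_t)start;
    *count = sections;
    return 1;
}
```

Reading SizeOfOptionalHeader from the file instead of hard-coding 224 keeps the same code usable for PE32+ images, whose optional header is larger.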
31C80H is the size of the code block in memory before alignment (VirtualSize); 1000H is the RVA at which the code block is loaded into memory (VirtualAddress).
31E00H is the size of the block in the file after file alignment (SizeOfRawData); 400H is the offset of the code block in the file (PointerToRawData).
60000020H is the code block's attribute value (0110 0000 0000 0000 0000 0000 0010 0000); looking it up in the table below shows that the block contains code and is readable and executable.
More attribute reference: https://docs.microsoft.com/zh-cn/windows/win32/api/winnt/ns-winnt-image_section_header
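The three bits set in 60000020H can be tested directly. The sketch below uses the IMAGE_SCN_* flag values from winnt.h (restated as local constants so it builds without windows.h); section_permissions and section_has_code are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as defined in winnt.h. */
#define SCN_CNT_CODE    0x00000020u  /* IMAGE_SCN_CNT_CODE */
#define SCN_MEM_EXECUTE 0x20000000u  /* IMAGE_SCN_MEM_EXECUTE */
#define SCN_MEM_READ    0x40000000u  /* IMAGE_SCN_MEM_READ */
#define SCN_MEM_WRITE   0x80000000u  /* IMAGE_SCN_MEM_WRITE */

/* Write a 3-character permission string such as "R-X" into out. */
static void section_permissions(uint32_t characteristics, char out[4])
{
    assert(out != NULL);
    out[0] = (characteristics & SCN_MEM_READ)    ? 'R' : '-';
    out[1] = (characteristics & SCN_MEM_WRITE)   ? 'W' : '-';
    out[2] = (characteristics & SCN_MEM_EXECUTE) ? 'X' : '-';
    out[3] = '\0';
}

/* Return 1 if the block is marked as containing code. */
static int section_has_code(uint32_t characteristics)
{
    return (characteristics & SCN_CNT_CODE) != 0;
}
```

For 60000020H the permission string is "R-X" and section_has_code returns 1, which matches the reading "code, readable, executable".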
8. The conversion between RVA and FOA
RVA: relative virtual address; FOA: file offset address.
Calculation steps:
① Calculate RVA = virtual memory address - ImageBase
② If the RVA lies within the PE header: FOA == RVA
③ Determine which section the RVA is in:
RVA >= section.VirtualAddress (the section's RVA after memory alignment)
RVA
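The text breaks off in the middle of step ③, so the sketch below fills in the rest with the commonly used approach, and those parts are assumptions rather than the author's wording: the upper bound is taken as VirtualAddress + SizeOfRawData (so the result is always backed by file data), the "inside the PE header" test uses SizeOfHeaders, and the final offset is RVA - VirtualAddress + PointerToRawData. SectionInfo, va_to_rva, and rva_to_foa are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The four IMAGE_SECTION_HEADER fields the conversion needs. */
typedef struct {
    uint32_t VirtualSize;
    uint32_t VirtualAddress;
    uint32_t SizeOfRawData;
    uint32_t PointerToRawData;
} SectionInfo;

/* Step 1: RVA = virtual address - ImageBase. */
static uint32_t va_to_rva(uint32_t va, uint32_t image_base)
{
    return va - image_base;
}

/* Steps 2 and 3: convert an RVA to a file offset.
 * Returns 1 and stores the result in *foa, or returns 0 if the RVA is
 * not backed by any data in the file. */
static int rva_to_foa(uint32_t rva, uint32_t size_of_headers,
                      const SectionInfo *sections, size_t count,
                      uint32_t *foa)
{
    assert(foa != NULL);
    assert(sections != NULL || count == 0);

    if (rva < size_of_headers) {        /* inside the headers: FOA == RVA */
        *foa = rva;
        return 1;
    }
    for (size_t i = 0; i < count; i++) {
        const SectionInfo *s = &sections[i];
        if (rva >= s->VirtualAddress &&
            rva - s->VirtualAddress < s->SizeOfRawData) {
            *foa = rva - s->VirtualAddress + s->PointerToRawData;
            return 1;
        }
    }
    return 0;
}
```

With the code block values quoted earlier (VirtualAddress 1000H, SizeOfRawData 31E00H, PointerToRawData 400H), the entry point RVA 32C40H maps to 32C40H - 1000H + 400H = 32040H in the file.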