2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article explains how to use Ghidra to reverse engineer Go binaries. The part below covers string recovery: first string structures that are built at run time, then string structures that are stored statically.
Dynamically allocated string structures
In the first case, the string structure is created at run time, so a sequence of assembly instructions sets up the structure before the string is used. Because instruction sets differ, this sequence differs between architectures. The following cases show the instruction sequences that our script (find_dynamic_strings.py) looks for.
Dynamically allocated string structures on x86
First, let's look at the "Hello Hacktivity" example.
Figure 20 dynamic allocation of string structure in hello_go
Figure 21 undefined "hello, hacktivity" string in hello_go
After running the script, the code looks like this:
Figure 22 string structure dynamically allocated in hello_go after find_dynamic_strings.py is executed
As you can see, the string has been defined:
Figure 23 the "hello, hacktivity" string defined in hello_go
At the same time, the string "hacktivity" can also be found in Ghidra's Defined Strings view.
Figure 24 filters strings defined in hello_go through "hacktivity"
Experiments show that our script can find the following instruction sequences in 32-bit and 64-bit x86 binaries:
Figure 25 dynamic allocation of eCh0raix string structure
Figure 26 dynamically allocated string structure in hello_go
Dynamically allocated string structures on ARM
For the 32-bit ARM architecture, we will take the eCh0raix ransomware sample as an example to illustrate the string recovery method.
Figure 27 dynamic allocation of string structure in eCh0raix
Figure 28 pointer to string address in eCh0raix
Figure 29 undefined strings in eCh0raix
After the script is executed, the code looks like this:
Figure 30 after the execution of find_dynamic_strings.py, the string structure is dynamically allocated in eCh0raix
We can see that the pointer has been renamed and the string defined:
Figure 31 A pointer to a string address in eCh0raix after find_dynamic_strings.py is executed
Figure 32 after find_dynamic_strings.py is executed, the string defined in eCh0raix
The script looks for the following instruction sequence in 32-bit ARM binaries:
For the 64-bit ARM architecture, we demonstrate string recovery on a Kaiji sample. Here, the code uses two instruction sequences that differ in only one place:
Figure 33 dynamic allocation of string structure in Kaiji
After the script is executed, the code becomes:
Figure 34 dynamic allocation of string structure in Kaiji after find_dynamic_strings.py is executed
We can see that these strings have been defined:
Figure 35 after find_dynamic_strings.py is executed, the string defined in Kaiji
The script can find the following instruction sequence in the 64-bit ARM binary:
As you can see, the script recovers dynamically allocated string structures. This helps reverse engineers read the assembly code and look for suspicious strings in Ghidra's Defined Strings view.
Challenges of this approach
The biggest drawback of this approach is that each architecture (and even each variant within the same architecture) requires a new branch in the script. Moreover, these predefined instruction sequences are easy to evade. In the following example from the Kaiji 64-bit ARM malware sample, the string was missed because the string length was first moved to a register, which the script did not expect.
Figure 36 Kaiji dynamically allocates string structures in an unusual way
Figure 37 an undefined string in Kaiji
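The fragility described above can be illustrated with a toy matcher. The sketch below is not the logic of find_dynamic_strings.py. It assumes a simplified, hypothetical x86-style pattern (load an address into a register, store that register to the stack, store an immediate length to the stack) and a made-up tuple representation of instructions. All addresses and values are invented for illustration.

```python
# Toy model: each instruction is (mnemonic, destination, source).
# The pattern below is a hypothetical simplification, not the real script's logic.

def is_stack_slot(operand):
    """True for operands written like '[RSP+0x8]' in this toy model."""
    return isinstance(operand, str) and operand.startswith("[RSP")


def find_dynamic_strings(instructions):
    """Return (string_address, length) for every match of
    LEA reg, addr / MOV [stack], reg / MOV [stack], immediate."""
    found = []
    for i in range(len(instructions) - 2):
        (m1, dst1, src1), (m2, dst2, src2), (m3, dst3, src3) = instructions[i:i + 3]
        loads_address = m1 == "LEA" and isinstance(src1, int)
        stores_pointer = m2 == "MOV" and is_stack_slot(dst2) and src2 == dst1
        stores_length = m3 == "MOV" and is_stack_slot(dst3) and isinstance(src3, int)
        if loads_address and stores_pointer and stores_length:
            found.append((src1, src3))
    return found


# The form the matcher expects: the length is stored as an immediate.
expected_form = [
    ("LEA", "RAX", 0x4C0000),
    ("MOV", "[RSP+0x8]", "RAX"),
    ("MOV", "[RSP+0x10]", 17),
]

# Same effect, but the length goes through a register first, so the matcher misses it.
register_form = [
    ("LEA", "RAX", 0x4C0000),
    ("MOV", "[RSP+0x8]", "RAX"),
    ("MOV", "RCX", 17),
    ("MOV", "[RSP+0x10]", "RCX"),
]

matched = find_dynamic_strings(expected_form)
missed = find_dynamic_strings(register_form)
```

Covering the second form means adding another branch for it, which is the maintenance cost this approach carries.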
Statically allocated string structures
In the next case, our script (find_static_strings.py) looks for statically allocated string structures. In these structures, the string pointer is followed by the string length.
This is a string pointer and its length found in the x86 eCh0raix ransomware sample:
Figure 38 statically allocated string structure in eCh0raix
In the above figure, the string pointer is followed by a string length value; however, Ghidra cannot distinguish between address and integer data types, except for the first pointer referenced directly in the code.
Figure 39 string pointer in eCh0raix
Undefined strings can be found by string address:
Figure 40 undefined strings in eCh0raix
After the script is executed, the string address, the string length value, and the string itself are defined:
Figure 41 statically allocated string structure in eCh0raix after find_static_strings.py is executed
Figure 42 after find_static_strings.py is executed, the string defined in eCh0raix
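The pointer-followed-by-length layout can be sketched in plain Python outside Ghidra. The example below is only an illustration of the idea, not the code of find_static_strings.py. It assumes little-endian data, pointer-sized alignment, and a single byte buffer mapped at an invented base address.

```python
import struct


def find_string_headers(data, base, ptr_size=8):
    """Scan `data` (bytes mapped at address `base`) for little-endian
    (pointer, length) pairs whose target range lies inside `data`.
    Returns a list of (header_address, string_address, string_bytes)."""
    fmt = "<QQ" if ptr_size == 8 else "<II"
    end = base + len(data)
    hits = []
    # Step by the pointer size, assuming headers are pointer-aligned.
    for off in range(0, len(data) - 2 * ptr_size + 1, ptr_size):
        ptr, length = struct.unpack_from(fmt, data, off)
        if length > 0 and base <= ptr and ptr + length <= end:
            start = ptr - base
            hits.append((base + off, ptr, data[start:start + length]))
    return hits


# Hypothetical image: string bytes at 0x1000, padded to 24 bytes,
# followed by one header (pointer = 0x1000, length = 17).
BASE = 0x1000
text = b"hello, hacktivity"
image = text.ljust(24, b"\x00") + struct.pack("<QQ", BASE, len(text))

headers = find_string_headers(image, BASE)
```

A scan this permissive accepts any pair of values that happens to look like an in-range pointer and a length, which is why additional restrictions are needed to keep false positives down.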
Challenge: eliminating false positives and missed strings
To eliminate false positives, the script:
Limits the length of a string
Searches for printable characters only
Searches only in the data segment of the binary file
Because of these restrictions, strings can easily be missed. If you use this script, feel free to experiment and adjust these values to find the best settings. The following code limits the length and the character set:
Figure 43 find_static_strings.py
Figure 44 find_static_strings.py
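The three restrictions listed above can be sketched as simple predicates. This is a minimal illustration rather than the script's code. The length cap of 100 and the choice of Python's string.printable as the character set are assumptions made here, and the function names are hypothetical.

```python
import string

MAX_LENGTH = 100  # hypothetical cap; tune it for the sample being analyzed
PRINTABLE = frozenset(string.printable.encode("ascii"))  # set of allowed byte values


def in_segment(address, length, seg_start, seg_end):
    """True if [address, address + length) lies inside [seg_start, seg_end)."""
    return seg_start <= address and address + length <= seg_end


def looks_like_string(candidate, max_length=MAX_LENGTH):
    """Accept only non-empty, short, fully printable ASCII byte strings."""
    if not 0 < len(candidate) <= max_length:
        return False
    return all(byte in PRINTABLE for byte in candidate)
```

The trade-off is visible in the predicates themselves: a long string or a string containing non-ASCII UTF-8 bytes is rejected, so each restriction that removes false positives also removes some real strings.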
Further challenges in string recovery
Ghidra's auto-analysis may misidentify certain data types. When this happens, our script cannot create the correct data at that location. To solve this problem, the incorrect data type must be removed before the new data type can be created.
For example, let's look at a statically allocated string structure in the eCh0raix ransomware.
Figure 45 statically allocated string structure in eCh0raix
Here, the address is recognized correctly, but the string length value (which should be an integer data type) is mistakenly identified as an undefined value.
In our script, the following lines of code are used to remove incorrect data types:
Figure 46 find_static_strings.py
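The clear-then-create order can be shown with a toy model of a listing. The class below is a hypothetical stand-in written for this illustration. It does not use the Ghidra API, and the type names and addresses are invented. It only shows why creating data on top of a conflicting definition fails until the old definition is removed.

```python
class ToyListing:
    """Minimal stand-in for a disassembler listing: address -> (type name, size)."""

    def __init__(self):
        self.items = {}

    def conflicts(self, addr, size):
        """Start addresses of defined items that overlap [addr, addr + size)."""
        return [a for a, (_, s) in self.items.items() if a < addr + size and addr < a + s]

    def create(self, addr, type_name, size):
        """Define data at addr; refuse if anything already occupies the range."""
        if self.conflicts(addr, size):
            raise ValueError("conflicting data at 0x%x" % addr)
        self.items[addr] = (type_name, size)

    def clear(self, addr, size):
        """Remove every definition that overlaps [addr, addr + size)."""
        for a in self.conflicts(addr, size):
            del self.items[a]


def redefine(listing, addr, type_name, size):
    """Remove whatever was wrongly defined in the range, then create the right type."""
    listing.clear(addr, size)
    listing.create(addr, type_name, size)


listing = ToyListing()
listing.create(0x2008, "undefined4", 4)  # wrong type left behind by auto-analysis

try:
    listing.create(0x2008, "int64", 8)   # fails: the range is already occupied
    created_directly = True
except ValueError:
    created_directly = False

redefine(listing, 0x2008, "int64", 8)    # succeeds after clearing
```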
After the script is executed, not only are all data types correctly identified, but also all strings are defined:
Figure 47 statically allocated string structure in eCh0raix after find_static_strings.py is executed
Another problem comes from the fact that in Go binaries, strings are concatenated and stored in one large string blob. In some cases, Ghidra defines the entire blob as a single string. Such strings can be identified by their large number of offcut references. An offcut reference is a reference to a location inside a defined string rather than to the string's starting address.
The following is from the ARM Kaiji sample:
Figure 48 string incorrectly defined by Ghidra
Figure 49 Kaiji's offcut reference to an incorrectly defined string
To find the misdefined string, you can use the Defined Strings window in Ghidra to sort the strings by the number of offcut references. Before executing the string recovery script, you can manually undefine large strings with a large number of offcut references. In this way, the script can successfully create the correct string data type.
Figure 50 string defined in Kaiji
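Counting offcut references is easy to express in code. The sketch below works on plain Python data instead of a Ghidra program, so the dictionary of defined strings and the list of reference targets are hypothetical inputs with invented addresses.

```python
def count_offcut_references(defined_strings, references):
    """defined_strings: {start_address: length}
    references: iterable of referenced addresses
    Returns {start_address: count of references landing strictly inside the string}."""
    counts = {start: 0 for start in defined_strings}
    for target in references:
        for start, length in defined_strings.items():
            # A reference to `start` itself is a normal reference, not an offcut one.
            if start < target < start + length:
                counts[start] += 1
    return counts


# One 14-byte blob wrongly defined as a single string, plus one correct 5-byte string.
defined = {0x3000: 14, 0x4000: 5}
refs = [0x3000, 0x3005, 0x300A, 0x4000]

offcuts = count_offcut_references(defined, refs)

# Strings with the most offcut references are the first candidates to undefine.
candidates = sorted(offcuts, key=offcuts.get, reverse=True)
```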
Once a string is defined, either manually or by our script, it is displayed correctly in Ghidra's Listing view, which helps reverse engineers read the assembly code. However, the Decompile view in Ghidra does not handle fixed-length strings properly: regardless of the defined length, it displays everything up to the next null character. Fortunately, this issue will be resolved in the next version of Ghidra (9.2).
The eCh0raix sample illustrates this issue:
Figure 51 eCh0raix's defined string displayed in the Listing view
Figure 52 the defined string displayed by eCh0raix in the Decompile view
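The difference between the two views comes down to how the end of a string is determined. The sketch below contrasts the two readings on an invented blob of concatenated strings. It models the behavior described above and is not Ghidra code.

```python
def read_fixed(blob, offset, length):
    """Read exactly `length` bytes: a string with an explicit length."""
    return blob[offset:offset + length]


def read_until_nul(blob, offset):
    """Read up to the next NUL byte, as a C-style string reader would."""
    end = blob.find(b"\x00", offset)
    return blob[offset:] if end == -1 else blob[offset:end]


# Hypothetical blob: strings are stored back to back with no terminators between them.
blob = b"hello, hacktivitymain.mainruntime\x00"

fixed = read_fixed(blob, 0, 17)      # only the string itself
overrun = read_until_nul(blob, 0)    # runs on into the neighboring strings
```

Because the strings in the blob carry no terminators, only the length-based reading returns the intended string.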
Summary
This article focuses on solutions to two difficult problems in reverse engineering Go binaries, to help reverse engineers use Ghidra to statically analyze malware written in Go. We first discussed how to recover function names in stripped Go binaries and then proposed several solutions for defining strings in Ghidra. The scripts we created and the files used in the examples in this article are public, and you can find them through the link below.
This is just a small step in the journey of reverse engineering Go binaries. Next, we plan to delve into the calling convention and type system of Go.
In Go binaries, parameters and return values are passed to functions on the stack rather than in registers, which Ghidra currently has difficulty detecting correctly. Helping Ghidra support Go's calling convention will therefore help reverse engineers understand the purpose of the functions being analyzed.
Another interesting topic is types in Go binaries. As we showed when extracting function names from the files under investigation, Go binaries also store information about the types they use. Recovering these types is of great help in reverse engineering. In the following example, we recovered the main.Info structure from an eCh0raix ransomware sample. This structure tells us what information the malware expects from the C2 server.
Figure 53 main.info structure in eCh0raix
Figure 54 main.info fields in eCh0raix
Figure 55 main.info structure in eCh0raix
This concludes the walkthrough of how to use Ghidra to reverse engineer Go binaries.