2025-04-11 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article walks through examples of manually restoring Python source code from Python bytecode, a topic that many readers find hard to follow. The material below is a summary intended to make it easier to understand.
0x1. Preface
> Python code is first compiled into bytecode, and the Python virtual machine then executes that bytecode. Python bytecode is an intermediate language similar to assembly instructions. One Python statement corresponds to several bytecode instructions, and the virtual machine executes them one by one to run the program. The Python dis module disassembles Python code into bytecode instructions: dis.dis() renders CPython bytecode as readable, assembly-like text. The output has the following form:
      7           0 LOAD_CONST               1 (0)
                  3 STORE_FAST               1 (local1)

      8           6 LOAD_CONST               2 (101)
                  9 STORE_GLOBAL             0 (global1)

      9          12 LOAD_FAST                1 (local1)
                 15 PRINT_ITEM
                 16 LOAD_FAST                0 (arg1)
                 19 PRINT_ITEM
                 20 LOAD_GLOBAL              0 (global1)
                 23 PRINT_ITEM
                 24 PRINT_NEWLINE
                 25 LOAD_CONST               0 (None)
                 28 RETURN_VALUE
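A listing of this kind can be reproduced with the standard dis module. The sketch below assumes Python 3.4 or later, where dis.get_instructions is available; opcode names and offsets there differ from the Python 2 listing shown here, and the function `sample` is a made-up stand-in.

```python
import dis

def sample(arg1):
    # Hypothetical function, used only as something to disassemble.
    local1 = 0
    return local1 + arg1

# Each Instruction object carries the values that dis.dis() prints as columns.
rows = [(ins.offset, ins.opname, ins.arg, ins.argrepr)
        for ins in dis.get_instructions(sample)]

for offset, opname, arg, argrepr in rows:
    print(offset, opname, arg, argrepr)
```

The exact rows depend on the interpreter version, so no expected output is given.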
Each line of a dis listing has the following structure:

    source line number | instruction offset within the function | instruction name | instruction argument | actual value of the argument

0x2. Variable
1. const
LOAD_CONST loads constants such as numbers and strings. It typically appears when constant arguments are passed to a function.
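The argument of LOAD_CONST is an index into the code object's constant table. A minimal sketch, assuming Python 3 and a hypothetical function `caller`; it checks that each index shown by dis resolves to the value printed in parentheses. Depending on the interpreter version, some constants (small integers, for instance) may be loaded by other opcodes, so the list is not guaranteed to contain both arguments.

```python
import dis

def caller(test):
    # Hypothetical function: passes two constants to whatever `test` is.
    return test(2, 'output')

code = caller.__code__

# For every LOAD_CONST, the raw argument indexes co_consts
# and argval is the value that index resolves to.
loaded = [(ins.arg, ins.argval)
          for ins in dis.get_instructions(caller)
          if ins.opname == 'LOAD_CONST']

for index, value in loaded:
    print(index, repr(value), code.co_consts[index] == value)
```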
     55          12 LOAD_GLOBAL              1 (test)
                 15 LOAD_FAST                0 (2)         # read 2
                 18 LOAD_CONST               1 ('output')
                 21 CALL_FUNCTION            2

Converted to Python code, this is:

    test(2, 'output')

2. Local variable
LOAD_FAST loads the value of a local variable, that is, it reads the value for use in a calculation or as a function call argument. STORE_FAST saves a value to a local variable.
     61          77 LOAD_FAST                0 (n)
                 80 LOAD_FAST                3 (p)
                 83 INPLACE_DIVIDE
                 84 STORE_FAST               0 (n)

Because the instruction is INPLACE_DIVIDE, this bytecode converts to the following Python:

    n /= p
A function's formal parameters are also local variables, so how can they be told apart from other locals? A parameter is never initialized inside the function: if a variable is read with LOAD_FAST and no STORE_FAST to it appears between the start of the function and that point, the variable is a function parameter. Every other local variable is initialized with STORE_FAST before it is used. Take a look at the following example:
      4           0 LOAD_CONST               1 (0)
                  3 STORE_FAST               1 (local1)

      5           6 LOAD_FAST                1 (local1)
                  9 PRINT_ITEM
                 10 LOAD_FAST                0 (arg1)
                 13 PRINT_ITEM
                 14 PRINT_NEWLINE
                 15 LOAD_CONST               0 (None)
                 18 RETURN_VALUE
The bytecode listing above corresponds to the following Python code:

    def test(arg1):
        local1 = 0
        print local1, arg1

3. Global variable
LOAD_GLOBAL loads global symbols, including function names, class names, module names and global variables. STORE_GLOBAL assigns a value to a global variable.
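Global names are recorded in the code object's co_names table. The sketch below assumes Python 3 and a hypothetical function `set_global`; it only checks that both opcodes appear and that the name is registered, since the numeric arguments vary between interpreter versions.

```python
import dis

def set_global():
    # Hypothetical function that writes and then reads a global.
    global global1
    global1 = 101
    return global1

code = set_global.__code__
opnames = [ins.opname for ins in dis.get_instructions(set_global)]

print(code.co_names)  # names referenced as globals or attributes
print('STORE_GLOBAL' in opnames, 'LOAD_GLOBAL' in opnames)
```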
      8           6 LOAD_CONST               2 (101)
                  9 STORE_GLOBAL             0 (global1)
                ...
                 20 LOAD_GLOBAL              0 (global1)
                 23 PRINT_ITEM

The corresponding Python code is:

    def test():
        global global1
        global1 = 101
        print global1

0x3. Common data types
1. list
BUILD_LIST is used to create a list structure.
     13           0 LOAD_CONST               1 (1)
                  3 LOAD_CONST               2 (2)
                  6 BUILD_LIST               2
                  9 STORE_FAST               0 (k)

The corresponding Python code is:

    k = [1, 2]
Another common way to create a list is as follows:
    [x for x in xlist if x != 0]
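This form is shorthand for a loop that starts from an empty list and appends each element that passes the test, which is the shape the bytecode takes. A small sketch with a made-up input list:

```python
xlist = [3, 0, 7, 0, 9]  # hypothetical input

# Comprehension form.
filtered = [x for x in xlist if x != 0]

# Equivalent explicit loop: start empty, append each x that passes the test.
expanded = []
for x in xlist:
    if x != 0:
        expanded.append(x)

print(filtered == expanded)
```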
An example bytecode is as follows:
     22         235 BUILD_LIST               0             # create an empty list to assign to a variable; this pattern usually indicates syntactic sugar
                238 LOAD_FAST                3 (sieve)
                241 GET_ITER
            >>  242 FOR_ITER                24 (to 269)
                245 STORE_FAST               4 (x)
                248 LOAD_FAST                4 (x)
                251 LOAD_CONST               2 (0)
                254 COMPARE_OP               3 (!=)
                257 POP_JUMP_IF_FALSE      242             # condition not met: continue
                260 LOAD_FAST                4 (x)         # read x
                263 LIST_APPEND              2             # append each x that meets the condition to the list
                266 JUMP_ABSOLUTE          242
            >>  269 RETURN_VALUE

The Python code is:

    [x for x in sieve if x != 0]

2. dict
BUILD_MAP is used to create an empty dict. STORE_MAP is used to initialize the contents of the dict.
     13           0 BUILD_MAP                1
                  3 LOAD_CONST               1 (1)
                  6 LOAD_CONST               2 ('a')
                  9 STORE_MAP
                 10 STORE_FAST               0 (k)

The corresponding Python code is:

    k = {'a': 1}
Then take a look at the bytecode that modifies dict:
     14          13 LOAD_CONST               3 (2)
                 16 LOAD_FAST                0 (k)
                 19 LOAD_CONST               4 ('b')
                 22 STORE_SUBSCR

The corresponding Python code is:

    k['b'] = 2

3. slice
BUILD_SLICE creates a slice object. Lists, tuples and strings can all be accessed with slices. Note, however, that BUILD_SLICE is used only for slices of the form [x:y:z]; the slice is then read with BINARY_SUBSCR and modified with STORE_SUBSCR. Slices of the form [a:b] are read with SLICE+n and modified with STORE_SLICE+n, where n has the following meaning:

    SLICE+0()   Implements TOS = TOS[:].
    SLICE+1()   Implements TOS = TOS1[TOS:].
    SLICE+2()   Implements TOS = TOS1[:TOS].
    SLICE+3()   Implements TOS = TOS2[TOS1:TOS].
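At the source level, the three-index form is equivalent to indexing with an explicit slice object, while the two-index form is ordinary slicing. The sketch below uses a made-up list; the statements behave the same on Python 2 and Python 3, although the SLICE+n opcodes themselves exist only in Python 2.

```python
k1 = [1, 2, 3]

# Three-index form: the interpreter builds a slice object (BUILD_SLICE).
k1[0:1:1] = [10]
same_as = slice(0, 1, 1)  # explicit equivalent of 0:1:1
b = k1[same_as]

# Two-index form: handled by STORE_SLICE+3 and SLICE+3 in Python 2.
k1[1:2] = [11]
a = k1[1:2]

print(k1, a, b)
```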
Let's take a look at specific examples:
     13           0 LOAD_CONST               1 (1)
                  3 LOAD_CONST               2 (2)
                  6 LOAD_CONST               3 (3)
                  9 BUILD_LIST               3
                 12 STORE_FAST               0 (k1)        # k1 = [1, 2, 3]

     14          15 LOAD_CONST               4 (10)
                 18 BUILD_LIST               1
                 21 LOAD_FAST                0 (k1)
                 24 LOAD_CONST               5 (0)
                 27 LOAD_CONST               1 (1)
                 30 LOAD_CONST               1 (1)
                 33 BUILD_SLICE              3
                 36 STORE_SUBSCR                           # k1[0:1:1] = [10]

     15          37 LOAD_CONST               6 (11)
                 40 BUILD_LIST               1
                 43 LOAD_FAST                0 (k1)
                 46 LOAD_CONST               1 (1)
                 49 LOAD_CONST               2 (2)
                 52 STORE_SLICE+3                          # k1[1:2] = [11]

     16          53 LOAD_FAST                0 (k1)
                 56 LOAD_CONST               1 (1)
                 59 LOAD_CONST               2 (2)
                 62 SLICE+3
                 63 STORE_FAST               1 (a)         # a = k1[1:2]

     17          66 LOAD_FAST                0 (k1)
                 69 LOAD_CONST               5 (0)
                 72 LOAD_CONST               1 (1)
                 75 LOAD_CONST               1 (1)
                 78 BUILD_SLICE              3
                 81 BINARY_SUBSCR
                 82 STORE_FAST               2 (b)         # b = k1[0:1:1]

0x4. Loops
SETUP_LOOP marks the start of a loop. In SETUP_LOOP 26 (to 35), 35 is the offset at which the loop exits.
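The "to 35" annotation is derived by dis: for relative instructions such as SETUP_LOOP, FOR_ITER and JUMP_FORWARD, the argument is counted from the end of the instruction itself. A worked sketch, assuming the Python 2.7 encoding in which an instruction with an argument occupies three bytes; `relative_target` is a hypothetical helper name.

```python
ARG_INSTRUCTION_SIZE = 3  # Python 2.7: 1 opcode byte + 2 argument bytes

def relative_target(offset, arg):
    """Return the target of a relative jump located at `offset`."""
    return offset + ARG_INSTRUCTION_SIZE + arg

print(relative_target(6, 26))    # SETUP_LOOP 26 at offset 6
print(relative_target(242, 24))  # FOR_ITER 24 at offset 242
```

Absolute jumps such as POP_JUMP_IF_FALSE and JUMP_ABSOLUTE need no such arithmetic: their argument is the target offset itself.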
while loop:

     23           0 LOAD_CONST               1 (0)
                  3 STORE_FAST               0 (i)         # i = 0

     24           6 SETUP_LOOP              26 (to 35)
            >>    9 LOAD_FAST                0 (i)         # loop start
                 12 LOAD_CONST               2 (10)
                 15 COMPARE_OP               0 (<)
                ...
            >>   34 POP_BLOCK
            >>   35 LOAD_CONST               0 (None)

The corresponding Python code is:

    i = 0
    while i < 10:
        i += 1

for ... in structure:

                238 LOAD_FAST                3 (sieve)     # sieve is a list
                241 GET_ITER                               # start iterating over sieve
            >>  242 FOR_ITER                24 (to 269)    # fetch the next x
                245 STORE_FAST               4 (x)
                ...
                266 JUMP_ABSOLUTE          242             # loop back
This is a typical for ... in structure, and it converts to the following Python code:

    for x in sieve:

0x5. if
POP_JUMP_IF_FALSE and JUMP_FORWARD are the instructions generally used for branching. POP_JUMP_IF_FALSE jumps to the target offset if the condition evaluates to false. JUMP_FORWARD jumps to the target offset unconditionally.
     23           0 LOAD_CONST               1 (0)
                  3 STORE_FAST               0 (i)         # i = 0

     24           6 LOAD_FAST                0 (i)
                  9 LOAD_CONST               2 (5)
                 12 COMPARE_OP               0 (<)
                ...
                 26 LOAD_FAST                0 (i)
                 29 LOAD_CONST               2 (5)
                 32 COMPARE_OP               4 (>)
                 35 POP_JUMP_IF_FALSE       46

     27          38 LOAD_CONST               4 ('i > 5')
                 41 PRINT_ITEM
                 42 PRINT_NEWLINE
                 43 JUMP_FORWARD             5 (to 51)

     29     >>   46 LOAD_CONST               5 ('i = 5')
                 49 PRINT_ITEM
                 50 PRINT_NEWLINE
            >>   51 LOAD_CONST               0 (None)

The Python code is:

    i = 0
    if i < 5:
        print 'i < 5'
    elif i > 5:
        print 'i > 5'
    else:
        print 'i = 5'

0x6. Analyzing functions
1. Function range
The second column shows the offset of each instruction within its function, so offset 0 marks the start of a function and the instruction just before the next 0 marks its end. RETURN_VALUE can also be used to identify the end of a function, although a function with several return statements contains more than one.
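The offset rule can be applied mechanically. The sketch below is a hypothetical helper, `split_functions`, that groups a flat sequence of offsets into per-function runs under the assumption that a new function starts whenever the offset returns to 0; the sample offsets are illustrative.

```python
def split_functions(offsets):
    """Group a flat sequence of instruction offsets into per-function runs.

    A new function is assumed to start whenever the offset returns to 0.
    """
    functions = []
    for offset in offsets:
        if offset == 0 or not functions:
            functions.append([])
        functions[-1].append(offset)
    return functions

# Offsets from a hypothetical listing that contains two functions.
chunks = split_functions([0, 3, 6, 9, 139, 142, 0, 3])
print(chunks)
```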
     54           0 LOAD_FAST                1 (plist)     # function start
                  3 LOAD_CONST               0 (None)
                  6 COMPARE_OP               2 (==)
                  9 POP_JUMP_IF_FALSE       33

     55 ...

     67     >>  139 LOAD_FAST                2 (fs)
                142 RETURN_VALUE

     70           0 LOAD_CONST               1 ('FLAG')    # another function starts
                  3 STORE_FAST               0 (flag)

2. Function call
A function call resembles the push + call pattern in assembly: the arguments are placed on the stack from left to right (with LOAD_xxx instructions rather than an actual push). The function name is usually given by LOAD_GLOBAL; for a module function or a class member function it is given by LOAD_GLOBAL + LOAD_ATTR. The function to call is specified first, then the arguments are pushed, and finally CALL_FUNCTION performs the call. The value after CALL_FUNCTION is the number of arguments. Calls can be nested:
     60           0 LOAD_GLOBAL              0 (int)       # int function
                  3 LOAD_GLOBAL              1 (math)      # math module
                  6 LOAD_ATTR                2 (sqrt)      # sqrt function
                  9 LOAD_FAST                0 (n)         # argument
                 12 CALL_FUNCTION            1
                 15 CALL_FUNCTION            1
                 18 STORE_FAST               2 (nroot)
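The stack discipline described above can be replayed mechanically. The following sketch is a toy, not a general decompiler: `rebuild` is a hypothetical helper that handles only the handful of Python 2 opcodes in this listing and assumes CALL_FUNCTION carries positional arguments only. It keeps a stack of source-code strings instead of values.

```python
def rebuild(instructions):
    """Replay a tiny subset of Python 2 opcodes on a stack of source strings."""
    stack, statements = [], []
    for opname, arg in instructions:
        if opname in ('LOAD_GLOBAL', 'LOAD_FAST'):
            stack.append(arg)                      # push a name
        elif opname == 'LOAD_CONST':
            stack.append(repr(arg))                # push a literal
        elif opname == 'LOAD_ATTR':
            stack.append(stack.pop() + '.' + arg)  # obj becomes obj.attr
        elif opname == 'CALL_FUNCTION':
            # Arguments were pushed left to right, so popping reverses them.
            args = [stack.pop() for _ in range(arg)][::-1]
            func = stack.pop()
            stack.append('%s(%s)' % (func, ', '.join(args)))
        elif opname == 'STORE_FAST':
            statements.append('%s = %s' % (arg, stack.pop()))
        else:
            raise ValueError('unsupported opcode: ' + opname)
    return statements

listing = [
    ('LOAD_GLOBAL', 'int'),
    ('LOAD_GLOBAL', 'math'),
    ('LOAD_ATTR', 'sqrt'),
    ('LOAD_FAST', 'n'),
    ('CALL_FUNCTION', 1),
    ('CALL_FUNCTION', 1),
    ('STORE_FAST', 'nroot'),
]
result = rebuild(listing)
print(result)
```

Working on strings keeps the example short; a real tool would build an expression tree so that operator precedence and keyword arguments can be handled.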
Converted to Python code, this bytecode is:

    nroot = int(math.sqrt(n))    # n is a local variable or a function parameter, depending on the context

0x7. Other instructions
Most other common instructions are self-explanatory and are not analyzed in detail here. Please refer to the official documentation for more details.
    INPLACE_POWER()          Implements in-place TOS = TOS1 ** TOS.
    INPLACE_MULTIPLY()       Implements in-place TOS = TOS1 * TOS.
    INPLACE_DIVIDE()         Implements in-place TOS = TOS1 / TOS when from __future__ import division is not in effect.
    INPLACE_FLOOR_DIVIDE()   Implements in-place TOS = TOS1 // TOS.
    INPLACE_TRUE_DIVIDE()    Implements in-place TOS = TOS1 / TOS when from __future__ import division is in effect.
    INPLACE_MODULO()         Implements in-place TOS = TOS1 % TOS.
    INPLACE_ADD()            Implements in-place TOS = TOS1 + TOS.
    INPLACE_SUBTRACT()       Implements in-place TOS = TOS1 - TOS.
    INPLACE_LSHIFT()         Implements in-place TOS = TOS1 << TOS.
    INPLACE_RSHIFT()         Implements in-place TOS = TOS1 >> TOS.
    INPLACE_AND()            Implements in-place TOS = TOS1 & TOS.
    INPLACE_XOR()            Implements in-place TOS = TOS1 ^ TOS.
    INPLACE_OR()             Implements in-place TOS = TOS1 | TOS.
The basic operations also have a corresponding set of BINARY_xxx instructions. The difference between the two is simple:

    i += 1       # uses INPLACE_xxx
    i = i + 1    # uses BINARY_xxx
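For immutable operands such as integers both forms produce the same result. The distinction becomes observable with mutable objects, where the in-place form may modify the existing object instead of creating a new one. A short sketch with lists; the variable names are illustrative only.

```python
a = [1, 2]
alias = a
a += [3]  # in-place: list.__iadd__ extends the existing object
in_place_shared = alias is a

b = [1, 2]
alias_b = b
b = b + [3]  # binary: builds a new list and rebinds b
binary_shared = alias_b is b

print(in_place_shared, binary_shared, alias, alias_b)
```

When restoring source code, this is why an INPLACE_xxx instruction should be written back as an augmented assignment rather than expanded into the binary form.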