
How to use struct to deal with binary in Python

2025-03-30 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/02

This article shows how to use the struct module to handle binary data in Python. The explanation is kept concise and walks through the details step by step.

The three most important functions in the struct module are pack(), unpack(), and calcsize().

pack(fmt, v1, v2, ...) packs the values into a byte string (a stream of bytes laid out like a C structure) according to the given format (fmt).

unpack(fmt, string) parses the byte string according to the given format (fmt) and returns the parsed values as a tuple.

calcsize(fmt) calculates how many bytes a given format (fmt) occupies.
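The three functions can be seen together in a minimal sketch. The format "<hI" and the values -2 and 500 are chosen here only for illustration; they do not come from the article.

```python
import struct

fmt = "<hI"                           # little-endian: one short, one unsigned int

packed = struct.pack(fmt, -2, 500)    # values -> bytes
size = struct.calcsize(fmt)           # number of bytes the format occupies
values = struct.unpack(fmt, packed)   # bytes -> tuple of values

print(packed, size, values)
```

Note that unpack() requires the input to be exactly calcsize(fmt) bytes long.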

The formats supported in struct are as follows:

Format  C Type              Python type         Bytes
x       pad byte            no value            1
c       char                bytes of length 1   1
b       signed char         int                 1
B       unsigned char       int                 1
?       _Bool               bool                1
h       short               int                 2
H       unsigned short      int                 2
i       int                 int                 4
I       unsigned int        int                 4
l       long                int                 4
L       unsigned long       int                 4
q       long long           int                 8
Q       unsigned long long  int                 8
f       float               float               4
d       double              float               8
s       char[]              bytes               1
p       char[]              bytes               1
P       void *              int                 (see Note 4)

Note 1. q and Q are only available when the machine supports 64-bit operations.

Note 2. Each format character can be preceded by a number indicating the repeat count.

Note 3. The s format represents a string of fixed length: 4s represents a string of length 4. The p format represents a Pascal string.

Note 4. P is used to convert a pointer; its length depends on the machine word length.

Note 5. P, the last format in the table, represents a pointer type; it occupies 4 bytes on a 32-bit machine.
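The notes above can be checked with a short sketch. The variable names and sample values are illustrative only.

```python
import struct

# Repeat count: "3h" means three shorts, the same as "hhh".
size_repeat = struct.calcsize("<3h")
size_spelled = struct.calcsize("<hhh")

# "4s" is a single 4-byte field; shorter input is padded with zero bytes.
fixed = struct.pack("4s", b"ab")

# "4p" is a Pascal string: the first byte stores the length,
# so at most 3 bytes of data fit in the 4-byte field.
pascal = struct.pack("4p", b"ab")

# "P" is only available in native mode; its size follows the pointer width
# (4 bytes on a 32-bit build, 8 bytes on a 64-bit build).
pointer_size = struct.calcsize("P")

print(size_repeat, size_spelled, fixed, pascal, pointer_size)
```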

To exchange data with C structures, you also need to consider that some C or C++ compilers use byte alignment, typically on 4-byte boundaries on a 32-bit system. By default, struct therefore converts using the local machine's byte order and alignment. You can change the byte order and alignment with the first character of the format. The definition is as follows:

Character  Byte order              Size and alignment
@          native                  native (padded to alignment, e.g. 4 bytes)
=          native                  standard (original number of bytes)
<          little-endian           standard (original number of bytes)
>          big-endian              standard (original number of bytes)
!          network (= big-endian)  standard (original number of bytes)

To use one of these characters, put it at the start of fmt, as in '@5s6sif'.
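The effect of the prefix character can be sketched by comparing sizes and byte layouts. The value 1 is arbitrary, and the native size depends on the platform, so no specific number is assumed for it.

```python
import struct

fmt = "5s6sif"

# Native mode may insert padding so that the int starts on an aligned offset.
native_size = struct.calcsize("@" + fmt)

# Standard mode uses the original sizes with no padding: 5 + 6 + 4 + 4.
standard_size = struct.calcsize("=" + fmt)

little = struct.pack("<I", 1)    # least significant byte first
big = struct.pack(">I", 1)       # most significant byte first
network = struct.pack("!I", 1)   # network order is big-endian

print(native_size, standard_size, little, big, network)
```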

Example 1:

For example, suppose there is a structure:

    struct Header
    {
        unsigned short id;
        char tag[4];
        unsigned int version;
        unsigned int count;
    };

Suppose one instance of this structure has been received through socket.recv and stored in the byte string s. To parse it, use the unpack() function.

    import struct
    id, tag, version, count = struct.unpack("!H4s2I", s)

In the format string above, ! means the data is parsed in network byte order, because it was received from the network and is in network byte order in transit. The H that follows represents the unsigned short id, 4s represents a 4-byte string, and 2I represents two values of type unsigned int.

After one call to unpack, the information is stored in id, tag, version, and count.

Similarly, it is just as convenient to pack local data back into the struct format:

    ss = struct.pack("!H4s2I", id, tag, version, count)

The pack function converts id, tag, version, and count into the Header structure according to the specified format. ss is now a byte string (a stream of bytes laid out like the C structure), which can be sent with socket.send(ss).
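The Header round trip can be tried without a socket. In this sketch the field values (7, b"DATA", 1, 42) are hypothetical stand-ins for data that would arrive via socket.recv, and id_ is used instead of id only to avoid shadowing the built-in.

```python
import struct

HEADER_FMT = "!H4s2I"   # unsigned short, 4-byte tag, two unsigned ints

# Hypothetical header bytes, standing in for the result of socket.recv().
s = struct.pack(HEADER_FMT, 7, b"DATA", 1, 42)

# Parse the received bytes into the four fields.
id_, tag, version, count = struct.unpack(HEADER_FMT, s)

# Pack the fields again, ready for socket.send().
ss = struct.pack(HEADER_FMT, id_, tag, version, count)

print(len(s), id_, tag, version, count)
```

With the ! prefix no padding is inserted, so the header is 14 bytes. A C compiler that aligns unsigned int to 4 bytes would typically lay out the same struct in 16 bytes, so the sending side has to write the fields without padding for this format to match.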

Example 2:

    import struct
    a = 12.34
    # convert a to binary; 'd' (double) is used because 'i' accepts only integers
    bytes = struct.pack('d', a)

At this point, bytes is a byte string whose contents are the same as the binary representation of a in memory.

Then do the reverse operation. Given the existing binary data bytes (a byte string), convert it back to a Python data type:

    a, = struct.unpack('d', bytes)

Note that unpack returns a tuple, so even if there is only one variable:

    bytes = struct.pack('d', a)

you need to decode it like this:

    (a,) = struct.unpack('d', bytes)

If you write a = struct.unpack('d', bytes) directly, then a = (12.34,) is a tuple rather than the original floating-point number.
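The tuple behaviour is easy to confirm. This sketch packs a double ('d') so that the value survives the round trip exactly; the names data, as_tuple, value, and first are illustrative.

```python
import struct

data = struct.pack("d", 12.34)

as_tuple = struct.unpack("d", data)    # a one-element tuple
(value,) = struct.unpack("d", data)    # tuple unpacking yields the float
first = struct.unpack("d", data)[0]    # indexing works as well

print(as_tuple, value, first)
```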

If the data is made up of multiple values, you can do this:

    a = b'hello'
    b = b'world!'
    c = 2
    d = 45.123
    bytes = struct.pack('5s6sif', a, b, c, d)

At this point, bytes is binary data that can be written directly to a file, for example with binfile.write(bytes).

Later, when it is needed, it can be read back with bytes = binfile.read()

and then decoded into Python variables with struct.unpack().
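The write and read steps can be sketched end to end. The temporary directory and the file name data.bin are assumptions made only to keep the example self-contained.

```python
import os
import struct
import tempfile

fmt = "5s6sif"
record = struct.pack(fmt, b"hello", b"world!", 2, 45.123)

# A temporary path is used here only so the sketch runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "data.bin")

with open(path, "wb") as binfile:
    binfile.write(record)

with open(path, "rb") as binfile:
    raw = binfile.read()

a, b, c, d = struct.unpack(fmt, raw)
print(a, b, c, d)
```

Because f is a 4-byte float, d comes back only approximately equal to 45.123; use the d format if the exact double value matters.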


'5s6sif' is called fmt, the format string. It is made up of numbers and characters: 5s represents a string of five characters, 2i represents two integers, and so on. The available characters and types are listed in the format table above; the C Type column shows the C type that each character corresponds to, and each maps to a Python type.

Note: problems encountered in binary file processing

When processing binary files, open them in the following ways:

    binfile = open(filepath, 'rb')   # read a binary file
    binfile = open(filepath, 'wb')   # write a binary file

So how does the result differ from binfile = open(filepath, 'r')?

There are two differences:

First, on Windows, the C runtime's text mode (which Python 2 uses for 'r') treats '0x1A' as the end of the file (EOF). Using 'rb' does not have this problem. That is, if you write in binary mode and then read in text mode, and the data contains a '0x1A' byte, only part of the file is read. With 'rb', the file is read all the way to the end.

Second, for the string x = 'abc\ndef', len(x) gives a length of 7. \n is the newline character, which is actually '0x0A'. When written in 'w' (text) mode on Windows, '0x0A' is automatically changed into the two characters '0x0D 0x0A', so the file length actually becomes 8. When read in 'r' (text) mode, the pair is automatically converted back to the original newline character. In 'wb' (binary) mode, the single character is kept unchanged and read back as it is. So if you write in text mode and read in binary mode, account for the extra byte. '0x0D' is also called carriage return. Under Linux nothing changes, because Linux uses only '0x0A' to indicate a line break.
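The newline translation can be reproduced on any platform. This sketch assumes Python 3 and passes newline="\r\n" to open() to force the Windows-style translation explicitly; the temporary path is illustrative.

```python
import os
import tempfile

x = "abc\ndef"
path = os.path.join(tempfile.mkdtemp(), "newline.txt")

# newline="\r\n" makes text mode write each "\n" as the two bytes 0x0D 0x0A.
with open(path, "w", newline="\r\n") as f:
    f.write(x)

# Binary mode shows the bytes exactly as stored on disk.
with open(path, "rb") as f:
    raw = f.read()

# Default text mode converts "\r\n" back to "\n" on reading.
with open(path, "r") as f:
    text = f.read()

print(len(x), len(raw), len(text))
```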

The above is how to use struct to deal with binary in Python.
