How to use Python to deal with raw audio data

2025-04-21 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article introduces how to process raw audio data in Python with the standard-library audioop module (deprecated since Python 3.11 and removed in Python 3.13). It covers the underlying encodings, the conversion functions, the functions that measure a fragment, and the functions that operate on fragments.

I. Basic knowledge

PCM (pulse code modulation) is a coding scheme that converts analog signals into digital signals. The analog-to-digital conversion has two main steps: first the continuous analog signal is sampled, and then each sampled value is converted into a number, which is called quantization.
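As a rough illustration of these two steps, the sketch below samples a sine wave and quantizes each sample to a signed 16-bit integer. The 440 Hz tone, the 8000 Hz sample rate, and the helper name `sample_and_quantize` are assumptions chosen for the example; none of them come from audioop.

```python
import math
import struct

def sample_and_quantize(freq_hz=440.0, rate_hz=8000, n_samples=8):
    """Hypothetical helper: sample a unit sine wave and quantize to 16-bit PCM."""
    samples = []
    for n in range(n_samples):
        # Step 1, sampling: evaluate the continuous signal at t = n / rate_hz
        x = math.sin(2 * math.pi * freq_hz * n / rate_hz)
        # Step 2, linear quantization: map [-1.0, 1.0] onto [-32767, 32767]
        samples.append(round(x * 32767))
    return samples

pcm_values = sample_and_quantize()
# Pack the integers as little-endian 16-bit samples, i.e. a raw PCM fragment
pcm_bytes = struct.pack('<%dh' % len(pcm_values), *pcm_values)
print(pcm_values)
print(len(pcm_bytes))  # 2 bytes per sample
```

The resulting `pcm_bytes` has the same layout as the "fragment" arguments used later in this article with a width of 2.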

Let x be the input signal and F(x) the quantized signal; F(x) can be either linear or nonlinear. audioop mainly supports three encodings: A-Law, μ-Law, and ADPCM.

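The main coding method used in practice in China and Europe is A-Law, whose expression is:

$$
F(x) = \operatorname{sgn}(x)
\begin{cases}
\dfrac{A\,|x|}{1 + \ln A}, & |x| < \dfrac{1}{A} \\[2ex]
\dfrac{1 + \ln(A\,|x|)}{1 + \ln A}, & \dfrac{1}{A} \le |x| \le 1
\end{cases}
$$

where A is the compression coefficient. The value used with the G.711 standard is A = 87.56 (often rounded to 87.6).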
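To get a feel for the shape of the A-Law curve, here is a minimal sketch of the continuous formula for a normalized input x in [-1, 1]. The function name `alaw_compress` is made up for this example. A real encoder works on integer samples, so treat this only as a picture of the curve and not as what audioop.lin2alaw returns.

```python
import math

A = 87.56  # compression coefficient quoted in the text

def alaw_compress(x, a=A):
    """Hypothetical helper: continuous A-Law curve for x in [-1, 1]."""
    sign = -1.0 if x < 0 else 1.0
    ax = abs(x)
    if ax < 1.0 / a:
        # Linear segment for small amplitudes
        y = a * ax / (1.0 + math.log(a))
    else:
        # Logarithmic segment for larger amplitudes
        y = (1.0 + math.log(a * ax)) / (1.0 + math.log(a))
    return sign * y

for x in (0.0, 0.001, 0.01, 0.1, 0.5, 1.0):
    print(x, round(alaw_compress(x), 4))
```

Small inputs are amplified by a factor of about 16, while inputs near full scale are compressed, which is why quiet passages keep more resolution after quantization.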

ADPCM stands for Adaptive Differential PCM.

Because analog signals are continuous, samples from adjacent time units are usually highly correlated and often almost identical, so they can be compressed efficiently by encoding the differences between them. However, some signals contain large jumps, and a coder that assumes slow change is bound to lose that part of the data. Adaptive quantization is needed to balance the two cases.
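The trade-off can be seen in a small fixed-step differential coder. This is only a sketch under simple assumptions (4-bit codes, step size 1, hypothetical names `dpcm_encode` and `dpcm_decode`); it is not the Intel/DVI algorithm, which also adapts the step size.

```python
def dpcm_encode(samples, step=1, lo=-8, hi=7):
    """Hypothetical fixed-step coder: store clipped differences as 4-bit codes."""
    codes, predicted = [], 0
    for s in samples:
        # Quantize the difference to the previous reconstruction and clip it
        code = max(lo, min(hi, round((s - predicted) / step)))
        codes.append(code)
        predicted += code * step  # the value the decoder will reconstruct
    return codes

def dpcm_decode(codes, step=1):
    """Rebuild the samples by accumulating the decoded differences."""
    out, value = [], 0
    for c in codes:
        value += c * step
        out.append(value)
    return out

slow = [0, 2, 4, 5, 6, 6, 5, 3]            # slowly varying signal
jump = [0, 2, 4, 100, 101, 102, 101, 100]  # contains one large jump

print(dpcm_decode(dpcm_encode(slow)))  # reproduces the slow signal exactly
print(dpcm_decode(dpcm_encode(jump)))  # lags far behind after the jump
```

An adaptive coder would enlarge `step` after seeing several clipped codes and shrink it again when the signal calms down.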

Material on the Intel/DVI ADPCM algorithm supported by audioop can be found on the Internet, but there is not much of it and it is very old. The algorithm does not seem to be very important (it cannot even be found on CNKI), so it is not explained in detail here.

II. Conversion functions

audioop provides functions that convert between linear samples and the ADPCM, A-Law, and μ-Law encodings:

| Linear sampling | ADPCM | A-Law | μ-Law |
| --- | --- | --- | --- |
| lin2lin | lin2adpcm | lin2alaw | lin2ulaw |
| | adpcm2lin | alaw2lin | ulaw2lin |

The conversion functions for A-Law and μ-Law take the parameters (fragment, width): the fragment to be processed and the sample width in bytes. The ADPCM functions take a state tuple as an additional third parameter, which represents the encoder state.

lin2lin converts linear fragments between the 1-, 2-, 3-, and 4-byte formats; its parameters are (fragment, width, newwidth).
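The width conversion amounts to shifting each sample by whole bytes. The sketch below imitates that in plain Python so it also runs where audioop is unavailable. `widen_or_narrow` is a hypothetical stand-in and makes no claim to match lin2lin in every edge case.

```python
import sys

def unpack(width, data):
    """Split a fragment into signed integers of `width` bytes each."""
    return [int.from_bytes(data[i:i + width], sys.byteorder, signed=True)
            for i in range(0, len(data), width)]

def pack(width, values):
    """Join signed integers into a fragment of `width`-byte samples."""
    return b''.join(v.to_bytes(width, sys.byteorder, signed=True) for v in values)

def widen_or_narrow(fragment, width, newwidth):
    """Hypothetical stand-in for lin2lin: rescale samples by shifting bits."""
    shift = 8 * (newwidth - width)
    values = unpack(width, fragment)
    if shift >= 0:
        values = [v << shift for v in values]   # pad with low-order zero bits
    else:
        values = [v >> -shift for v in values]  # drop the low-order bits
    return pack(newwidth, values)

one_byte = pack(1, [0, 18, 69, -69, 127, -128, -1])
two_byte = widen_or_narrow(one_byte, 1, 2)
print(unpack(2, two_byte))  # each value is the 1-byte sample times 256
```

Converting `two_byte` back with `widen_or_narrow(two_byte, 2, 1)` recovers the original fragment, because only zero bits are dropped.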

Let's create some test data to try out the conversion functions.

    # the following code comes from test_audioop.py
    import audioop
    import sys
    import unittest

    # pack a sequence of signed integers into a fragment of `width`-byte samples
    pack = lambda width, data: b''.join(
        v.to_bytes(width, sys.byteorder, signed=True) for v in data)
    packs = {w: (lambda *data, width=w: pack(width, data)) for w in (1, 2, 3, 4)}

    # unpack a fragment of `width`-byte samples into a list of signed integers
    unpack = lambda width, data: [
        int.from_bytes(data[i: i + width], sys.byteorder, signed=True)
        for i in range(0, len(data), width)]

    datas = {
        1: b'\x00\x12\x45\xbb\x7f\x80\xff',
        2: packs[2](0, 0x1234, 0x4567, -0x4567, 0x7fff, -0x8000, -1),
        3: packs[3](0, 0x123456, 0x456789, -0x456789, 0x7fffff, -0x800000, -1),
        4: packs[4](0, 0x12345678, 0x456789ab, -0x456789ab,
                    0x7fffffff, -0x80000000, -1),
    }

Then the value of datas (on a little-endian machine) is:

    >>> for key in datas: print(datas[key])
    ...
    b'\x00\x12E\xbb\x7f\x80\xff'
    b'\x00\x004\x12gE\x99\xba\xff\x7f\x00\x80\xff\xff'
    b'\x00\x00\x00V4\x12\x89gEw\x98\xba\xff\xff\x7f\x00\x00\x80\xff\xff\xff'
    b'\x00\x00\x00\x00xV4\x12\xab\x89gEUv\x98\xba\xff\xff\xff\x7f\x00\x00\x00\x80\xff\xff\xff\xff'

The conversion functions are tested as follows:

    >>> datas[1]
    b'\x00\x12E\xbb\x7f\x80\xff'          # 1-byte linear samples to be processed
    >>> unpack(1, datas[1])
    [0, 18, 69, -69, 127, -128, -1]       # converted to integers
    >>> # convert 1-byte linear samples to 2-byte linear samples
    >>> datas1_2 = audioop.lin2lin(datas[1], 1, 2)
    >>> print(datas1_2)
    b'\x00\x00\x00\x12\x00E\x00\xbb\x00\x7f\x00\x80\x00\xff'
    >>> unpack(2, datas1_2)               # the integers are those of datas[1] times 256
    [0, 4608, 17664, -17664, 32512, -32768, -256]
    >>> # convert 1-byte linear samples to 1-byte u-Law code
    >>> datas1_u = audioop.lin2ulaw(datas[1], 1)
    >>> unpack(1, datas1_u)
    [-1, -83, -114, 14, -128, 0, 103]

III. Fragment characteristic functions

The functions in the following table all take (fragment, width), that is, the fragment to be measured and its sample width in bytes.

| Function | Return value |
| --- | --- |
| avg | the mean of the sample values in the fragment |
| avgpp | the average peak-to-peak value of the samples in the fragment |
| max | the maximum absolute value of the samples in the fragment |
| maxpp | the maximum peak-to-peak value in the fragment |
| minmax | a tuple consisting of the minimum and maximum sample values in the fragment |
| rms | the root mean square of the fragment |
| cross | the number of times the fragment crosses zero |
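To make the definitions concrete, the following sketch computes five of these quantities in plain Python. `fragment_stats` is a hypothetical helper, and rounding or edge-case behaviour in audioop may differ (this sketch returns floats for the mean and the root mean square, for instance). The peak-to-peak measures are left out to keep it short.

```python
import math
import sys

def unpack(width, data):
    """Split a fragment into signed integers of `width` bytes each."""
    return [int.from_bytes(data[i:i + width], sys.byteorder, signed=True)
            for i in range(0, len(data), width)]

def fragment_stats(fragment, width):
    """Hypothetical helper: simple statistics over the samples of a fragment."""
    values = unpack(width, fragment)
    return {
        'avg': sum(values) / len(values),
        'max': max(abs(v) for v in values),
        'minmax': (min(values), max(values)),
        'rms': math.sqrt(sum(v * v for v in values) / len(values)),
        # Count sign changes between neighbouring samples
        'cross': sum((a < 0) != (b < 0) for a, b in zip(values, values[1:])),
    }

# 1-byte samples 0, 18, 69, -69, 127, -128, -1
stats = fragment_stats(b'\x00\x12\x45\xbb\x7f\x80\xff', 1)
for name, value in stats.items():
    print(name, value)
```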

getsample(fragment, width, index) returns the value of the sample at position index in the fragment.

findfactor(fragment, reference) returns a factor F that minimizes rms(add(fragment, mul(reference, -F))). In other words, it returns the factor by which reference should be multiplied to match fragment as closely as possible. Both fragments should contain 2-byte samples.
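Minimizing that root mean square is an ordinary least-squares problem with a closed-form answer, F = sum(f * r) / sum(r * r). The sketch below applies it to plain lists of integers instead of byte fragments; `find_factor` is a hypothetical name for this example.

```python
def find_factor(fragment_values, reference_values):
    """Hypothetical helper: least-squares F minimizing sum((f - F * r) ** 2)."""
    num = sum(f * r for f, r in zip(fragment_values, reference_values))
    den = sum(r * r for r in reference_values)
    return num / den

reference = [100, -200, 300, -400]
fragment = [v * 0.5 for v in reference]  # the reference scaled by 0.5
print(find_factor(fragment, reference))  # 0.5
```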

findfit(fragment, reference) tries to match reference as closely as possible to a portion of fragment, which should be the longer of the two. It returns a tuple (offset, factor).
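One way to picture this search is to slide the reference along the fragment, compute the best scale factor at each offset, and keep the offset with the smallest leftover error. The sketch below does that on plain integer lists. `find_fit` is a hypothetical helper, and the scoring used inside audioop may differ from this residual-energy criterion.

```python
def find_fit(values, reference):
    """Hypothetical helper: return (offset, factor) of the best-matching window."""
    ref_energy = sum(r * r for r in reference)
    best = None
    for offset in range(len(values) - len(reference) + 1):
        window = values[offset:offset + len(reference)]
        # Least-squares scale factor for this window
        factor = sum(w * r for w, r in zip(window, reference)) / ref_energy
        # Energy left over after subtracting the scaled reference
        residual = sum((w - factor * r) ** 2 for w, r in zip(window, reference))
        if best is None or residual < best[0]:
            best = (residual, offset, factor)
    return best[1], best[2]

reference = [5, -10, 15]
values = [0, 0, 10, -20, 30, 0, 0]  # the reference, doubled, starting at index 2
print(find_fit(values, reference))
```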

findmax(fragment, length) searches fragment for the slice of length samples (not bytes) with the largest energy. That is, it returns the i for which rms(fragment[i * 2: (i + length) * 2]) is maximal.
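For a fixed window length, the largest root mean square is the same as the largest sum of squares, so the search can be sketched as a sliding window over plain integers. `find_max_energy` is a hypothetical name, and the brute-force loop is chosen for clarity, not speed.

```python
def find_max_energy(values, length):
    """Hypothetical helper: start index of the `length`-sample window with most energy."""
    best_i, best_energy = 0, -1
    for i in range(len(values) - length + 1):
        energy = sum(v * v for v in values[i:i + length])
        if energy > best_energy:
            best_i, best_energy = i, energy
    return best_i

values = [1, 2, 50, -60, 3, 4, 5]
print(find_max_energy(values, 2))  # 2, the window [50, -60]
```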

IV. Fragment operations

The functions in this section return fragments. In their parameter lists, f stands for fragment, w for width, L for lfactor, and R for rfactor.
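lfactor and rfactor are the weights that audioop.tomono and audioop.tostereo apply to the left and right channels. The sketch below shows the idea on interleaved lists of integers, not byte fragments. `to_mono` and `to_stereo` are hypothetical helpers, and the truncation with `int` is an assumption that may not match audioop's rounding or clipping.

```python
def to_mono(stereo_values, lfactor, rfactor):
    """Hypothetical helper: mix interleaved [L0, R0, L1, R1, ...] into mono."""
    lefts, rights = stereo_values[0::2], stereo_values[1::2]
    return [int(l * lfactor + r * rfactor) for l, r in zip(lefts, rights)]

def to_stereo(mono_values, lfactor, rfactor):
    """Hypothetical helper: spread mono samples over two weighted channels."""
    out = []
    for v in mono_values:
        out.extend((int(v * lfactor), int(v * rfactor)))
    return out

stereo = [100, 200, -50, 50]
print(to_mono(stereo, 0.5, 0.5))    # average of the two channels
print(to_stereo([100, -40], 1, 1))  # same signal on both channels
```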

audioop.ratecv(f, w, nchannels, inrate, outrate, state[, weightA[, weightB]])

converts the frame rate of the input fragment and returns a tuple (newfragment, newstate), where

state is a tuple representing the state of the converter; pass None on the first call and the returned newstate on later calls

weightA and weightB are parameters for a simple digital filter and default to 1 and 0.
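Rate conversion can be pictured as reading the input at fractional positions. The sketch below resamples a mono list of integers by linear interpolation. It is an assumption-laden illustration and not the ratecv algorithm: it ignores state, the filter weights, and multiple channels, and `resample_linear` is a hypothetical name.

```python
def resample_linear(values, inrate, outrate):
    """Hypothetical helper: resample mono samples by linear interpolation."""
    if len(values) < 2:
        return list(values)
    n_out = int(len(values) * outrate / inrate)
    last = len(values) - 1
    out = []
    for k in range(n_out):
        # Position of output sample k, measured in input samples (clamped)
        pos = min(k * inrate / outrate, last)
        i = min(int(pos), last - 1)
        frac = pos - i
        out.append(round(values[i] * (1 - frac) + values[i + 1] * frac))
    return out

up = resample_linear([0, 100, 200, 300], 8000, 16000)
print(up)                                 # twice as many samples
print(resample_linear(up, 16000, 8000))   # back to the original rate
```

A streaming converter would also need to carry the fractional position and the last sample from one chunk to the next, which is the role the state tuple plays in ratecv.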
