How to Augment Audio Data in Python

2025-03-28 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/03

This article describes how to augment audio data in Python. It covers four common techniques and gives a short code example for each.

The classic deep learning network AlexNet used data augmentation to enlarge its training set and achieved better classification results. In image tasks, data is augmented by translation, flipping, adding noise, and so on. How can the same be done for audio?

There are four main ways to augment audio data:

Audio clipping (Clip)

Audio rotation (Roll)

Audio tuning (Tune)

Audio noise (Noise)

Audio loading and analysis use the librosa audio library; array operations use the scipy and numpy scientific computing libraries.

Here is how each method is implemented in Python:

Audio clipping

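    import librosa
    from scipy.io import wavfile

    # read audio (librosa.load resamples to 22050 Hz mono by default)
    y, sr = librosa.load("../data/love_illusion.mp3")
    print(y.shape, sr)
    # keep seconds 20 to 40; wavfile.write always writes WAV data, whatever the extension
    wavfile.write("../data/love_illusion_20s.mp3", sr, y[20 * sr:40 * sr])  # write audio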
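For augmentation, the clip offset is often drawn at random rather than fixed at 20 seconds. The following is a minimal sketch of that idea, not part of the original article: `random_clip` is a hypothetical helper, and a synthetic sine wave stands in for the loaded audio so that the example runs without any file.

```python
import numpy as np

def random_clip(y, sr, seconds, rng=None):
    """Return a random contiguous excerpt of `seconds` seconds from y (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    n = int(seconds * sr)
    if n >= len(y):
        return y.copy()  # signal is not longer than the requested clip
    # rng.integers excludes the upper bound, so +1 allows the last valid offset
    start = rng.integers(0, len(y) - n + 1)
    return y[start:start + n]

# Synthetic stand-in for a loaded waveform: 5 s of a 440 Hz tone at 8 kHz.
sr = 8000
y = np.sin(2 * np.pi * 440 * np.arange(5 * sr) / sr).astype(np.float32)

clip = random_clip(y, sr, seconds=2, rng=np.random.default_rng(0))
print(clip.shape)
```

Passing a seeded generator keeps the augmentation reproducible across runs.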

Audio rotation

    import librosa
    import numpy as np
    from scipy.io import wavfile

    y, sr = librosa.load("../data/raw/love_illusion_20s.mp3")  # read audio
    # shift by 10 seconds; samples pushed past the end wrap around to the start
    y = np.roll(y, sr * 10)
    print(y.shape, sr)
    wavfile.write("../data/raw/xxx_roll.mp3", sr, y)  # write audio
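To see what the rotation does, it helps to apply `np.roll` to a tiny array first. The sketch below also shows a random shift; `random_roll` is a hypothetical helper name, not something from the article.

```python
import numpy as np

a = np.arange(6)
rolled = np.roll(a, 2)  # the last two elements wrap around to the front
print(rolled)  # [4 5 0 1 2 3]

def random_roll(y, rng=None):
    """Rotate y by a random number of samples (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    shift = rng.integers(0, len(y))
    return np.roll(y, shift)

shifted = random_roll(a, rng=np.random.default_rng(1))
print(shifted)
```

Rotation keeps every sample and the total length, but the wrap point joins the end of the clip to its beginning, which can produce an audible discontinuity.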

Audio tuning (note: the resize function of the cv2 library performs the interpolation)

    import cv2
    import librosa
    from scipy.io import wavfile

    y, sr = librosa.load("../data/raw/love_illusion_20s.mp3")  # read audio
    ly = len(y)
    # stretch the signal to 1.2 times its length by interpolation
    y_tune = cv2.resize(y, (1, int(len(y) * 1.2))).squeeze()
    lc = len(y_tune) - ly
    # crop the middle so the result has the original length
    y_tune = y_tune[int(lc / 2):int(lc / 2) + ly]
    print(y.shape, sr)
    wavfile.write("../data/raw/xxx_tune.mp3", sr, y_tune)  # write audio
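The same stretch-and-crop idea can be sketched with NumPy alone, which avoids the OpenCV dependency. This is an alternative under stated assumptions, not the article's method: `stretch_and_crop` is a hypothetical helper, it uses linear interpolation via `np.interp`, and its output is similar to but not sample-identical with the `cv2.resize` result.

```python
import numpy as np

def stretch_and_crop(y, factor=1.2):
    """Stretch y by `factor` with linear interpolation, then crop the middle
    back to the original length (hypothetical helper, assumes factor >= 1)."""
    if factor < 1:
        raise ValueError("this sketch only handles factor >= 1")
    n = len(y)
    m = int(n * factor)
    # positions in the original signal at which the stretched signal is sampled
    positions = np.linspace(0, n - 1, num=m)
    stretched = np.interp(positions, np.arange(n), y)
    start = (m - n) // 2
    return stretched[start:start + n].astype(y.dtype)

# Synthetic stand-in for a loaded waveform: 1 s of a 440 Hz tone at 8 kHz.
sr = 8000
y = np.sin(2 * np.pi * 440 * np.arange(sr) / sr).astype(np.float32)
y_tune = stretch_and_crop(y, factor=1.2)

# With a 1 s signal at 8 kHz, FFT bin k corresponds to k Hz.
peak_hz = int(np.argmax(np.abs(np.fft.rfft(y_tune))))
print(len(y_tune), peak_hz)
```

Because the stretched signal is played back at the same sample rate, both tempo and pitch drop by the stretch factor, so the 440 Hz tone ends up near 440 / 1.2 Hz.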

Audio noise (note: when adding random noise, leave zero-valued samples at 0, otherwise the result sounds harsh)

    import librosa
    import numpy as np
    from scipy.io import wavfile

    y, sr = librosa.load("../data/raw/love_illusion_20s.mp3")  # read audio
    wn = np.random.randn(len(y))  # white noise
    # add scaled noise only where the signal is nonzero; zeros stay 0
    y = np.where(y != 0, y + 0.02 * wn, 0.0)
    print(y.shape, sr)
    wavfile.write("../data/raw/love_illusion_20s_w.mp3", sr, y)  # write audio
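The zero-preserving behaviour is easy to check on a tiny array. The sketch below is illustrative only: `add_noise` and its `scale` parameter are hypothetical names, and it assumes silence is stored as exact zeros, which is typical of padding rather than of real recordings.

```python
import numpy as np

def add_noise(y, scale=0.02, rng=None):
    """Add scaled Gaussian noise to nonzero samples only (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    wn = rng.standard_normal(len(y)).astype(y.dtype)
    # samples that are exactly 0 stay 0; all others receive noise
    return np.where(y != 0, y + scale * wn, 0.0).astype(y.dtype)

y = np.array([0.0, 0.5, 0.0, -0.5], dtype=np.float32)
noisy = add_noise(y, rng=np.random.default_rng(0))
print(noisy)
```

Casting back to the input dtype keeps the array in float32, which matches what `librosa.load` returns.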
