How to understand the MFCC algorithm of speech signal

Updated 2025-01-18. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains how to understand the MFCC algorithm for speech signals, which many people find hard to grasp. The summary below is meant to make it easier to follow.

For speech processing, you can use an integrated toolkit such as voicebox.

Add the voicebox toolkit to the toolbox directory of Matlab 7. After downloading it from the official site, enter the following in the command window:

>> addpath(genpath('E:\soft\Matlab\toolbox\voicebox'))
>> savepath

Replace the path with the location of voicebox on your own PC.

## MFCC process ##

MFCC is a process like this:

## Reading data with Matlab ##

[x] = wavread('turan.wav', [1018 1400]);

The return value x is a 382×2 array; the two columns are the left and right channels. This statement reads the audio file named turan.wav and returns the audio data in x. The audio data is normalized to the range [-1, 1].

In C, the speech data is read into a one-dimensional array, so the left and right channels have to be separated. To make it easier to compare debug data between the two implementations, the Matlab side is also switched to mono:

x = x(:, 1); % get left channel data
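On the C side, the same separation could be sketched as follows. This is a minimal sketch, not code from the original article: the function name extract_left_channel is hypothetical, and it assumes the samples are stored interleaved (left, right, left, right, ...), which is the usual layout for stereo PCM data.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the left channel out of an interleaved stereo buffer.
 * Assumption: samples are laid out as L0, R0, L1, R1, ...
 * interleaved holds 2 * frames samples; left has room for frames samples. */
void extract_left_channel(const float *interleaved, size_t frames, float *left)
{
    assert(interleaved != NULL && left != NULL);
    for (size_t i = 0; i < frames; i++)
        left[i] = interleaved[2 * i];  /* even indices hold the left channel */
}
```

Taking only the left channel mirrors the Matlab statement x = x(:, 1), so both sides work on the same mono data.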

## Pre-emphasis in C ##

void PreEmphasise(float *s, float k)
{
    int i;
    float preE; /* pre-emphasis coefficient */

    preE = k;
    /* s is indexed from 1, so s[1] is the first sample;
       framesize is not defined in this excerpt */
    for (i = framesize; i >= 2; i--)
        s[i] -= s[i - 1] * preE;
    s[1] *= 1.0 - preE;
}

Pre-emphasis in the time domain looks much like a difference, and it acts as a high-pass filter in the frequency domain. Taking the Z-transform of S2(n) = S(n) - a*S(n-1) gives the transfer function H(z) = 1 - a*z^(-1). Plotting its frequency response shows directly that this is a high-pass filter, which enhances the formants of the speech signal. For background on the Z-transform, see Oppenheim's "Signals & Systems", p. 534. If it is still unclear, work through a few signals-and-systems exercises.

## Framing and windowing ##

We treat the speech signal as stationary over short intervals, while the actual signal may be very long. To make the whole signal manageable, only a short segment (10 ms to 30 ms) is processed at a time. This is framing. Adjacent frames partially overlap to preserve the continuity of the signal.
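Framing with overlap can be sketched in C as below. The names frame_count and get_frame are hypothetical, and dropping the trailing samples that do not fill a whole frame is a simplifying assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of complete frames of length frame_len, taken every hop samples,
 * that fit in a signal of n samples. Trailing samples that do not fill a
 * whole frame are ignored in this sketch. */
size_t frame_count(size_t n, size_t frame_len, size_t hop)
{
    assert(frame_len > 0 && hop > 0);
    if (n < frame_len)
        return 0;
    return 1 + (n - frame_len) / hop;
}

/* Copy frame number index (0-based) of the signal into frame.
 * The caller must ensure index < frame_count(n, frame_len, hop). */
void get_frame(const float *signal, size_t frame_len, size_t hop,
               size_t index, float *frame)
{
    assert(signal != NULL && frame != NULL);
    memcpy(frame, signal + index * hop, frame_len * sizeof(float));
}
```

For example, assuming a 16 kHz sampling rate, a 25 ms frame is 400 samples and a 10 ms hop is 160 samples, so consecutive frames share 240 samples.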

Truncating the signal directly (that is, applying a rectangular window) causes spectral leakage, so a Hamming window is usually applied instead. The Hamming window looks much like a sine function; you can draw it with the Matlab command plot(hamming(100)).

## The role of FFT ##

In general applications the FFT is often used for filtering: the forward transform is taken, some spectral components are removed, and the inverse transform turns the result back into a signal. The FFT has another use, though. In this application, an FFT is performed on each frame to compute the spectrum and obtain the amplitude spectrum, which is then passed to the Mel-scale triangular filter bank.

Revisiting the FFT algorithm this time really felt like reliving old times. Going back over the signals and systems and digital signal processing material I learned before, I found that these subjects really are worthwhile.
