2025-03-02 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article walks through how to implement speech and text processing in Python with PaddleSpeech: installing the environment, then verifying speech synthesis (TTS), speech recognition (ASR), and punctuation recovery.
Environment installation
First, take a look at the project structure and the installation documentation.
The project needs Python 3.7 or above, a C++ build environment, the packages in requirements.txt, and so on. The steps below follow the order I used.
1. Create a Python 3.9 virtual environment with conda
Use conda to create the Python 3.9 environment with the following command.
conda create -n py39 python=3.9
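After activating the environment, a quick check from inside Python confirms that the interpreter meets the Python 3.7+ requirement mentioned above. This is only a minimal sketch; `meets_minimum` is a hypothetical helper name and not part of PaddleSpeech.

```python
import sys

# Minimum version taken from the requirement stated above (Python 3.7 or later).
MIN_VERSION = (3, 7)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the given interpreter version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old"
    print("Python", sys.version.split()[0], status)
```

Run it with the interpreter of the new environment, so that it reports the version PaddleSpeech will actually use.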
2. Install Visual Studio 2019
Download: Microsoft C++ Build Tools - Visual Studio
Note that the "Desktop development with C++" workload must be selected during installation.
3. Install requirements.txt
Install the packages listed in requirements.txt with the following command:
pip install -r requirements.txt -i https://pypi.douban.com/simple
Note that it does not matter if the paddlespeech_ctcdecoders installation fails; it can be skipped.
4. Install paddlepaddle and paddlespeech
The commands are as follows:
pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
pip install paddlespeech -i https://pypi.tuna.tsinghua.edu.cn/simple
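Before moving on, it can help to confirm that both packages are visible to the active environment. The sketch below only looks the packages up and does not import them, so it returns quickly; `missing_packages` is a hypothetical helper name. Note that the paddlepaddle package is imported as `paddle`.

```python
import importlib.util

# Import names for the two packages installed in this step.
REQUIRED = ("paddle", "paddlespeech")


def missing_packages(names=REQUIRED):
    """Return the top-level names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result means both installs are discoverable; it does not prove that they load without errors.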
5. Download nltk_data
Follow the instructions in the project installation documentation.
My local directory address is as follows
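NLTK searches a few standard locations for its data, one of which is an `nltk_data` folder in the user's home directory. Assuming that is where the archive was unpacked (the project installation documentation is the authority here), a check like the following confirms the folder is in place. `find_nltk_data` is a hypothetical helper name.

```python
from pathlib import Path


def find_nltk_data(home=None):
    """Return the nltk_data directory under `home` if it exists, else None."""
    # Default to the current user's home directory (an assumption, see above).
    base = Path(home) if home is not None else Path.home()
    candidate = base / "nltk_data"
    return candidate if candidate.is_dir() else None


if __name__ == "__main__":
    found = find_nltk_data()
    print("nltk_data:", found if found else "not found in the home directory")
```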
Project verification
Let me verify the TTS, ASR, and punctuation recovery functions in turn.
TTS speech synthesis
Use the command as follows:
paddlespeech tts --input "Nanjing is very cold now. Let's go to the Confucius Temple next time." --output C:\Users\xxx\Desktop\115.wav
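The same synthesis can also be driven from Python instead of the command line. The sketch below assumes the `TTSExecutor` class that PaddleSpeech exposes under `paddlespeech.cli.tts.infer`; `synthesize` is a hypothetical wrapper name, and the import is done lazily so the module loads even where PaddleSpeech is not installed.

```python
def synthesize(text, output_path):
    """Synthesize `text` to a WAV file with PaddleSpeech's default models."""
    # Imported here so that merely loading this file does not need PaddleSpeech.
    from paddlespeech.cli.tts.infer import TTSExecutor

    tts = TTSExecutor()
    # With no other arguments the executor falls back to its default models,
    # which are downloaded on first use.
    return tts(text=text, output=output_path)
```

The default models are Mandarin ones (the model directory names in the log end in `csmsc-zh`), so the text passed to `synthesize` should be Chinese.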
Execution process
(dh_partner) D:\spyder\PaddleSpeech>paddlespeech tts --input "Nanjing is very cold now. Let's go to Confucius Temple next time." --output C:\Users\xxx\Desktop\115.wav
phones_dict: None
[2022-01-05 17:23:43,642] [INFO] [log.py] [L57] - File C:\Users\huyi\.paddlespeech\models\fastspeech2_csmsc-zh\fastspeech2_nosil_baker_ckpt_0.4.zip md5 checking...
[2022-01-05 17:23:44,742] [INFO] [log.py] [L57] - Use pretrained model stored in: C:\Users\huyi\.paddlespeech\models\fastspeech2_csmsc-zh\fastspeech2_nosil_baker_ckpt_0.4
self.phones_dict: C:\Users\huyi\.paddlespeech\models\fastspeech2_csmsc-zh\fastspeech2_nosil_baker_ckpt_0.4\phone_id_map.txt
[2022-01-05 17:23:44,743] [INFO] [log.py] [L57] - C:\Users\huyi\.paddlespeech\models\fastspeech2_csmsc-zh\fastspeech2_nosil_baker_ckpt_0.4
[2022-01-05 17:23:44,744] [INFO] [log.py] [L57] - C:\Users\huyi\.paddlespeech\models\fastspeech2_csmsc-zh\fastspeech2_nosil_baker_ckpt_0.4\default.yaml
[2022-01-05 17:23:44,744] [INFO] [log.py] [L57] - C:\Users\huyi\.paddlespeech\models\fastspeech2_csmsc-zh\fastspeech2_nosil_baker_ckpt_0.4\snapshot_iter_76000.pdz
self.phones_dict: C:\Users\huyi\.paddlespeech\models\fastspeech2_csmsc-zh\fastspeech2_nosil_baker_ckpt_0.4\phone_id_map.txt
[2022-01-05 17:23:44,745] [INFO] [log.py] [L57] - File C:\Users\huyi\.paddlespeech\models\pwgan_csmsc-zh\pwg_baker_ckpt_0.4.zip md5 checking...
[2022-01-05 17:23:44,782] [INFO] [log.py] [L57] - Use pretrained model stored in: C:\Users\huyi\.paddlespeech\models\pwgan_csmsc-zh\pwg_baker_ckpt_0.4
[2022-01-05 17:23:44,783] [INFO] [log.py] [L57] - C:\Users\huyi\.paddlespeech\models\pwgan_csmsc-zh\pwg_baker_ckpt_0.4
[2022-01-05 17:23:44,783] [INFO] [log.py] [L57] - C:\Users\huyi\.paddlespeech\models\pwgan_csmsc-zh\pwg_baker_ckpt_0.4\pwg_default.yaml
[2022-01-05 17:23:44,785] [INFO] [log.py] [L57] - C:\Users\huyi\.paddlespeech\models\pwgan_csmsc-zh\pwg_baker_ckpt_0.4\pwg_snapshot_iter_400000.pdz
vocab_size: 268
frontend done!
encoder_type is transformer
decoder_type is transformer
C:\Users\huyi\.conda\envs\dh_partner\lib\site-packages\paddle\framework\io.py:415: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.10 it will stop working
  if isinstance(obj, collections.Iterable) and not isinstance(obj
[2022-01-05 17:23:51] [DEBUG] [__init__.py:113] Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\huyi\AppData\Local\Temp\jieba.cache
[2022-01-05 17:23:51] [DEBUG] [__init__.py:132] Loading model from cache C:\Users\huyi\AppData\Local\Temp\jieba.cache
Loading model cost 0.659 seconds.
[2022-01-05 17:23:52] [DEBUG] [__init__.py:164] Loading model cost 0.659 seconds.
Prefix dict has been built successfully.
[2022-01-05 17:23:52] [DEBUG] [__init__.py:166] Prefix dict has been built successfully.
C:\Users\huyi\.conda\envs\dh_partner\lib\site-packages\paddle\fluid\dygraph\math_op_patch.py:251: UserWarning: The dtype of left and right variables are not the same, left dtype is paddle.int64, but right dtype is paddle.int32, the right dtype will convert to paddle.int64
  warnings.warn(
[2022-01-05 17:23:58,811] [INFO] [log.py] [L57] - Wave file has been generated: C:\Users\xxx\Desktop\115.wav
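The last line of the log reports where the audio was written. When the command is run from a script, that line can be picked out of the captured output. The sketch below is one way to do it with a regular expression, assuming the message keeps the wording "Wave file has been generated:" seen in this run; `generated_wav_path` is a hypothetical helper name.

```python
import re

# Matches the final INFO line printed by the tts command in the log above.
_GENERATED = re.compile(r"Wave file has been generated:\s*(.+?)\s*$", re.MULTILINE)


def generated_wav_path(log_text):
    """Return the output path reported in the log, or None if it is absent."""
    match = _GENERATED.search(log_text)
    return match.group(1) if match else None
```

Matching on the message text rather than on the timestamp or log level keeps the helper independent of the logging format around it.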
The generated audio is as follows
ASR speech recognition
I feed the audio generated by TTS into ASR to see how well it is recognized. The command is as follows:
paddlespeech asr --lang zh --input C:\Users\xxx\Desktop\115.wav
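Before recognition it can be useful to look at the properties of the WAV file, since speech models are trained for a particular sample rate. As far as I know the default ASR model works on 16 kHz audio and the command may offer to resample a file recorded at another rate; treat that as an assumption rather than something shown in this article. The sketch below reads the file header with the standard `wave` module and wraps the `ASRExecutor` class from `paddlespeech.cli.asr.infer`; `wav_info` and `recognize` are hypothetical helper names.

```python
import wave


def wav_info(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return rate, wav.getnchannels(), frames / float(rate)


def recognize(path):
    """Transcribe a WAV file with PaddleSpeech's default ASR model."""
    # Imported here so that wav_info can be used without PaddleSpeech installed.
    from paddlespeech.cli.asr.infer import ASRExecutor

    asr = ASRExecutor()
    return asr(audio_file=path)
```

Calling `wav_info` on the file produced by the TTS step shows its sample rate and length before it is handed to `recognize`.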
The execution result is as follows
You can see that the last line printed is the recognized text, without punctuation, and it is fairly accurate.
Punctuation recovery
Try punctuation recovery on the same sentence. The command is as follows:
paddlespeech text --task punc --input "Nanjing is very cold now. Go to the Confucius Temple next time."
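Punctuation recovery is meant for text that has no punctuation, such as the ASR output above. The sketch below shows both halves under stated assumptions: a helper that strips punctuation from a sentence to imitate raw ASR output, and a wrapper around the `TextExecutor` class from `paddlespeech.cli.text.infer`. Both `strip_punctuation` and `restore_punctuation` are hypothetical names.

```python
import unicodedata


def strip_punctuation(text):
    """Remove every Unicode punctuation character from `text`."""
    # Unicode general categories starting with "P" cover all punctuation,
    # including full-width Chinese marks.
    return "".join(ch for ch in text if not unicodedata.category(ch).startswith("P"))


def restore_punctuation(text):
    """Add punctuation to `text` with PaddleSpeech's default punctuation model."""
    # Imported here so strip_punctuation works without PaddleSpeech installed.
    from paddlespeech.cli.text.infer import TextExecutor

    executor = TextExecutor()
    return executor(text=text, task="punc")
```

Feeding the result of `strip_punctuation` into `restore_punctuation` gives a simple round trip for comparing the restored sentence with the original.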
Execution result
That covers how to implement speech and text processing with Python and PaddleSpeech. I hope the content above is of some help to you.