How to write a voice playback software with Python

2025-07-09 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains how to write voice playback software in Python, with analysis and answers for each step, in the hope of helping readers who face the same problem find a simpler way to solve it.

Organizations often use a broadcast to announce ad hoc matters (text is converted into speech and played through a power amplifier). However, most voice playback software on the market is paid, and it tends either to have distorted pronunciation or to be unstable, with unexplained failures that can easily disrupt work. After learning Python for this long, it makes sense to write your own voice broadcast software: even if something goes wrong, you can fix it yourself.

one

Interface design

Of course, the requirements should be analyzed before building anything. The core function here is to take a piece of notification text typed into the software, convert it into speech, and play it.

Although this feature is not complex, it still needs an interactive interface, so I decided to use Tkinter to implement it.

Step one, create a window

Set the title, size and other properties, and make the window non-resizable to avoid layout problems.
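A minimal sketch of this step. The title and the 500x400 size are hypothetical values chosen for illustration; the article does not state them.

```python
import tkinter as tk

# Hypothetical values: the article does not give the title or the size.
WINDOW_TITLE = "Voice Playback"
WINDOW_GEOMETRY = "500x400"  # width x height in pixels


def create_window() -> tk.Tk:
    """Create the main window with a fixed size."""
    root = tk.Tk()
    root.title(WINDOW_TITLE)
    root.geometry(WINDOW_GEOMETRY)
    # Disable resizing in both directions so the layout cannot be distorted.
    root.resizable(False, False)
    return root
```

Calling `create_window().mainloop()` would show the empty window.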

Step two, set up a control

To accept the input text, a Text control with a scroll bar is used here.
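One way to get a Text control with a scroll bar is the standard library's `ScrolledText` widget. The sketch below assumes that choice; the widget size and the helper names `add_text_input` and `read_text` are hypothetical.

```python
import tkinter as tk
from tkinter import scrolledtext


def add_text_input(parent: tk.Misc) -> scrolledtext.ScrolledText:
    """Add a multi-line text box with a vertical scroll bar to `parent`."""
    # Width and height are in characters and lines; both are assumed values.
    text_box = scrolledtext.ScrolledText(parent, width=60, height=12, wrap=tk.WORD)
    text_box.pack(padx=10, pady=10)
    return text_box


def read_text(text_box) -> str:
    """Return the box contents with surrounding whitespace removed.

    Tkinter always appends a newline to the text, so it is stripped here.
    """
    return text_box.get("1.0", tk.END).strip()
```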

Step 3, provide options

For voice playback software, basic style settings such as speed and tone are still necessary. Here, Combobox controls provide fixed options, so users can choose a different voice, speed and tone as the situation requires.
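A possible sketch of the option controls using `ttk.Combobox`. The option labels and the helper name `add_choice` are assumptions, since the article does not list the exact wording shown in the interface.

```python
import tkinter as tk
from tkinter import ttk

# Hypothetical option labels, for illustration only.
VOICE_CHOICES = ("Female voice", "Male voice", "Mixed voice")
SPEED_CHOICES = ("Slow", "Medium", "Fast")
TONE_CHOICES = ("Low", "Medium", "High")


def add_choice(parent: tk.Misc, label: str, choices: tuple) -> ttk.Combobox:
    """Add a labelled, read-only drop-down list and preselect its first entry."""
    ttk.Label(parent, text=label).pack(side=tk.LEFT, padx=5)
    # state="readonly" stops users from typing values outside the fixed list.
    box = ttk.Combobox(parent, values=choices, state="readonly", width=12)
    box.current(0)  # show a default so the selection is never empty
    box.pack(side=tk.LEFT, padx=5)
    return box
```

The selected label can later be read with `box.get()`.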

Step four, set up triggers for the functional events

Set up three Button controls to trigger voice playback, text clearing, and exiting the interface, respectively.
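The three buttons might be wired up as in the sketch below. The button captions, the layout, and the helper names `clear_text` and `add_buttons` are assumptions; `on_play` stands for whatever playback callback the program defines.

```python
import tkinter as tk


def clear_text(text_box) -> None:
    """Delete everything in the text box."""
    text_box.delete("1.0", tk.END)


def add_buttons(root: tk.Tk, text_box: tk.Text, on_play) -> None:
    """Add the Play, Clear and Exit buttons in one row."""
    row = tk.Frame(root)
    row.pack(pady=10)
    tk.Button(row, text="Play", width=10, command=on_play).pack(side=tk.LEFT, padx=5)
    tk.Button(row, text="Clear", width=10,
              command=lambda: clear_text(text_box)).pack(side=tk.LEFT, padx=5)
    # destroy() closes the window and ends mainloop().
    tk.Button(row, text="Exit", width=10, command=root.destroy).pack(side=tk.LEFT, padx=5)
```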

[Figure: the final interface]

two

Voice playback

The "clear" and "exit" functions are relatively simple, so the focus here is on the core function: voice playback.

1) Voice interface

Baidu Cloud's REST API is recommended for converting text to speech. Log in at http://ai.baidu.com/, open the Voice Technology page in the console, and create your own voice application (figure below). Its three parameters, AppID, API Key and Secret Key, will be used in the code.

Then install the Python SDK with pip install baidu-aip. The function prototype looks like this:

    from aip import AipSpeech

    # Credentials of the voice application created in the console
    APP_ID = 'XXXXXX'
    API_KEY = 'XXXXXXXXXXXXX'
    SECRET_KEY = 'XXXXXXXXXXXXXXXXXXXXXX'

    client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)
    result = client.synthesis(text, 'zh', 1, {'per': 1, 'vol': 15, 'pit': 9, 'spd': 5})

text: the text to be converted.

per: speaker selection. 0 is a female voice, 1 is a male voice, 3 is emotional synthesis (Du Xiaoyao), 4 is emotional synthesis (Du Yaya). The default is the ordinary female voice.

vol: volume, in the range 0-15. The default is 5 (medium volume).

pit: tone, in the range 0-9. The default is 5 (medium tone).

spd: speech speed, in the range 0-9. The default is 5 (medium speed).

'zh' and 1 are the language and the client type, respectively. Both are fixed values and cannot be modified.

It can be seen that the three styles of pronunciation, tone and speed we need can be realized by modifying the parameters.
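As a rough sketch of how the call could be wrapped, the helper below passes the style parameters through and saves the audio. The name `synthesize_to_file` is hypothetical. It relies on the SDK's behavior of returning audio bytes on success and a dict describing the error on failure.

```python
def synthesize_to_file(client, text, path, per=0, vol=5, pit=5, spd=5):
    """Convert `text` to speech with `client.synthesis` and save it to `path`.

    `client` is expected to be an AipSpeech instance, e.g.
    AipSpeech(APP_ID, API_KEY, SECRET_KEY) from the baidu-aip package.
    """
    options = {"per": per, "vol": vol, "pit": pit, "spd": spd}
    result = client.synthesis(text, "zh", 1, options)
    # Audio bytes mean success; a dict carries the error code and message.
    if isinstance(result, dict):
        raise RuntimeError(f"Speech synthesis failed: {result}")
    with open(path, "wb") as audio_file:
        audio_file.write(result)
    return path
```

Raising an exception on failure keeps the caller from trying to play a file that was never written.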

2) Function design

With the speech synthesis interface sorted out, we can combine it with the interface settings to implement the specific functions.

First, the speech style options in the interface must be mapped one to one onto the parameters of the speech synthesis function. This is a typical key-value relationship, so a dictionary is the appropriate data structure.

Next, three voice modes are offered for the pronunciation style: male voice, female voice and mixed voice.

Finally, tone and speed do not need very fine granularity. Three clearly separated levels are offered for each.
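The mapping described above could look like the following sketch. All labels and numeric values are assumptions: the article names the three voice modes but not their `per` codes (mapping "Mixed voice" to 3 is a guess), and it does not give the three tone and speed levels.

```python
# Hypothetical labels and values, for illustration only.
VOICE_MAP = {"Female voice": 0, "Male voice": 1, "Mixed voice": 3}
TONE_MAP = {"Low": 2, "Medium": 5, "High": 8}
SPEED_MAP = {"Slow": 2, "Medium": 5, "Fast": 8}


def build_options(voice, tone, speed, vol=5):
    """Translate the labels selected in the interface into synthesis options."""
    return {
        "per": VOICE_MAP[voice],
        "pit": TONE_MAP[tone],
        "spd": SPEED_MAP[speed],
        "vol": vol,
    }
```

The dictionary keys can double as the values shown in the Combobox controls, so the selected label looks up its parameter directly.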

When the play button is clicked, the text is read from the Text control; if the text is empty, a pop-up prompt asks for re-entry; if the text is not empty, the text is converted to an audio file and played using playsound.
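The flow just described can be sketched as a small function with its collaborators passed in, which keeps it independent of Tkinter and the network. The name `handle_play`, its parameters and the warning text are hypothetical.

```python
def handle_play(text, synthesize, play, warn, audio_path):
    """Run one click of the play button.

    text       -- contents of the Text control
    synthesize -- callable(text, path) that writes the audio file
    play       -- callable(path) that plays it, e.g. playsound from playsound
    warn       -- callable(message) that shows a pop-up prompt
    Returns True if audio was played, False if the text was empty.
    """
    text = text.strip()
    if not text:
        warn("The text is empty, please enter it again.")
        return False
    synthesize(text, audio_path)
    play(audio_path)
    return True
```

In the Tkinter program, `warn` could be a thin wrapper such as `lambda msg: messagebox.showwarning("Notice", msg)`, using `tkinter.messagebox`.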

One issue needs special attention: while the software is running, an audio file that has been generated and played cannot be deleted, modified, or overwritten. The file name produced by each conversion must therefore be unique; otherwise, repeated "play" operations will fail because the newly generated audio file cannot be saved.
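One possible way to avoid name clashes is to combine a timestamp with a random suffix. The article does not say which naming scheme it uses, so the helper `unique_audio_path` and the `voice_` prefix are assumptions.

```python
import os
import time
import uuid


def unique_audio_path(directory="."):
    """Return a practically unique .mp3 path inside `directory`."""
    # The timestamp keeps files sortable; the random suffix avoids clashes
    # when two conversions happen within the same second.
    stamp = time.strftime("%Y%m%d_%H%M%S")
    suffix = uuid.uuid4().hex[:8]
    return os.path.join(directory, f"voice_{stamp}_{suffix}.mp3")
```

Each click of the play button would request a fresh path, so a file that is still held by the player is never overwritten.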

three

Packaging

So far, running this software depends on a local Python development environment, so it cannot easily be handed to other people. The third-party library PyInstaller is recommended for packaging the Python program. First, change to the directory containing the .py file and run the following cmd command. The "-w" option suppresses the command window, and tk_voice is the name of the .py file written above.

pyinstaller -w tk_voice.py

At this point, a dist folder is generated in the same directory; it contains the packaged program. Running the .exe file brings up the interface designed earlier. Enter a test text in the text box, such as "Attention, all personnel, please go downstairs and assemble for dinner immediately.", then click the "play" button to try the effect.

[Audio: test.mp3]

Finally, there are several points to note when using PyInstaller:

PyInstaller does not cross-compile: the program must be packaged on the same operating system and architecture it will run on, and the requirements on the system version are relatively strict. For example, a program packaged on a 64-bit system cannot run on a 32-bit system.

If the program being packaged uses external images or other resource files, PyInstaller does not bundle them by default. Either copy them into the packaged folder manually or include them with its --add-data option.

If packaging with PyInstaller fails, the contents of the original .py file may be lost, so make a backup before packaging.

Import only the libraries the program actually needs. Unnecessary imports are bundled as well and can make the packaged file very large.

Writing voice playback software in Python mainly involves four libraries: Tkinter, baidu-aip, playsound and PyInstaller. The result provides basic speech synthesis and playback, runs without a Python development environment, and is easy to maintain and extend. Its drawbacks are a fairly plain interface and limited functionality; interested readers can modify and improve it themselves.

