2025-01-15 Update From: SLTechnology News&Howtos
How Python Generates Music Through a Deep Neural Network

This article introduces how to generate music in Python with a deep neural network, from representing tunes in ABC notation to training an LSTM model and sampling new melodies from it.
Musical Representation for the Machine Learning Model
We will use ABC music notation. ABC notation is a shorthand form of music notation that uses the letters A to G to represent notes, along with other symbols that add information such as note length, key, and ornamentation.
This notation began as an ASCII character-set code for sharing music online, giving software developers a new, simple language to work with.
The header lines at the start of a tune each consist of a letter followed by a colon. These fields describe various aspects of the tune: an index (X:) to distinguish multiple tunes in one file, the title (T:), the time signature (M:), the default note length (L:), the tune type (R:), and the key (K:). The K: field ends the header, and the melody itself follows it.
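To make the header fields concrete, here is a minimal parsing sketch. The tune itself is made up for illustration (it is not taken from the Nottingham data), and the parsing logic is a simplification of real ABC parsers:

```python
# A minimal illustrative tune in ABC notation (made up for this example,
# not taken from the Nottingham database).
abc_tune = """X:1
T:Example Tune
M:4/4
L:1/8
R:reel
K:G
GABc dedB | dedB dedB | c2ec B2dB | A2F2 G4 |"""

# Collect the one-letter header fields (X:, T:, M:, L:, R:, K:) into a dict;
# everything after the K: line is the melody itself.
header = {}
melody_lines = []
in_header = True
for line in abc_tune.splitlines():
    if in_header and len(line) > 1 and line[1] == ':':
        header[line[0]] = line[2:].strip()
        if line[0] == 'K':        # the key signature ends the header
            in_header = False
    else:
        melody_lines.append(line)

print(header['T'])    # title
print(header['M'])    # time signature
print(melody_lines)   # the tune body
```

This shows why a character-level model works well on ABC files: both the structured header and the melody are plain text.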
Music data set
In this article, we will use the open-source Nottingham Music Database, which contains more than 1000 folk tunes, most of them converted to ABC notation: http://abc.sourceforge.net/NMD/
Data processing
The data is currently in a character-based categorical format. In the data-processing stage, we need to convert it into an integer-based numerical format to prepare it for the neural network.
Here each character is mapped to a unique integer, which can be done with a single line of code. The "text" variable holds the input data.
char_to_idx = {ch: i for (i, ch) in enumerate(sorted(list(set(text))))}
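The mapping can be sketched end-to-end on a toy string. The short `text` below is an illustrative stand-in for the real corpus, and `idx_to_char` is the inverse mapping that will be needed later to turn sampled integers back into characters:

```python
import numpy as np

# Illustrative stand-in for the real corpus; in the article, `text` is the
# full ABC file read from disk.
text = "X:1\nK:G\nGABc dedB |"

# Map each unique character to an integer, exactly as in the article.
char_to_idx = {ch: i for (i, ch) in enumerate(sorted(list(set(text))))}
# The inverse mapping is needed later to decode sampled integers into text.
idx_to_char = {i: ch for (ch, i) in char_to_idx.items()}
vocab_size = len(char_to_idx)

# Encode the whole corpus as integers.
T = np.asarray([char_to_idx[c] for c in text], dtype=np.int32)

# Round trip: decoding T must reproduce the original text.
decoded = ''.join(idx_to_char[i] for i in T)
```

The round trip confirms the encoding is lossless, which is what lets the generation step emit valid ABC text at the end.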
To train the model, we use this vocabulary to convert the entire text into numerical form.
T = np.asarray([char_to_idx[c] for c in text], dtype=np.int32)
Model Selection for Machine Learning Music Generation
Traditional machine learning models cannot store information about previous steps. Recurrent neural networks (commonly known as RNNs), however, can.
An RNN has a repeating module that takes the input from the previous step and feeds its output into the next step. A plain RNN can only retain information from the most recent steps, so our network needs more memory to learn long-term dependencies. This is where Long Short-Term Memory networks (LSTMs) come in.
LSTMs are a special case of RNNs: they have the same chain structure, but a different repeating-module structure.
We use an RNN here because:
The length of the data does not need to be fixed; each input can have a different length.
It can store sequence history.
Various combinations of input and output sequence lengths can be used.
In addition to a plain RNN, we will tailor the network to our use case with a few adjustments. We will use a "character-level RNN", in which the inputs, outputs, and intermediate outputs are all characters.
Many-to-many RNN
Since we need to generate an output at every timestep, we will use a many-to-many RNN. To implement this, we set the parameter "return_sequences" to true so that a character is produced at each timestep. You can understand this better from figure 5 below.
In the figure, the blue units are the inputs, the yellow units are the hidden units, and the green units are the outputs. This is a brief overview of a many-to-many RNN.
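The many-to-many idea can be sketched with a bare-bones vanilla RNN cell in NumPy. The sizes and weights below are arbitrary toy values; the point is that one output is emitted per timestep, which is what `return_sequences=True` gives you in Keras:

```python
import numpy as np

# Toy many-to-many vanilla RNN: one output per timestep.
rng = np.random.default_rng(0)
hidden, vocab, steps = 4, 6, 5

Wxh = rng.normal(size=(vocab, hidden)) * 0.1   # input -> hidden weights
Whh = rng.normal(size=(hidden, hidden)) * 0.1  # hidden -> hidden (recurrence)
Why = rng.normal(size=(hidden, vocab)) * 0.1   # hidden -> output weights

x = np.eye(vocab)[rng.integers(vocab, size=steps)]  # one-hot input characters
h = np.zeros(hidden)
outputs = []
for t in range(steps):
    h = np.tanh(x[t] @ Wxh + h @ Whh)  # h_t depends on h_{t-1}: memory
    outputs.append(h @ Why)            # emit an output at every timestep
outputs = np.stack(outputs)            # shape (steps, vocab)
```

A many-to-one RNN would instead keep only the final `h @ Why`; keeping all of them is what lets the model predict the next character at every position during training.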
Time-distributed fully connected layer
To process the output at every timestep, we add a time-distributed fully connected layer on top of the outputs generated at each timestep.
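"Time-distributed" simply means the same dense weights are applied at every timestep. A NumPy sketch with arbitrary toy shapes makes this concrete:

```python
import numpy as np

# TimeDistributed(Dense(...)) applies the *same* weight matrix to the hidden
# vector at every timestep; in NumPy that is one matmul over the time axis.
rng = np.random.default_rng(1)
batch, steps, hidden, vocab = 2, 5, 4, 6

lstm_out = rng.normal(size=(batch, steps, hidden))  # per-timestep hidden states
W = rng.normal(size=(hidden, vocab))                # shared dense weights
b = np.zeros(vocab)

logits = lstm_out @ W + b  # shape (batch, steps, vocab)

# Identical to applying the dense layer timestep by timestep:
per_step = np.stack([lstm_out[:, t] @ W + b for t in range(steps)], axis=1)
```

Sharing one weight matrix across timesteps keeps the parameter count independent of sequence length, which is why this layer fits naturally on top of `return_sequences=True` output.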
Statefulness
By setting the parameter "stateful" to true, the final state of one batch is passed as the initial state of the next batch. After combining all these features, our model is summarized in figure 6 below.
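What statefulness buys us can be shown with the same toy recurrence: processing a sequence in two halves while carrying the hidden state across gives exactly the same final state as processing it in one go. This is a NumPy sketch, not the Keras internals:

```python
import numpy as np

# stateful=True: the final hidden state of one batch becomes the initial
# hidden state of the next, instead of being reset to zero.
rng = np.random.default_rng(2)
hidden, vocab = 4, 6
Wxh = rng.normal(size=(vocab, hidden)) * 0.1
Whh = rng.normal(size=(hidden, hidden)) * 0.1

def run_batch(x_seq, h0):
    """Run one batch of the toy recurrence, returning the final hidden state."""
    h = h0
    for x in x_seq:
        h = np.tanh(x @ Wxh + h @ Whh)
    return h

seq = np.eye(vocab)[rng.integers(vocab, size=10)]  # one-hot toy sequence

# Stateful: the second half starts from the first half's final state ...
h_mid = run_batch(seq[:5], np.zeros(hidden))
h_stateful = run_batch(seq[5:], h_mid)
# ... which matches processing the whole sequence in a single pass.
h_full = run_batch(seq, np.zeros(hidden))
```

This is exactly what the generation loop later relies on: feeding one character at a time while the model remembers everything it has emitted so far.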
The code for the model architecture is as follows:
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dropout, TimeDistributed, Dense, Activation

model = Sequential()
model.add(Embedding(vocab_size, 512, batch_input_shape=(BATCH_SIZE, SEQ_LENGTH)))
for i in range(3):
    model.add(LSTM(256, return_sequences=True, stateful=True))
    model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(vocab_size)))
model.add(Activation('softmax'))
model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
I strongly recommend experimenting with these layers to improve performance.
Dropout layer
Dropout is a regularization technique: during training, a fraction of the input units is set to zero at each update, which helps prevent overfitting.
Softmax layer
The generation of music is a multi-class classification problem, and each class is a unique character in the input data. Therefore, we use a softmax layer on our model and take the classification cross-entropy as a loss function.
This layer gives the probability of each class. From the list of probabilities, we choose the one with the highest probability.
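The two pieces, softmax and categorical cross-entropy, are easy to write out directly. This is a generic NumPy sketch of the standard definitions, not Keras's internal implementation:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)  # shift for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def categorical_crossentropy(probs, target_idx):
    """Negative log-probability assigned to the true class."""
    return -np.log(probs[target_idx])

# Toy logits over a 3-character "vocabulary".
logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)                    # sums to 1
loss = categorical_crossentropy(probs, 0)  # small when class 0 is likely
```

The loss shrinks toward zero as the model puts more probability on the correct next character, which is exactly the training signal the character-level model needs.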
Optimizer
To optimize our model, we use adaptive moment estimation, better known as Adam, as it is a good choice for RNNs.
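A single Adam update can be sketched in a few lines. The hyperparameter defaults below follow the commonly published Adam values; the weights and gradient are toy numbers:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-7):
    """One Adam update: per-parameter learning rates from moment estimates."""
    m = b1 * m + (1 - b1) * grad        # 1st-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad ** 2   # 2nd-moment (variance) estimate
    m_hat = m / (1 - b1 ** t)           # bias correction for step t
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w = np.array([1.0, -1.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
grad = np.array([0.5, -0.5])
w_new, m, v = adam_step(w, grad, m, v, t=1)  # each weight moves against its gradient
```

The per-parameter scaling by the variance estimate is what makes Adam robust to the very different gradient magnitudes that recurrent networks produce across layers.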
Generate music
So far we have created an RNN model and trained it on our input data. During training, the model learns the patterns of the input data; we call this the "trained model".
The input size used in the trained model is the batch size. For generation, however, the input is a single character. So we create a new model that is identical to the trained model except that its input shape is a single character at a time. We then copy what the trained model has learned by loading its weights into the new model.
model2 = Sequential()
model2.add(Embedding(vocab_size, 512, batch_input_shape=(1, 1)))
for i in range(3):
    model2.add(LSTM(256, return_sequences=True, stateful=True))
    model2.add(Dropout(0.2))
model2.add(TimeDistributed(Dense(vocab_size)))
model2.add(Activation('softmax'))
We load the weights of the trained model into the new model. This can be done with a single line of code.
model2.load_weights(os.path.join(MODEL_DIR, 'weights.{}.h5'.format(epoch)))
model2.summary()
During generation, the first character is chosen at random from the set of unique characters; each subsequent character is then generated from the previously generated one, and so on. With this loop, we get music.
The following code snippet implements this:
sampled = []
for i in range(1024):
    batch = np.zeros((1, 1))
    if sampled:
        batch[0, 0] = sampled[-1]
    else:
        batch[0, 0] = np.random.randint(vocab_size)
    result = model2.predict_on_batch(batch).ravel()
    sample = np.random.choice(range(vocab_size), p=result)
    sampled.append(sample)
print(''.join(idx_to_char[c] for c in sampled))
We used an LSTM neural network to generate these pleasant music samples. Each piece is different from, yet similar to, the training data. Such melodies can be used in many ways:
To inspire and enhance artists' creativity
As a productivity tool for developing new ideas
As additional tunes within an artist's work
To finish unfinished pieces
As standalone pieces of music
However, this model can still be improved. Our training data contains only one instrument, the piano. One way to enhance it would be to add music from a variety of instruments; another would be to add more genres, tempos, and rhythmic variety.
At present the model produces some false notes, and the music is not always on point. We can reduce these errors and improve the quality of the music by enlarging the training dataset.