What is the difference between static RNN and dynamic RNN

2025-01-15 Update From: SLTechnology News&Howtos


Shulou (Shulou.com) 05/31

This article explains the difference between static RNN and dynamic RNN. The code uses the TensorFlow 1.x API (tf.placeholder, tf.contrib.rnn). Let's take a look.

1. Static RNN

The function static_rnn() creates an unrolled RNN network by chaining memory cells. The following code creates exactly the same RNN network as the one we built in the previous installment.

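import tensorflow as tf  # TensorFlow 1.x (tf.contrib was removed in 2.x)

n_inputs = 3   # size of the input at each time step
n_neurons = 5  # number of recurrent neurons in the cell

# One placeholder per time step
X0 = tf.placeholder(tf.float32, [None, n_inputs])
X1 = tf.placeholder(tf.float32, [None, n_inputs])

basic_cell = tf.contrib.rnn.BasicRNNCell(num_units=n_neurons)
output_seqs, states = tf.contrib.rnn.static_rnn(
    basic_cell, [X0, X1], dtype=tf.float32)
Y0, Y1 = output_seqs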

First, as before, we create two placeholders for the input data. Then we create a BasicRNNCell (think of it as a factory that produces copies of the memory cell) to build the unrolled RNN network. Next we call static_rnn(), passing it the cell, the list of input tensors, and the data type of the tensors. For each input, static_rnn() calls the cell's __call__() function, creating two copies of the cell, one per time step. Each copy contains a layer of five recurrent neurons, the copies share the same weights and biases, and they are chained together as before.

The static_rnn() function returns two objects. The first is a Python list containing the output tensor for each time step. The second is a tensor containing the final state of the network. With the most basic memory cell, the final state is simply equal to the last output.
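As a rough illustration of what the unrolled two-step network computes, the NumPy sketch below mimics a basic RNN cell by hand. The weight names (Wx, Wy, b), the random initialization, and the sample inputs are assumptions made for illustration; they are not TensorFlow's internal variables. The sketch shows both steps sharing the same weights and, for a basic cell, the final state being the last output.

```python
import numpy as np

n_inputs, n_neurons = 3, 5
rng = np.random.default_rng(0)

# One set of weights, shared by both time steps (illustrative values).
Wx = rng.normal(size=(n_inputs, n_neurons))   # input -> hidden
Wy = rng.normal(size=(n_neurons, n_neurons))  # previous output -> hidden
b = np.zeros((1, n_neurons))

def basic_cell(x, prev_state):
    """One step of a basic RNN cell: tanh(x.Wx + prev_state.Wy + b)."""
    return np.tanh(x @ Wx + prev_state @ Wy + b)

# Hypothetical mini-batch of four instances, one array per time step.
X0_batch = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 0, 1]], dtype=np.float32)
X1_batch = np.array([[9, 8, 7], [0, 0, 0], [6, 5, 4], [3, 2, 1]], dtype=np.float32)

initial_state = np.zeros((X0_batch.shape[0], n_neurons))
Y0_val = basic_cell(X0_batch, initial_state)  # first copy of the cell
Y1_val = basic_cell(X1_batch, Y0_val)         # second copy, same weights

output_seqs_val = [Y0_val, Y1_val]  # one output per time step
final_state = Y1_val                # basic cell: state == last output
```

Here output_seqs_val plays the role of the list returned by static_rnn(), and final_state plays the role of states.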

With 50 time steps, it would be tedious to define 50 input placeholders and 50 output tensors, and at execution time you would also have to feed 50 placeholders and handle 50 outputs. There is a simpler way, as follows:

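n_steps = 2  # number of time steps

# A single placeholder for the whole sequence: [batch, time, features]
X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
# Move the time axis to the front, then split into a list of n_steps tensors
X_seqs = tf.unstack(tf.transpose(X, perm=[1, 0, 2]))

basic_cell = tf.contrib.rnn.BasicRNNCell(num_units=n_neurons)
output_seqs, states = tf.contrib.rnn.static_rnn(
    basic_cell, X_seqs, dtype=tf.float32)
# Merge the per-step outputs and move the batch axis back to the front
outputs = tf.transpose(tf.stack(output_seqs), perm=[1, 0, 2])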

As above, we only need one placeholder, whose shape is [None, n_steps, n_inputs]. The first dimension, None, is the mini-batch size, the second is the number of time steps, and the third is the input size at each time step. X_seqs is a Python list of n_steps tensors, each of shape [None, n_inputs]. To get it, we first swap the first two dimensions with transpose(), so that the time step becomes the first dimension, and then split the tensor along that first dimension into a Python list with unstack(). The next two lines are the same as before and create the static RNN. Finally, we merge all the output tensors into a single tensor with stack() and swap its first two dimensions, which gives a tensor of shape [None, n_steps, n_neurons]. We can now run the network by feeding it a single tensor that contains the whole mini-batch of sequences, as follows:

import numpy as np

X_batch = np.array([
    # t = 0      t = 1
    [[0, 1, 2], [9, 8, 7]],  # instance 0
    [[3, 4, 5], [0, 0, 0]],  # instance 1
    [[6, 7, 8], [6, 5, 4]],  # instance 2
    [[9, 0, 1], [3, 2, 1]],  # instance 3
])

init = tf.global_variables_initializer()

with tf.Session() as sess:
    init.run()
    outputs_val = outputs.eval(feed_dict={X: X_batch})

When the run finishes, we get a single tensor (outputs_val) that contains the output of every neuron at every time step for every instance, as follows:

>>> print(outputs_val)
[[[-0.2964572   0.82874775 -0.34216955 -0.75720584  0.19011548]
  [ 0.51955646  1.          0.99999022 -0.99984968 -0.24616946]]

 [[-0.12842922  0.99981797  0.84704727 -0.99570125  0.38665548]
  [-0.70553327 -0.11918639  0.48885304  0.08917919 -0.26579669]]

 [[ 0.04731077  0.99999976  0.99330056 -0.999933    0.55339795]
  [-0.32477224  0.99996376  0.99933046 -0.99711186  0.10981458]]

 [[ 0.70323634  0.99309105  0.99909431 -0.85363263  0.7472108 ]
  [-0.43738723  0.91517633  0.97817528 -0.91763324  0.11047263]]]
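To make the shape manipulation in this section concrete, the sketch below mirrors the transpose / unstack / stack steps in NumPy on the same mini-batch. It is only an analogy under stated assumptions: np.transpose, list() and np.stack stand in for the TensorFlow operations, and the doubling step is a placeholder for the RNN, so the last dimension stays n_inputs here instead of becoming n_neurons.

```python
import numpy as np

X_batch = np.array([
    [[0, 1, 2], [9, 8, 7]],  # instance 0
    [[3, 4, 5], [0, 0, 0]],  # instance 1
    [[6, 7, 8], [6, 5, 4]],  # instance 2
    [[9, 0, 1], [3, 2, 1]],  # instance 3
])  # shape (4, 2, 3) = (batch, n_steps, n_inputs)

# Counterpart of tf.transpose(X, perm=[1, 0, 2]): time axis first
time_major = np.transpose(X_batch, (1, 0, 2))  # shape (2, 4, 3)

# Counterpart of tf.unstack(...): a Python list with one array per time step
X_seqs = list(time_major)  # 2 arrays of shape (4, 3)

# Placeholder for the per-step RNN outputs (assumption: just doubles the input)
output_seqs = [x * 2 for x in X_seqs]

# Counterpart of tf.stack(...) followed by the transpose back to batch-major
outputs = np.transpose(np.stack(output_seqs), (1, 0, 2))  # shape (4, 2, 3)
```

Each element of X_seqs holds one time step for the whole mini-batch, which is the form static_rnn() expects for its list of inputs.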

However, this approach still builds a graph that contains one cell per time step. With 50 time steps the graph looks ugly, a bit like writing a program without loops: Y0 = f(0, X0); Y1 = f(Y0, X1); Y2 = f(Y1, X2); ...; Y49 = f(Y48, X49). Such a large graph takes up a lot of memory, especially during backpropagation, which can be fatal on a memory-constrained GPU. Computing the gradients in the backward pass requires the tensor values from the forward pass, so all of them have to be kept in memory until then.
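A back-of-the-envelope calculation shows how the stored values grow with the number of time steps. The helper below is a hypothetical sketch: it counts only the cell outputs kept for backpropagation, assumes float32 values, and uses an arbitrary batch size of 128. It ignores intermediate tensors, weights, and framework overhead, so real memory use will be higher.

```python
def stored_output_bytes(n_steps, batch_size, n_neurons, bytes_per_value=4):
    """Bytes needed to keep every time step's cell output (float32 by default)."""
    return n_steps * batch_size * n_neurons * bytes_per_value

# The stored outputs grow linearly with the number of time steps.
for n_steps in (2, 50, 1000):
    size = stored_output_bytes(n_steps, batch_size=128, n_neurons=5)
    print(f"{n_steps:>5} steps -> {size} bytes")
```

The linear growth in n_steps is the point; the absolute numbers depend on the model and the framework.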

Fortunately, every problem has its remedy: dynamic RNN solves this one. So what is dynamic RNN?

2. Dynamic RNN

Dynamic RNN is provided by the function dynamic_rnn(), which uses a while_loop() operation internally to run the cell as many times as there are time steps. We can also set swap_memory=True to avoid running out of memory during backpropagation. This setting allows tensors to be swapped between GPU memory and CPU memory during backpropagation.
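The idea of running one cell inside a loop, instead of building one copy per time step, can be sketched in NumPy. This is an analogy only: it uses a plain Python for loop rather than TensorFlow's while_loop(), and the function name run_rnn, the weight names, and the random inputs are assumptions made for illustration.

```python
import numpy as np

def run_rnn(X, Wx, Wy, b):
    """Run a basic RNN over X of shape (batch, n_steps, n_inputs) with a loop."""
    batch_size, n_steps, _ = X.shape
    state = np.zeros((batch_size, Wy.shape[0]))
    outputs = []
    for t in range(n_steps):  # one loop body, reused for every time step
        state = np.tanh(X[:, t, :] @ Wx + state @ Wy + b)
        outputs.append(state)
    # Outputs stacked as (batch, n_steps, n_neurons), plus the final state
    return np.stack(outputs, axis=1), state

rng = np.random.default_rng(0)
n_inputs, n_neurons = 3, 5
Wx = rng.normal(size=(n_inputs, n_neurons))
Wy = rng.normal(size=(n_neurons, n_neurons))
b = np.zeros(n_neurons)

# The same function handles 2 or 50 time steps without any change.
short_out, _ = run_rnn(rng.normal(size=(4, 2, n_inputs)), Wx, Wy, b)
long_out, long_state = run_rnn(rng.normal(size=(4, 50, n_inputs)), Wx, Wy, b)
```

The loop body is written once, so the code does not grow with the number of time steps.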

Conveniently, dynamic_rnn() also accepts a single tensor of shape [None, n_steps, n_inputs] for all inputs at every time step, and it outputs a single tensor of shape [None, n_steps, n_neurons]. There is no need to shuffle data around with unstack(), stack(), or transpose() as before. The following code creates the same RNN as before using dynamic_rnn(). Much nicer!

# A single placeholder for the whole sequence: [batch, time, features]
X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])

basic_cell = tf.contrib.rnn.BasicRNNCell(num_units=n_neurons)
outputs, states = tf.nn.dynamic_rnn(basic_cell, X, dtype=tf.float32)

That is the difference between static RNN and dynamic RNN: static_rnn() builds one copy of the cell per time step in the graph, while dynamic_rnn() runs the cell inside a while_loop() and takes a single tensor for all time steps. I hope this article has been helpful.
