
Python artificial intelligence: how to build a recurrent neural network (RNN) with TensorFlow


This article introduces the basics of "how to build a recurrent neural network (RNN) with Python artificial intelligence and TensorFlow". Many people run into difficulties with this in real projects, so let the editor walk you through how to handle these situations. I hope you read it carefully and get something out of it!

Introduction to RNN

The RNN is one of the most popular neural network architectures, and it is particularly good at handling sequence data.

What is sequence data? Here is an example.

Suppose we have four words: "I", "go", "eat", and "rice". We can arrange and combine them in any order.

"I go eat rice" means that I am going to have a meal.

"Rice go eat I" flips the order, and the sentence now reads as if the meal were going to eat me.

"I go eat" still means that I am going to have a meal.

Different word orders lead to different meanings. Sequence data is data arranged in a particular order, and that order usually carries meaning.

So we know that an RNN needs some abstract notion of sequential memory, but how does it learn such a concept?

First, let's look at a traditional neural network, also known as a feedforward neural network. It has an input layer, a hidden layer, and an output layer.
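To make the contrast concrete, here is a minimal feedforward sketch in the same TensorFlow 1.x style used later in this article; the layer sizes (28 inputs, 64 hidden units, 10 outputs) are illustrative assumptions:

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 28])               # input layer: 28 features per sample
hidden = tf.layers.dense(x, 64, activation=tf.nn.relu)   # hidden layer
logits = tf.layers.dense(hidden, 10)                     # output layer: 10 classes

Information only flows forward here; nothing is carried over from one input to the next, which is exactly what the RNN structure below adds.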

For an RNN, the structure works as follows:

A sentence can be split into N parts. For example, "I go eat rice" can be split into the four words "I", "go", "eat", and "rice", which are fed into the hidden layer over four steps. At each step, the hidden layer passes part of its output on to the next step, weighted by a certain ratio. For example, after the first word "I" enters the hidden layer, an output scaled by the ratio W1 is fed into the hidden layer at the next step, so when the second word "go" arrives, the hidden layer also receives the information carried over from "I".

And so on: by the time the last word "rice" is reached, the final output has accumulated all of the previous information.
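To make this hand-off concrete, here is a minimal numpy sketch of one forward pass over those four words. The one-hot encoding, the tanh activation, and the weight matrices Wx and Wh (playing the role of the ratio W1 above) are illustrative assumptions, not values from the article:

import numpy as np

words = ["I", "go", "eat", "rice"]
vocab = {w: i for i, w in enumerate(words)}

hidden_size = 3
rng = np.random.default_rng(0)
Wx = rng.normal(scale=0.1, size=(hidden_size, len(vocab)))   # input -> hidden weights
Wh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden weights (the carried-over part)

h = np.zeros(hidden_size)                 # the hidden state starts at zero
for w in words:
    x = np.eye(len(vocab))[vocab[w]]      # one-hot vector for the current word
    h = np.tanh(Wx @ x + Wh @ h)          # new state mixes the new word with everything seen so far

print(h)  # after "rice", h carries information from all four words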

Its pseudo code is in the form of:

rnn = RNN()
ff = FeedForwardNN()
hidden_state = [0.0, 0.0, 0.0, 0.0]   # the hidden state starts out as zeros
for word in input:
    output, hidden_state = rnn(word, hidden_state)
prediction = ff(output)

TensorFlow RNN-related functions

tf.nn.rnn_cell.BasicRNNCell (and the analogous tf.nn.rnn_cell.BasicLSTMCell)

tf.nn.rnn_cell.BasicRNNCell(
    num_units,
    activation=None,
    reuse=None,
    name=None,
    dtype=None,
    **kwargs
)

num_units: the number of neurons in the RNN cell, i.e. the number of output units.

activation: the activation function.

reuse: whether to reuse variables in an existing scope. If it is not True and the existing scope already contains the given variables, an error is raised.

name: the name of the layer.

dtype: the data type of the layer.

**kwargs: keyword arguments for common layer attributes, such as trainable, used when the cell is created from get_config().

When in use, it can be defined as:

RNN_cell = tf.nn.rnn_cell.BasicRNNCell(n_hidden_units)
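The optional arguments listed above can also be passed explicitly. A sketch with illustrative values (these particular settings are assumptions, not from the article):

RNN_cell = tf.nn.rnn_cell.BasicRNNCell(
    num_units=128,           # number of output neurons
    activation=tf.nn.tanh,   # tanh is also the default activation
    name="basic_rnn_cell",
    dtype=tf.float32
)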

After the definition is complete, you can initialize the state:

_init_state = RNN_cell.zero_state(batch_size, tf.float32)

tf.nn.dynamic_rnn

tf.nn.dynamic_rnn(
    cell,
    inputs,
    sequence_length=None,
    initial_state=None,
    dtype=None,
    parallel_iterations=None,
    swap_memory=False,
    time_major=False,
    scope=None
)

cell: the RNN cell (or LSTM cell) defined above.

inputs: the RNN input. If time_major == False (the default), it must be a tensor of shape [batch_size, max_time, ...], or a nested tuple of such elements. If time_major == True, it must be a tensor of shape [max_time, batch_size, ...], or a nested tuple of such elements.

sequence_length: an int32/int64 vector of size [batch_size]. It is used to copy the state through and zero out the outputs once the sequence length of a batch element is exceeded, so it matters more for performance than for correctness.

initial_state: the _init_state defined above.

dtype: the data type.

parallel_iterations: the number of iterations to run in parallel. Operations without any temporal dependency that can run in parallel will do so. This parameter trades memory for time: values >> 1 use more memory but take less time, while smaller values use less memory but take longer to compute.

time_major: the shape format of the input and output tensors. If True, these tensors must have shape [max_time, batch_size, depth]; if False, they must have shape [batch_size, max_time, depth]. Using time_major=True is somewhat more efficient because it avoids transposes at the beginning and end of the RNN computation. However, most TensorFlow data is batch-major, so the default is False.

scope: the variable scope for the created subgraph; defaults to "rnn".

At the end of the RNN, you need to use this function to get the result.

outputs, states = tf.nn.dynamic_rnn(RNN_cell, X_in, initial_state=_init_state, time_major=False)

A tuple (outputs, states) is returned:

outputs: the output of the RNN's last layer, as a tensor. If time_major == False, its shape is [batch_size, max_time, cell.output_size]; if time_major == True, its shape is [max_time, batch_size, cell.output_size].

states: the final state, i.e. the state output by the last cell in the sequence, as a tensor. In general, states has shape [batch_size, cell.output_size], but when the cell is a BasicLSTMCell, states has shape [2, batch_size, cell.output_size], where the 2 corresponds to the LSTM's cell state and hidden state.
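As a quick sanity check of these shapes, here is a minimal sketch with a BasicRNNCell and dummy dimensions (the sizes 4, 10, 8 and 16 are assumptions chosen only for illustration):

import tensorflow as tf

batch_size, max_time, depth, num_units = 4, 10, 8, 16
x = tf.placeholder(tf.float32, [batch_size, max_time, depth])

cell = tf.nn.rnn_cell.BasicRNNCell(num_units)
init_state = cell.zero_state(batch_size, tf.float32)
outputs, states = tf.nn.dynamic_rnn(cell, x, initial_state=init_state, time_major=False)

print(outputs.shape)   # (4, 10, 16) -> [batch_size, max_time, cell.output_size]
print(states.shape)    # (4, 16)     -> [batch_size, cell.output_size]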

The whole RNN definition process is as follows:

def RNN(X, weights, biases):
    # X arrives with shape (128 batch, 28 steps, 28 inputs)
    # reshape to (128 batch * 28 steps, 28 inputs)
    X = tf.reshape(X, [-1, n_inputs])
    # after the multiplication the shape is (128 batch * 28 steps, 128 hidden)
    X_in = tf.matmul(X, weights['in']) + biases['in']
    # reshape again to (128 batch, 28 steps, 128 hidden)
    X_in = tf.reshape(X_in, [-1, n_steps, n_hidden_units])

    RNN_cell = tf.nn.rnn_cell.BasicRNNCell(n_hidden_units)
    _init_state = RNN_cell.zero_state(batch_size, tf.float32)
    outputs, states = tf.nn.dynamic_rnn(RNN_cell, X_in, initial_state=_init_state, time_major=False)

    results = tf.matmul(states, weights['out']) + biases['out']
    return results
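Here states is fed directly into the output layer. Because a BasicRNNCell's per-step output is identical to its new hidden state, an equivalent variant (a sketch, not part of the original code) could take the last time step of outputs instead:

# For BasicRNNCell the final state equals the last step's output (with time_major=False)
last_output = outputs[:, -1, :]    # shape: [batch_size, n_hidden_units]
results = tf.matmul(last_output, weights['out']) + biases['out']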

Full code

This example performs handwritten-digit recognition: each of the 28 rows of an MNIST image is fed in as one time step, and each step's input has 28 dimensions (one per column).

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data", one_hot=True)

lr = 0.001                 # learning rate
training_iters = 1000000   # number of training iterations
batch_size = 128           # samples per training batch

n_inputs = 28              # input dimension at each step (28 columns)
n_steps = 28               # 28 time steps (28 rows)
n_hidden_units = 128       # number of neurons in the hidden layer
n_classes = 10             # 10 output classes

x = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
y = tf.placeholder(tf.float32, [None, n_classes])

weights = {
    'in': tf.Variable(tf.random_normal([n_inputs, n_hidden_units])),
    'out': tf.Variable(tf.random_normal([n_hidden_units, n_classes]))
}
biases = {
    'in': tf.Variable(tf.constant(0.1, shape=[n_hidden_units])),
    'out': tf.Variable(tf.constant(0.1, shape=[n_classes]))
}

def RNN(X, weights, biases):
    # X arrives with shape (128 batch, 28 steps, 28 inputs)
    # reshape to (128 batch * 28 steps, 28 inputs)
    X = tf.reshape(X, [-1, n_inputs])
    # after the multiplication the shape is (128 batch * 28 steps, 128 hidden)
    X_in = tf.matmul(X, weights['in']) + biases['in']
    # reshape again to (128 batch, 28 steps, 128 hidden)
    X_in = tf.reshape(X_in, [-1, n_steps, n_hidden_units])

    RNN_cell = tf.nn.rnn_cell.BasicRNNCell(n_hidden_units)
    _init_state = RNN_cell.zero_state(batch_size, tf.float32)
    outputs, states = tf.nn.dynamic_rnn(RNN_cell, X_in, initial_state=_init_state, time_major=False)

    results = tf.matmul(states, weights['out']) + biases['out']
    return results

pre = RNN(x, weights, biases)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pre, labels=y))
train_op = tf.train.AdamOptimizer(lr).minimize(cost)

correct_pre = tf.equal(tf.argmax(y, 1), tf.argmax(pre, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pre, tf.float32))

init = tf.initialize_all_variables()

with tf.Session() as sess:
    sess.run(init)
    step = 0
    # training loop (the source is truncated here; the body below assumes the standard MNIST training pattern)
    while step * batch_size < training_iters:
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        batch_xs = batch_xs.reshape([batch_size, n_steps, n_inputs])
        sess.run(train_op, feed_dict={x: batch_xs, y: batch_ys})
        if step % 20 == 0:
            print(sess.run(accuracy, feed_dict={x: batch_xs, y: batch_ys}))
        step += 1
