2025-02-23 Update From: SLTechnology News & Howtos
Shulou(Shulou.com)06/01 Report--
Many newcomers are unclear about what TensorFlow is. To help with that, this article explains it in detail; anyone who needs it can follow along, and hopefully you will come away having gained something.
I. What is TensorFlow?
TensorFlow™ is an open-source software library that performs numerical computation using data flow graphs. The nodes of the graph represent mathematical operations, while the edges represent the multidimensional data arrays, i.e. tensors, that flow between them. Its flexible architecture lets you deploy computation on a variety of platforms, such as one or more CPUs or GPUs in desktop computers, servers, or mobile devices. TensorFlow was originally developed by researchers and engineers on the Google Brain team (part of Google's Machine Intelligence Research organization) for machine learning and deep neural network research, but the system is general enough to be widely applicable in other computing fields as well.
1. What is a data flow graph?
A data flow graph describes a mathematical computation as a directed graph of "nodes" and "edges". A node usually represents a mathematical operation, but it can also represent the point where data is fed in, the point where results are pushed out, or an endpoint for reading/writing a persistent variable. An edge represents the input/output relationship between nodes; these edges carry dynamically sized multidimensional data arrays, i.e. "tensors". The image of tensors flowing through the graph is the reason the tool is called "TensorFlow". Once all of a node's input tensors are ready, the node is assigned to a computing device and executes asynchronously and in parallel with other ready nodes.
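To make the idea concrete, here is a minimal sketch of a data flow graph in plain Python (no TensorFlow required, and all names here are illustrative, not TensorFlow's API): nodes hold operations, edges carry values, and a node produces its result once all of its inputs are available.

```python
# A minimal dataflow-graph sketch: nodes are operations, edges carry
# values ("tensors"), and a node runs once all its inputs are ready.
class Node:
    def __init__(self, op, *inputs):
        self.op = op          # function applied at this node
        self.inputs = inputs  # upstream nodes (the incoming edges)

    def evaluate(self):
        # Recursively evaluate upstream nodes, then apply this node's op.
        return self.op(*(n.evaluate() for n in self.inputs))

def constant(value):
    # A source node: it feeds a fixed value into the graph.
    return Node(lambda: value)

# Build the graph for (a + b) * c, then run it.
a, b, c = constant(2.0), constant(3.0), constant(4.0)
add = Node(lambda x, y: x + y, a, b)
mul = Node(lambda x, y: x * y, add, c)
print(mul.evaluate())  # → 20.0
```

Note that building the graph and running it are separate steps, which mirrors how TensorFlow's graph execution model separates graph construction from evaluation.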
Summary: a data flow graph is a weighted directed graph (my own understanding).
2. Characteristics of TensorFlow
High degree of flexibility
TensorFlow is not a strict "neural network" library. As long as you can express your computation as a data flow graph, you can use TensorFlow. You construct the graph and describe the inner loops that drive the computation. TensorFlow provides useful tools to help you assemble the "subgraphs" commonly used in neural networks, and users can also write their own higher-level libraries on top of TensorFlow. Defining a new composite operation is as easy as writing a Python function, with no performance penalty to worry about. And if you cannot find the low-level operation you need, you can write a little C++ code to extend the set of low-level ops.
True portability
TensorFlow runs on CPUs and GPUs, on desktops, servers, mobile devices, and so on. Want to try out a new machine learning idea on your laptop without any special hardware? TensorFlow can do that. Want to scale your training model across multiple CPUs without modifying the code? TensorFlow can do that. Want to ship your trained model in a mobile app as part of a product? TensorFlow can do that. Changed your mind and want to run the model on your own servers as a cloud service, or in a Docker container? TensorFlow can do that too. TensorFlow really is that versatile.
Link scientific research with products
In the past, applying machine learning ideas from research to products required a great deal of code rewriting. Those days are gone. At Google, research scientists use TensorFlow to experiment with new algorithms, and product teams use TensorFlow to train models and serve them directly to online users. TensorFlow lets applied researchers get ideas into products quickly, and it lets academic researchers share code with each other more directly, increasing the productivity of research.
Automatic differentiation
Gradient-based machine learning algorithms benefit from TensorFlow's automatic differentiation. As a TensorFlow user, you only need to define the structure of the predictive model, combine that structure with an objective function, and feed in data; TensorFlow automatically computes the relevant derivatives for you. Computing the derivative of one value with respect to others is done by extending your graph with additional nodes, so you can always see exactly what is going on.
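To illustrate the idea of automatic differentiation, here is a small forward-mode sketch using dual numbers, in plain Python. This is only an illustration of the principle; TensorFlow itself uses reverse-mode differentiation over the graph, and the class and function names below are made up for this example.

```python
# Forward-mode automatic differentiation with dual numbers: each value
# carries a pair (x, dx), and arithmetic propagates derivatives by the
# chain rule alongside the ordinary computation.
class Dual:
    def __init__(self, x, dx=0.0):
        self.x, self.dx = x, dx

    def __add__(self, other):
        # Sum rule: d(u + v) = du + dv
        return Dual(self.x + other.x, self.dx + other.dx)

    def __mul__(self, other):
        # Product rule: d(uv) = u'v + uv'
        return Dual(self.x * other.x, self.dx * other.x + self.x * other.dx)

def derivative(f, x):
    # Seed dx = 1 at the input to obtain df/dx at the point x.
    return f(Dual(x, 1.0)).dx

# f(x) = x*x + x  =>  f'(x) = 2x + 1, so f'(3) = 7
f = lambda x: x * x + x
print(derivative(f, 3.0))  # → 7.0
```

The key point matches the text above: the derivative computation is just more operations added alongside the original ones, not a separate symbolic or numerical procedure.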
Multilingual support
TensorFlow has a reasonable C++ interface and an easy-to-use Python interface for building and executing your graphs. You can write Python or C++ programs directly, or use the interactive IPython interface to try out ideas with TensorFlow; it can help you keep your notes, code, visualizations, and so on organized. Of course, this is just a starting point: we hope to encourage contributions of interfaces in your favorite language, such as Go, Java, Lua, JavaScript, or R.
Performance optimization
For example, suppose you have a workstation with 32 CPU cores and 4 GPUs and want to realize its full computing potential. Because TensorFlow provides first-class support for threads, queues, asynchronous operations, and the like, it lets you exploit all the computing power of the hardware at hand. You are free to assign the compute elements of a TensorFlow graph to different devices, and TensorFlow helps you manage the copies of data between them.
II. The relationship between CNN, RNN, and LSTM
Convolutional neural network (Convolutional Neural Network, CNN):
A convolutional neural network (CNN) is a kind of feedforward neural network that mitigates the overfitting and local-minimum problems seen in plain deep networks (DNN) and neural networks (NN). A CNN exploits local information in an image, making use of the inherent local patterns it contains (for example, the parts of a human body). Neurons in adjacent layers of a CNN are not all directly connected to one another; instead, they are connected through a "convolution kernel" acting as an intermediary. The same kernel's weights are shared across the whole image, and after the convolution operation the image still retains its original spatial relationships.
The basic structure of a CNN consists of two kinds of layers. The first is the feature extraction layer: each neuron's input is connected to a local receptive field of the previous layer, from which it extracts a local feature; once a local feature is extracted, its positional relationship to other features is also fixed. The second is the feature mapping layer: each computational layer of the network consists of multiple feature maps, each feature map is a plane, and all neurons in a plane share equal weights. The feature mapping structure uses a sigmoid activation function with a small influence-function kernel, which gives the feature maps shift invariance. In addition, because the neurons in one map share weights, the number of free parameters in the network is reduced. Each convolutional layer is followed by a computational layer for local averaging and secondary extraction; this distinctive two-stage feature extraction structure reduces the feature resolution.
CNNs are mainly used to recognize two-dimensional patterns that are invariant to displacement, scaling, and other forms of distortion; much of this invariance is provided by the pooling layers. Because the feature detection layers of a CNN learn from training data, explicit feature extraction is avoided when using a CNN: features are learned implicitly from the data. Moreover, because neurons on the same feature map share the same weights, the network can learn in parallel, a major advantage of convolutional networks over networks whose neurons are all connected to one another. With its special structure of local weight sharing, a CNN's layout is closer to that of an actual biological neural network, weight sharing reduces the network's complexity, and the fact that a multidimensional image can be fed directly into the network avoids the complexity of data reconstruction otherwise needed during feature extraction and classification. These properties give CNNs unique advantages in speech recognition and image processing.
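The weight sharing described above can be sketched in a few lines of plain Python: a single small kernel slides over the whole image, so every output position reuses the same parameters. (The image and kernel below are toy values for illustration; deep-learning libraries call this operation "convolution", though strictly speaking it is cross-correlation.)

```python
# Weight sharing in a convolutional layer: one small kernel slides over
# the whole image, so every output position reuses the same weights.
def conv2d(image, kernel):
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            # The same kernel weights are applied at every (i, j):
            # this is the weight sharing that cuts the parameter count.
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
edge_kernel = [[1, -1]]  # a horizontal difference filter
print(conv2d(image, edge_kernel))  # → [[-1, -1], [-1, -1], [-1, -1]]
```

However large the image, this layer has only the kernel's two parameters, which is exactly the reduction in free parameters the text describes.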
Recurrent neural network (Recurrent Neural Network, RNN):
A recurrent neural network (RNN) is an artificial neural network whose nodes are connected in directed cycles, so the network's internal state can exhibit dynamic temporal behavior. Unlike a feedforward neural network (such as a CNN), an RNN can use its internal memory to process input sequences of arbitrary length, which makes tasks such as unsegmented handwriting recognition and speech recognition easier to handle. In an RNN, a neuron's output can act directly on itself at the next time step: the input to a layer-l neuron at time t includes not only the output of the layer l-1 neurons at time t, but also the output of the layer-l neuron itself at time t-1. As shown below:
This structure makes recurrent neural networks well suited to data whose samples depend on what came before. Because of this chain structure, recurrent networks are closely tied to sequences and lists, so an RNN is a natural fit for time-based sequences such as a stretch of continuous speech or continuous handwritten text. Take a language model as an example: given the first t characters of a sentence, predict character t+1. Suppose the sentence is "你好世界" ("Hello, world"). With a feedforward network, you would feed "你" ("you") at time 1 and predict "好" ("good"); at time 2 you would feed "好" into the same network and predict the next character. The whole process is shown in the following figure:
In general, character t+1 can be predicted from the preceding n characters; here n = 1. You can increase n so that the input carries more information. However, n cannot be increased arbitrarily, because doing so usually increases the model's complexity, and therefore the amount of data and computation needed to train it.
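The recurrence itself can be sketched as a single scalar Elman-style RNN cell in plain Python: the hidden state h carries information forward in time, so the state at step t depends on the whole prefix of the sequence, not just the current input. The weights below are illustrative toy values, not trained parameters.

```python
import math

# A scalar RNN cell: h(t) = tanh(w_x * x(t) + w_h * h(t-1) + b).
# The hidden state h carries information forward across time steps.
w_x, w_h, b = 0.5, 0.8, 0.1  # illustrative weights, not trained

def rnn_step(x_t, h_prev):
    # The new hidden state mixes the current input with the previous state.
    return math.tanh(w_x * x_t + w_h * h_prev + b)

h = 0.0  # initial hidden state
for x_t in [1.0, 0.5, -0.3]:  # a short toy input sequence
    h = rnn_step(x_t, h)
    print(f"x={x_t:+.1f} -> h={h:+.4f}")
```

Because h is fed back in at every step, the final state depends on every input seen so far, which is exactly the "internal memory" property the text describes.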
LSTM (Long Short-Term Memory network):
LSTM (Long Short-Term Memory) is a long short-term memory network, a recurrent neural network suited to processing and predicting important events separated by relatively long intervals and delays in a time series. It is a special kind of RNN that can learn long-term dependencies, and it solves the vanishing-gradient problem that plain RNNs suffer from when sequences are too long. The RNN is viewed as a neural network unrolled through time, with connections added between the nodes of the hidden layer. (LSTM is what we use for CAPTCHA recognition this time.)
LSTM adds a "processor" to the algorithm to determine whether the information is useful or not, and the structure of this processor is called cell.
Each cell contains three gates, called the input gate, the forget gate, and the output gate. As information enters the LSTM network, the cell judges whether it is useful according to learned rules. Only information that passes this gating is kept; the rest is discarded through the forget gate.
In a standard RNN, the repeating module has a simple structure, such as a single tanh layer, as shown in the following figure:
h(t) is used both to compute the model's loss at the current step and to compute h(t+1) at the next step.
The structure of an LSTM is much more complex than that of a plain RNN, as shown in the following figure:
The key to LSTM is the cell state, the horizontal line that runs across the top of the diagram:
LSTM removes information from, or adds information to, the cell state through carefully designed structures called gates. An LSTM has three kinds of gates.
1. Forget gate
As the name suggests, the forget gate decides which information is discarded from the cell state. Based on h(t-1) and x(t), the forget gate outputs, for each component of the state C(t-1), a number between 0 and 1, where 0 means "discard completely" and 1 means "keep completely". The mathematical expression is: f(t) = σ(W_f · [h(t-1), x(t)] + b_f)
2. Input gate
The input gate consists of two parts. The first is a sigmoid layer whose output i(t) decides which values to update; the second is a tanh layer whose output is the candidate state ~C(t). The product of i(t) and ~C(t) is used to update the cell state. The mathematical expressions are: i(t) = σ(W_i · [h(t-1), x(t)] + b_i) and ~C(t) = tanh(W_C · [h(t-1), x(t)] + b_C)
3. Output gate
Using the forget gate and the input gate, the cell state is updated to: C(t) = f(t) * C(t-1) + i(t) * ~C(t)
Finally, we decide what to output. The output is based on the cell state above, but filtered. The output gate is shown in the following figure:
First, a sigmoid layer determines which parts of the cell state to output. Then the cell state is passed through a tanh layer, and the result is multiplied by the sigmoid layer's output. The mathematical formulas are: o(t) = σ(W_o · [h(t-1), x(t)] + b_o) and h(t) = o(t) * tanh(C(t))
Forward propagation algorithm of LSTM:
(1) Update the forget gate output: f(t) = σ(W_f · [h(t-1), x(t)] + b_f)
(2) Update the input gate output: i(t) = σ(W_i · [h(t-1), x(t)] + b_i), ~C(t) = tanh(W_C · [h(t-1), x(t)] + b_C)
(3) Update the cell state: C(t) = f(t) * C(t-1) + i(t) * ~C(t)
(4) Update the output gate output: o(t) = σ(W_o · [h(t-1), x(t)] + b_o), h(t) = o(t) * tanh(C(t))
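The four forward-propagation steps above can be sketched as a single scalar LSTM step in plain Python. The weights here are toy values chosen only so the code runs; a real implementation would use vectors, matrices, and trained parameters.

```python
import math

# One scalar LSTM step, following the four updates in the text:
# forget gate f, input gate i with candidate ~C, cell state C, output h.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, w):
    f = sigmoid(w["wf_x"] * x_t + w["wf_h"] * h_prev + w["bf"])         # (1) forget gate
    i = sigmoid(w["wi_x"] * x_t + w["wi_h"] * h_prev + w["bi"])         # (2) input gate
    c_tilde = math.tanh(w["wc_x"] * x_t + w["wc_h"] * h_prev + w["bc"]) # (2) candidate ~C
    c = f * c_prev + i * c_tilde                                        # (3) cell state
    o = sigmoid(w["wo_x"] * x_t + w["wo_h"] * h_prev + w["bo"])         # (4) output gate
    h = o * math.tanh(c)                                                # (4) hidden state
    return h, c

# Toy weights: every gate gets the same small value, purely for brevity.
weights = {k: 0.5 for k in
           ["wf_x", "wf_h", "bf", "wi_x", "wi_h", "bi",
            "wc_x", "wc_h", "bc", "wo_x", "wo_h", "bo"]}

h, c = 0.0, 0.0
for x_t in [1.0, -1.0, 0.5]:  # a short toy input sequence
    h, c = lstm_step(x_t, h, c, weights)
print(f"final h={h:.4f}, c={c:.4f}")
```

Note how the cell state c is only ever scaled by the forget gate and added to, rather than squashed through an activation at every step; this additive path is what lets gradients survive over long sequences.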
Was the content above helpful to you? If you would like to learn more or read more related articles, please follow the industry information channel. Thank you for your support.
© 2024 shulou.com SLNews company. All rights reserved.