What I'd like to share with you in this article is the theoretical basis of neural networks and a Python implementation. I find it quite practical, so I am sharing it with everyone in the hope that you will gain something from reading it. Without further ado, let's get into it.
I. Multilayer feedforward neural networks
A multilayer feedforward neural network consists of three parts: an input layer, hidden layers, and an output layer, each composed of units;
The input layer receives the feature vectors of the training instances and passes values on through the weighted connections between nodes; the output of one layer is the input of the next;
The input layer is not counted: if the hidden and output layers together number n, the network is called an n-layer neural network (for example, one hidden layer plus one output layer makes a two-layer neural network);
In theory, with enough hidden layers and a large enough training set, any function can be approximated.
II. Designing the neural network structure
Before using a neural network, we must determine the number of layers and the number of units in each layer.
To speed up the learning process, feature vectors usually need to be normalized to values between 0 and 1 before being passed into the input layer;
A discrete variable can be encoded by assigning one input unit to each of its possible values (see the sketch after this list).
For example, feature A may take three values (a0, a1, a2), so three input units can be used to represent A:
If A = a0, the unit representing a0 takes the value 1 and the rest take 0;
If A = a1, the unit representing a1 takes the value 1 and the rest take 0;
If A = a2, the unit representing a2 takes the value 1 and the rest take 0;
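A minimal sketch of both preprocessing steps, min-max scaling to [0, 1] and the one-unit-per-value encoding of A, using made-up values:

import numpy as np

# scale each feature column to [0, 1] (assumes max > min in every column)
X = np.array([[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]])
X_scaled = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# encode the discrete feature A with possible values (a0, a1, a2)
values = ['a0', 'a1', 'a2']
A = 'a1'                          # observed value
encoding = np.zeros(len(values))
encoding[values.index(A)] = 1     # the unit representing a1 takes 1, the rest take 0
print(X_scaled)
print(encoding)                   # [0. 1. 0.]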
Neural networks can solve both classification and regression problems. For a classification problem with two classes, a single output unit can be used (0 and 1 representing the two classes); with more than two classes, each class is represented by one output unit, so the number of units in the output layer usually equals the number of classes.
There are no explicit rules for choosing the best number of hidden layers; it is generally tuned through experiments, based on the resulting test error and accuracy.
III. Cross-validation method
How do we measure accuracy? The simplest method is to split the data into a training set and a test set: train on the training set to obtain a model, feed the test set into the model to obtain predictions, and compare the predictions with the true labels of the test set to obtain an accuracy rate.
A more common approach in machine learning is cross-validation. Instead of dividing the data into 2 parts, it might be divided into 10 parts:
Round 1: the 1st part is used as the test set and the remaining 9 as the training set;
Round 2: the 2nd part is used as the test set and the remaining 9 as the training set;
……
After 10 rounds of training, 10 accuracy values are obtained, and averaging them gives the mean accuracy. Here 10 is just an example: in general the data is divided into k parts and the method is called k-fold cross-validation; one of the k parts is chosen as the test set, the remaining k-1 parts serve as the training set, and this is repeated k times to obtain the average accuracy. It is a more scientific and reliable way to evaluate a model; a sketch follows.
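A minimal numpy sketch of k-fold cross-validation, assuming X and y are numpy arrays and that train_fn and accuracy_fn are user-supplied (hypothetical) training and evaluation functions:

import numpy as np

def k_fold_accuracy(X, y, train_fn, accuracy_fn, k=10):
    # shuffle the indices, then split them into k roughly equal parts
    indices = np.arange(len(X))
    np.random.shuffle(indices)
    folds = np.array_split(indices, k)
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = train_fn(X[train_idx], y[train_idx])      # train on k-1 parts
        scores.append(accuracy_fn(model, X[test_idx], y[test_idx]))
    return np.mean(scores)                                # average accuracy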
IV. BP algorithm
Iterate over the instances in the training set;
Compare the value predicted by the network with the true value;
Update the weight of each connection in the reverse direction (output layer => hidden layer => input layer) to minimize the error;
4.1 Algorithm details
Input: a dataset, a learning rate, and a multilayer neural network architecture;
Output: a trained neural network;
Initialize the weights and biases: randomly initialize them between -1 and 1 (or some other small range), with one bias per unit. Then, for each training instance X, perform the following steps:
1. Propagate forward from the input layer:
Working layer by layer through the network, from the input layer to the hidden layer and then from the hidden layer to the output layer, the same computation applies. Summarizing both steps in a single formula:

Ij = Σi (wij · Oi) + θj

Ij is the net input of unit j in the current layer, Oi is the output of unit i in the previous layer, wij is the weight on the connection between the two units, and θj is the bias of unit j. We then perform a nonlinear transformation on each unit's net input:

Oj = f(Ij)

f is the nonlinear transformation function, also known as the activation function; using the logistic function (defined in part 4 below):

f(x) = 1 / (1 + e^(-x))

the output of each unit is therefore:

Oj = 1 / (1 + e^(-Ij))

This allows the output value of each layer to be computed forward from the input values.
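A minimal numpy sketch of one forward step, with made-up weights and biases:

import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward_step(O_prev, W, theta):
    # one layer of forward propagation: Ij = Σi wij·Oi + θj, then Oj = f(Ij)
    I = np.dot(O_prev, W) + theta
    return logistic(I)

# example: 2 input units feeding 2 hidden units
O0 = np.array([0.0, 1.0])                  # input-layer values
W1 = np.array([[0.2, -0.3], [0.4, 0.1]])   # hypothetical weights
theta1 = np.array([-0.4, 0.2])             # hypothetical biases
print(forward_step(O0, W1, theta1))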
2. Propagate the error backward.

For the output layer, the error of unit k is:

Errk = Ok · (1 - Ok) · (Tk - Ok)

where Tk is the true value, Ok is the predicted value, and Ok·(1 - Ok) is the derivative of the logistic function.

For a hidden layer, the error of unit j is computed from the errors of the units k in the following layer:

Errj = Oj · (1 - Oj) · Σk (Errk · wjk)

Weight update, where l is the learning rate:

Δwij = l · Errj · Oi
wij = wij + Δwij

Bias update:

Δθj = l · Errj
θj = θj + Δθj
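A minimal sketch of these update rules for a single output unit, with made-up values (the full implementation is in section V):

import numpy as np

l = 0.1                            # learning rate

O_prev = np.array([0.6, 0.8])      # outputs of the previous layer (hypothetical)
W = np.array([[0.3], [-0.2]])      # weights into the output unit (hypothetical)
theta = np.array([0.1])            # bias of the output unit (hypothetical)

O_k = 1.0 / (1.0 + np.exp(-(O_prev.dot(W) + theta)))   # forward pass
T_k = np.array([1.0])              # true value

Err_k = O_k * (1 - O_k) * (T_k - O_k)    # Errk = Ok(1-Ok)(Tk-Ok)
W += l * np.outer(O_prev, Err_k)         # Δwij = l · Errj · Oi
theta += l * Err_k                       # Δθj = l · Errj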
3. Termination conditions

Training stops when one of the following is met (an illustrative loop follows):

the weight updates fall below a certain threshold;
the prediction error rate falls below a certain threshold;
a preset number of training cycles is reached.
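An illustrative loop combining the three stopping conditions (threshold values are made up; the real training step would go inside the loop):

import numpy as np

max_epochs = 10000          # preset number of cycles
error_threshold = 1e-3      # stop when the prediction error is small enough
update_threshold = 1e-6     # stop when weight updates become negligible

epoch, error, max_update = 0, np.inf, np.inf
while epoch < max_epochs and error > error_threshold and max_update > update_threshold:
    # ... one pass of forward propagation and backward weight updates goes here,
    # recomputing error and max_update ...
    epoch += 1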
4. Nonlinear transformation function
For the nonlinear transformation function f mentioned above, two functions are commonly used:
(1)tanh(x) function:
tanh(x)=sinh(x)/cosh(x)
sinh(x)=(exp(x)-exp(-x))/2
cosh(x)=(exp(x)+exp(-x))/2
(2) Logistic function, which is the function used in the derivation above:
f(x) = 1 / (1 + exp(-x))
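Both functions have derivatives that can be written in terms of the function value itself, which is what makes them convenient for backpropagation:

tanh'(x) = 1 - tanh(x)^2
f'(x) = f(x) · (1 - f(x))   (logistic)

The implementation below relies on exactly these forms.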
V. Python implementation of BP neural network
First, import the numpy module:
import numpy as np
Define the nonlinear transformation functions; since the algorithm also needs their derivatives, these are defined alongside. Note that the derivative functions take the layer's already-activated output (not the raw net input) as their argument, because that is what the fit method below passes to them:
def tanh(x):
    return np.tanh(x)

def tanh_deriv(y):
    # derivative of tanh, expressed in terms of the output y = tanh(x)
    return 1.0 - y ** 2

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def logistic_derivative(y):
    # derivative of the logistic function, in terms of the output y = logistic(x)
    return y * (1.0 - y)
The structure of the BP network (how many layers, how many units per layer) is set up in an object-oriented way; the constructor mainly chooses which nonlinear function to use and initializes the weights. layers is a list containing the number of units in each layer.
class NeuralNetwork:
    def __init__(self, layers, activation='tanh'):
        """
        :param layers: A list containing the number of units in each layer.
                       Should contain at least two values.
        :param activation: The activation function to be used.
                           Can be "logistic" or "tanh".
        """
        if activation == 'logistic':
            self.activation = logistic
            self.activation_deriv = logistic_derivative
        elif activation == 'tanh':
            self.activation = tanh
            self.activation_deriv = tanh_deriv

        # random weights in (-0.25, 0.25); the +1 makes room for the bias entry
        self.weights = []
        for i in range(1, len(layers) - 1):
            self.weights.append((2 * np.random.random((layers[i - 1] + 1, layers[i] + 1)) - 1) * 0.25)
            self.weights.append((2 * np.random.random((layers[i] + 1, layers[i + 1])) - 1) * 0.25)
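For example, NeuralNetwork([2, 2, 1]) creates two weight matrices, of shapes (3, 3) and (3, 1); the extra row corresponds to the bias entry appended to each input vector below.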
Implementing the training algorithm (the fit method):
    def fit(self, X, y, learning_rate=0.2, epochs=10000):
        X = np.atleast_2d(X)
        # append a column of ones to X so the bias is folded into the weights
        temp = np.ones([X.shape[0], X.shape[1] + 1])
        temp[:, 0:-1] = X
        X = temp
        y = np.array(y)

        for k in range(epochs):
            # pick one training instance at random (stochastic updates)
            i = np.random.randint(X.shape[0])
            a = [X[i]]

            # forward pass: store the activations of every layer
            for l in range(len(self.weights)):
                a.append(self.activation(np.dot(a[l], self.weights[l])))

            # backward pass: output-layer delta first, then hidden-layer deltas
            error = y[i] - a[-1]
            deltas = [error * self.activation_deriv(a[-1])]
            for l in range(len(a) - 2, 0, -1):
                deltas.append(deltas[-1].dot(self.weights[l].T) * self.activation_deriv(a[l]))
            deltas.reverse()

            # update every weight matrix
            for i in range(len(self.weights)):
                layer = np.atleast_2d(a[i])
                delta = np.atleast_2d(deltas[i])
                self.weights[i] += learning_rate * layer.T.dot(delta)
Implementing prediction:
    def predict(self, x):
        x = np.array(x)
        # append the bias entry, matching what fit does to the training data
        temp = np.ones(x.shape[0] + 1)
        temp[0:-1] = x
        a = temp
        for l in range(0, len(self.weights)):
            a = self.activation(np.dot(a, self.weights[l]))
        return a
We give it a set of inputs to predict; the program above is saved as BP.py:
from BP import NeuralNetwork
import numpy as np

nn = NeuralNetwork([2, 2, 1], 'tanh')
x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([1, 0, 0, 1])
nn.fit(x, y, 0.1, 10000)
for i in [[0, 0], [0, 1], [1, 0], [1, 1]]:
    print(i, nn.predict(i))
The results were as follows:
([0, 0], array([ 0.99738862]))
([0, 1], array([ 0.00091329]))
([1, 0], array([ 0.00086846]))
([1, 1], array([ 0.99751259]))

The above is the theoretical basis of neural networks and their Python implementation. I believe some of these points may come up in everyday work; I hope you can learn more from this article.