This article introduces how to build a neural network in Python. The walkthrough is quite detailed; interested readers can use it as a reference, and I hope it will be helpful to you.
How a neural network works
In a simple neural network, neurons are the basic computing units: they take input features and produce an output. A basic network has an input layer (layer 1), which feeds into a hidden layer (layer 2), which in turn produces the predicted class or hypothesis as the output. You can use more than one hidden layer.
You must design your neural network according to your dataset and accuracy requirements.
Forward propagation
The process of moving from layer 1 to layer 3 is called forward propagation. Steps for forward propagation:
Initialize the coefficients theta for each input feature. Suppose we have 100 training examples, that is, 100 rows of data, with 10 input features; then the size of our input matrix is 100 x 10. Now determine the size of $\theta_1$: the number of rows must equal the number of input features (10 in this case), and the number of columns should be the size of the hidden layer you choose.
Multiply the input features X by the corresponding theta and add a bias term. The result is passed through the activation function.
Several activation functions are available, such as sigmoid, tanh, ReLU, softmax, and swish.
I will use a sigmoid activation function to demonstrate the neural network.
Here, $a$ represents the hidden layer (layer 2) and $b$ represents the bias, so $a = g(X\theta_1 + b)$, where $g(z)$ is the sigmoid activation function:
$g(z) = \frac{1}{1 + e^{-z}}$
Initialize $\theta_2$ for the hidden layer. Its size will be the length of the hidden layer multiplied by the number of output classes. In this example, the next layer is the output layer, because we don't have more hidden layers.
Then we need to follow the same process as before: the prediction output is obtained by multiplying the hidden layer by $\theta_2$ and passing the result through the sigmoid activation function. A small sketch illustrating these shapes with toy numbers follows.
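To make the shapes in these steps concrete, here is a minimal sketch with arbitrary toy sizes (100 examples, 10 features, a hidden layer of 5 units, and 3 output classes are illustrative choices only; the full implementation later in the article stores the theta matrices transposed and multiplies by theta.T instead):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# toy sizes, chosen only to illustrate the shapes described in the steps above
m, n_features, n_hidden, n_classes = 100, 10, 5, 3

X = np.random.rand(m, n_features)              # input matrix, 100 x 10
theta1 = np.random.rand(n_features, n_hidden)  # rows = input features, columns = hidden size
b1 = np.random.rand(n_hidden)                  # bias for the hidden layer
theta2 = np.random.rand(n_hidden, n_classes)   # rows = hidden size, columns = output classes
b2 = np.random.rand(n_classes)                 # bias for the output layer

a = sigmoid(X @ theta1 + b1)   # hidden layer, 100 x 5
h = sigmoid(a @ theta2 + b2)   # predicted output (hypothesis), 100 x 3
print(a.shape, h.shape)        # (100, 5) (100, 3)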
Back propagation
Back propagation is the process of moving from the output layer to the second layer. In this process, we calculate the error.
First, subtract the predicted output from the original output y; this is our $\delta_3$.
Now calculate the error term for the hidden layer, $\delta_2$: multiply $\delta_3$ by $\theta_2$, then multiply elementwise by $a^{(2)}$ and $(1 - a^{(2)})$. In these formulas, the superscript 2 on $a$ indicates layer 2; please don't misinterpret it as a square.
The unregularized gradient $\Delta$ is then obtained by dividing by the number of training examples $m$, as summarized below.
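Putting these steps together (a reconstruction consistent with the code later in the article; the sign convention follows the text above, with the prediction subtracted from $y$):

$\delta_3 = y - a^{(3)}$
$\delta_2 = \delta_3\,\theta_2 \cdot a^{(2)} \cdot \left(1 - a^{(2)}\right)$
$\Delta_2 = \frac{1}{m}\,\delta_3^{T} a^{(2)}, \qquad \Delta_1 = \frac{1}{m}\,\delta_2^{T} X$

Here $\Delta_1$ and $\Delta_2$ are the unregularized gradients for $\theta_1$ and $\theta_2$, and the multiplications by $a^{(2)}$ and $(1 - a^{(2)})$ in the second line are elementwise.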
Training the network
Using these deltas, update the parameters: multiply the input features by $\delta_2$ and by the learning rate to get the update for $\theta_1$. Pay attention to the dimensions of $\theta_1$.
Repeat the process of forward propagation and back propagation, and keep updating the parameters until the cost stops improving. The cost function shows how far the prediction is from the original output variable; with $m$ training examples and $h$ the predicted output, it is
$J = -\frac{1}{m} \sum \left[ y \log(h) + (1 - y)\log(1 - h) \right]$
If you notice, this cost function formula is almost the same as the logistic regression cost function.
Implementation of the neural network
I will use the dataset from Andrew Ng's machine learning course on Coursera. Please download the dataset from the following link:
https://github.com/rashida048/Machine-Learning-With-Python/blob/master/ex3d1.xlsx
The following builds the neural network step by step. I encourage you to run each line of code yourself and print the results to understand it better.
First import the necessary packages and datasets.
import pandas as pd
import numpy as np

xls = pd.ExcelFile('ex3d1.xlsx')
df = pd.read_excel(xls, 'X', header=None)
The first five rows of the dataset show the pixel values of handwritten digits.
In this dataset, the input and output variables are stored on separate Excel sheets. Let's import the output variable:
y = pd.read_excel(xls, 'y', header=None)
The output variable is a number from 1 to 10. The goal of this project is to use the input variables stored in 'df' to predict these digits.
Find the dimensions of the input and output variables
df.shape
y.shape
The shape of the input variable or df is 5000 x 400, and the shape of the output variable or y is 5000 x 1.
Define the neural network
For simplicity, we will use only one hidden layer of 25 neurons.
hidden_layer = 25
Get the output classes.
y_arr = y[0].unique()
# output: array([10, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int64)
As you can see above, there are 10 output classes.
Initialize theta and bias
We will randomly initialize theta for layer 1 and layer 2. Because we have three layers, there will be $\theta_1$ and $\theta_2$.
Dimension of $\theta_1$: size of layer 1 x size of layer 2
Dimension of $\theta_2$: size of layer 2 x size of layer 3
From step 2, the shape of 'df' is 5000 x 400. This means there are 400 input features, so the size of layer 1 is 400. Since we chose a hidden layer size of 25, the size of layer 2 is 25. We have 10 output classes, so the size of layer 3 is 10.
Dimension of $\theta_1$: 400 x 25
Dimension of $\theta_2$: 25 x 10
Similarly, there will be two randomly initialized biases b1 and b2.
Dimension of $b_1$: the size of layer 2 (25 in this case)
Dimension of $b_2$: the size of layer 3 (10 in this case)
Define a function that initializes theta randomly:
def randInitializeWeights(Lin, Lout):
    # epsilon sets the range of the random initialization: sqrt(6) / sqrt(Lin + Lout)
    epi = 6**0.5 / (Lin + Lout)**0.5
    w = np.random.rand(Lout, Lin) * (2 * epi) - epi
    return w
Use this function to initialize theta:
hidden_layer = 25
output = 10
theta1 = randInitializeWeights(len(df.T), hidden_layer)
theta2 = randInitializeWeights(hidden_layer, output)
theta = [theta1, theta2]
Now, initialize the biases we discussed above:
b1 = np.random.randn(25,)
b2 = np.random.randn(10,)
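An optional sanity check on the shapes (a small sketch, not required for training): randInitializeWeights returns an array of shape (Lout, Lin), i.e. the transpose of the "layer 1 size x layer 2 size" orientation listed above, which is why the z_calc function defined below multiplies by theta.T.

# optional shape check
print(theta1.shape, theta2.shape)  # (25, 400) (10, 25)
print(b1.shape, b2.shape)          # (25,) (10,)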
Implement forward propagation
Use the formula in the forward propagation section.
For convenience, define a function to multiply theta and X.
def z_calc(X, theta):
    return np.dot(X, theta.T)
We will also use the activation function multiple times, so define a function for it as well:
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
Now I will demonstrate forward propagation step by step. First, calculate the z term:
z1 = z_calc(df, theta1) + b1
Now pass this z1 through the activation function to get the hidden layer:
a1 = sigmoid(z1)
a1 is the hidden layer. The shape of a1 is 5000 x 25. Repeat the same process to calculate layer 3, the output layer:
z2 = z_calc(a1, theta2) + b2
a2 = sigmoid(z2)
The shape of a2 is 5000 x 10; the ten columns represent the ten classes. a2 is our layer 3, the final output. If there were more hidden layers in this example, there would be more repetitions of this step when moving from one layer to the next. This process of computing the output layer from the input features is called forward propagation. Here is the whole forward pass collected into one function:
l = 3  # number of layers
b = [b1, b2]

def hypothesis(df, theta):
    a = []
    z = []
    for i in range(0, l - 1):
        z1 = z_calc(df, theta[i]) + b[i]
        out = sigmoid(z1)
        a.append(out)
        z.append(z1)
        df = out   # the output of this layer becomes the input of the next
    return out, a, z
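A quick optional check (a small sketch, assuming df and theta as defined above) that the combined function reproduces the step-by-step shapes:

out, a, z = hypothesis(df, theta)
print(a[0].shape, a[1].shape)  # (5000, 25) (5000, 10)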
Implement back propagation
This is the process of calculating the gradients in reverse and updating theta. Before that, we need to modify 'y'. There are 10 classes in 'y', but each class needs its own column. For example, in the column for class 10, we put 1 wherever the label is 10 and 0 for every other class. This way we create a separate indicator column for each class.
y1 = np.zeros([len(df), len(y_arr)])
y1 = pd.DataFrame(y1)
for i in range(0, len(y_arr)):
    for j in range(0, len(y1)):
        if y[0][j] == y_arr[i]:
            y1.iloc[j, i] = 1
        else:
            y1.iloc[j, i] = 0
y1.head()
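If you prefer a vectorized version, the following sketch builds the same one-hot matrix in a single step (assuming y and y_arr as defined above; the columns follow the order of y_arr, and y1_alt is just an illustrative name):

# vectorized one-hot encoding, equivalent to the loop above
y1_alt = pd.DataFrame((y[0].values[:, None] == y_arr).astype(float))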
I demonstrated forward propagation step by step and then put it all in one function; I will do the same for back propagation. Using the gradient formulas from the back propagation section above, first calculate $\delta_3$. We will use the z1, z2, a1, and a2 from the forward propagation implementation.
del3 = y1 - a2
Now calculate delta2 using the formula from the back propagation section, $\delta_2 = \delta_3\,\theta_2 \cdot a^{(2)} \cdot (1 - a^{(2)})$, where $a^{(2)}$ is the hidden layer a1. Here is delta2:
del2 = np.dot(del3, theta2) * a1 * (1 - a1)
Here we need a new concept: the sigmoid gradient. Its formula is $g'(z) = g(z)\,(1 - g(z))$. If you notice, this is exactly the $a(1 - a)$ term in the delta formula, because $a$ is sigmoid(z). Let's write a function for the sigmoid gradient:
def sigmoid_grad(z):
    return sigmoid(z) * (1 - sigmoid(z))
Finally, update theta: multiply each transposed delta by the inputs to the corresponding layer and by the learning rate.
We need to choose a learning rate. I chose 0.003. I encourage you to try other learning rates and see how it performs:
theta1 = np.dot(del2.T, df) * 0.003                # 25 x 5000 times 5000 x 400 gives theta1's 25 x 400 shape
theta2 = np.dot(del3.T, pd.DataFrame(a1)) * 0.003  # 10 x 5000 times 5000 x 25 gives theta2's 10 x 25 shape
This is how theta gets updated. The process is called back propagation because it moves backwards. Before writing a back propagation function, we need to define a cost function, because I will include the cost calculation inside the back propagation function. It could also be added to the forward propagation, or kept separate when training the network.
def cost_function(y, y_calc, l):
    # cross-entropy cost, summed over classes and examples and averaged over m
    return np.sum(np.sum(-np.log(y_calc) * y - np.log(1 - y_calc) * (1 - y))) / m
Here m is the number of training instances. The combined code:
m = len(df)

def backpropagation(df, theta, y1, alpha):
    out, a, z = hypothesis(df, theta)
    delta = []
    delta.append(y1 - a[-1])   # error at the output layer
    i = l - 2
    while i > 0:
        # propagate the error back through the hidden layers
        delta.append(np.dot(delta[-i], theta[-i]) * sigmoid_grad(z[-(i + 1)]))
        i -= 1
    theta[0] = np.dot(delta[-1].T, df) * alpha
    for i in range(1, len(theta)):
        theta[i] = np.dot(delta[-(i + 1)].T, pd.DataFrame(a[0])) * alpha
    out, a, z = hypothesis(df, theta)
    cost = cost_function(y1, a[-1], l)
    return theta, cost
Training the network
I will train the network for 20 epochs. I initialize theta again in this code snippet.
theta1 = randInitializeWeights(len(df.T), hidden_layer)
theta2 = randInitializeWeights(hidden_layer, output)
theta = [theta1, theta2]
cost_list = []
for i in range(20):
    theta, cost = backpropagation(df, theta, y1, 0.003)
    cost_list.append(cost)
cost_list
I used a learning rate of 0.003 and ran 20 epochs, but see the GitHub link provided at the end of the article; I have tried training the model with different learning rates and different numbers of epochs.
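As a sketch of that kind of experiment, you could re-initialize the parameters and compare the final cost for a few settings (the learning rates below are arbitrary example values, and theta_lr is just an illustrative name so the trained theta above is left untouched):

# compare a few example learning rates; the values are arbitrary choices
for lr in [0.0003, 0.003, 0.03]:
    theta_lr = [randInitializeWeights(len(df.T), hidden_layer),
                randInitializeWeights(hidden_layer, output)]
    costs = []
    for _ in range(20):
        theta_lr, cost = backpropagation(df, theta_lr, y1, lr)
        costs.append(cost)
    print(lr, costs[-1])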
We get the cost from each epoch as well as the final updated theta. The final theta is used to predict the output.
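To see the training behaviour at a glance, a short optional sketch (assuming matplotlib is installed) plots the recorded costs:

# plot the cost recorded after each epoch
import matplotlib.pyplot as plt
plt.plot(cost_list)
plt.xlabel('epoch')
plt.ylabel('cost')
plt.show()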
Predict the output and calculate the accuracy
Just use the hypothesis function and pass the updated theta to predict the output:
out, a, z = hypothesis(df, theta)
Now calculate the accuracy.
accuracy = 0
for i in range(0, len(out)):
    for j in range(0, len(out[i])):
        if out[i][j] >= 0.5 and y1.iloc[i, j] == 1:
            accuracy += 1
accuracy / len(df)

That is how to build a neural network in Python. I hope the above content is helpful to you. If you think the article is good, feel free to share it so more people can see it.