
How to achieve CNN visualization in the browser

2025-03-26 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31 Report --

In this issue, the editor walks through how CNN visualization can be achieved in the browser. The article is rich in content and analyzes the topic from a professional point of view. I hope you will get something out of it after reading.

What this article covers

When learning about convolutional neural networks, we only know that after inputting a picture, the network extracts the picture's features through a pile of operations; we understand its inner workings in theory, but we never see them. This CNN interpreter lets us see clearly, right in the browser, how each neuron's output is generated and what the generated image looks like.

CNN visualization tool 1: CNN Explainer

What is a convolutional neural network?

In machine learning, classifiers assign category labels to data points. For example, an image classifier produces a category label for the objects present in an image (for example, bird, airplane). A convolutional neural network, CNN for short, is a type of classifier that excels at this task!

A CNN is a neural network: an algorithm for recognizing patterns in data. Neural networks in general are made up of neurons organized in layers, each with its own learnable weights and biases. Let's break a CNN down into its basic building blocks.

A tensor can be thought of as an n-dimensional matrix. In the CNN above, every tensor is three-dimensional, except for the output layer.

A neuron can be seen as a function that takes multiple inputs and produces a single output. The outputs of neurons are shown above as the red → blue activation maps.

A layer is simply a set of neurons that use the same operation, including the same hyperparameters.

Kernel weights and biases (unique to each neuron) are tuned during the training phase and allow the classifier to adapt to the problem and dataset at hand. They are encoded in the visualization with a yellow → green diverging color scale. You can view their specific values in the Interactive Formula view by clicking a neuron, or by hovering over a kernel or bias in the convolution Elastic Explanation view.

A CNN computes a differentiable score function, which is represented as the class scores in the output layer of the visualization.

If you have studied neural networks before, these terms may sound familiar. So what makes a CNN different? A CNN uses a special type of layer, aptly called a convolutional layer, that equips it to learn from image and image-like data. With image data, CNNs can be applied to many different computer vision tasks, such as image processing, classification, segmentation, and object detection.

In CNN Explainer, you can see how a simple CNN performs image classification. Because of the network's simplicity, its performance is not perfect, but that doesn't matter! The network architecture used in CNN Explainer, Tiny VGG, contains many of the same layers and operations as state-of-the-art CNNs today, but on a smaller scale. This makes it easier to understand as an introduction.

What does every layer of the network do?

Let's walk through each layer of the network. As you read, feel free to click and hover over the visualization above to interact with it.

Input layer

The input layer (the leftmost layer) represents the image fed into the CNN. Because we use RGB images as input, the input layer has three channels, corresponding to the red, green, and blue channels shown in that layer. Click the network details icon above to display details such as color scales (for this and the other layers).

Convolutional layer

Convolutional layers are the foundation of a CNN, because they contain the learned kernels (weights) that extract the features which distinguish images from one another; this is exactly what we want for classification! When you interact with a convolutional layer, you will notice links between the previous layer and the convolutional layer. Each link represents a unique kernel, which is used in the convolution operation that produces the current convolutional neuron's output, or activation map.

Each convolutional neuron performs an elementwise dot product of a unique kernel with the output of the corresponding neuron in the previous layer. This yields as many intermediate results as there are unique kernels. The neuron's output is the sum of all the intermediate results plus a learned bias.

For example, let's look at the first convolutional layer in the Tiny VGG architecture above. Notice that there are 10 neurons in this layer but only 3 neurons in the previous layer. In the Tiny VGG architecture, convolutional layers are fully connected, meaning each neuron is connected to every neuron in the previous layer. Focusing on the output of the topmost convolutional neuron in the first convolutional layer, when we hover over its activation map, we see that 3 unique kernels are involved.

Figure 1. When you hover over the activation map of the topmost node in the first convolutional layer, you can see that 3 kernels were applied to produce this activation map. After clicking this activation map, you can see the convolution happening with each unique kernel.

The size of these kernels is a hyperparameter specified by the designer of the network architecture. To produce the output of a convolutional neuron (the activation map), we perform an elementwise dot product between the output of the previous layer and the unique kernel learned by the network. In Tiny VGG, the dot product operation uses a stride of 1, meaning the kernel shifts by 1 pixel for every dot product, but this is a hyperparameter the network architect can adjust to better fit the dataset. We must do this for all 3 kernels, which yields 3 intermediate results.

Then an elementwise sum is performed over all 3 intermediate results together with the bias the network has learned. The resulting two-dimensional tensor is the activation map viewable in the interface above for the topmost neuron in the first convolutional layer. The same operation must be applied to produce each neuron's activation map.

With some simple math, we can deduce that 3 × 10 = 30 unique kernels, each of size 3×3, are applied in the first convolutional layer. The connectivity between a convolutional layer and the previous layer is a design decision made when building the network architecture, and it determines the number of kernels per convolutional layer. Click around the visualization to better understand the operations behind the convolutional layer, and see if you can follow the example above!
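
To make the arithmetic above concrete, here is a minimal NumPy sketch of what one convolutional neuron computes: three unique 3×3 kernels slide over a 3-channel input with stride 1, and the intermediate results are summed together with a bias. The shapes and values are illustrative, not Tiny VGG's trained weights.

```python
import numpy as np

def conv2d_single_neuron(inputs, kernels, bias, stride=1):
    """inputs: (C, H, W) outputs of the previous layer;
    kernels: (C, k, k), one unique kernel per input channel;
    returns the neuron's 2-D activation map."""
    c, h, w = inputs.shape
    _, k, _ = kernels.shape
    out_h = (h - k) // stride + 1
    out_w = (w - k) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Elementwise dot product of each kernel with its input window;
            # summing over all channels adds up the intermediate results.
            window = inputs[:, i*stride:i*stride+k, j*stride:j*stride+k]
            out[i, j] = np.sum(window * kernels) + bias
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 64, 64))       # 3-channel (RGB-like) input
kernels = rng.standard_normal((3, 3, 3))   # 3 unique 3x3 kernels
activation_map = conv2d_single_neuron(x, kernels, bias=0.1)
print(activation_map.shape)                # (62, 62) with stride 1, no padding
```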

Understanding hyperparameters

Padding is often needed when the kernel extends beyond the activation map. Padding preserves data at the borders of activation maps, which leads to better performance, and it helps preserve the input's spatial size, which allows architects to build deeper, higher-performing networks. Many padding techniques exist, but the most commonly used is zero-padding, because of its performance, simplicity, and computational efficiency. This technique involves adding zeros symmetrically around the edges of the input, and it is adopted by many high-performing CNNs, such as AlexNet.

Kernel size, often also called filter size, refers to the dimensions of the sliding window over the input. Choosing this hyperparameter has a massive impact on the image classification task. For example, a small kernel size extracts a much larger amount of information containing highly local features from the input. As you can see in the visualization above, a smaller kernel size also leads to a smaller reduction in layer size, which allows for a deeper architecture. Conversely, a large kernel size extracts less information, which leads to a faster reduction in layer size and often worse performance. Large kernels are better suited to extracting larger features. Ultimately, the right kernel size depends on your task and dataset, but in general, more layers together learn more and more complex features!

Stride indicates how many pixels the kernel shifts at a time. For example, as described in the convolutional layer example above, Tiny VGG uses a stride of 1 for its convolutional layers, meaning the dot product is performed over a 3×3 window of the input to produce an output value, and the kernel then shifts by one pixel for each subsequent operation. The effect of stride on a CNN is similar to that of kernel size. As stride decreases, more features are learned because more data is extracted, which also leads to larger output layers. Conversely, as stride increases, feature extraction is more limited and the output layers are smaller. One responsibility of the architect when implementing a CNN is to ensure that the kernel slides symmetrically across the input.
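
The interplay of kernel size, padding, and stride boils down to one standard formula for the convolution output size. Below is a small helper that encodes it; the example values are hypothetical, and the assertion captures the "slides symmetrically" requirement mentioned above.

```python
def conv_output_size(n, k, p=0, s=1):
    """n: input width/height, k: kernel size, p: zero padding, s: stride."""
    # If the stride does not divide evenly, the kernel cannot slide
    # symmetrically across the input.
    assert (n + 2 * p - k) % s == 0, "kernel does not slide symmetrically"
    return (n + 2 * p - k) // s + 1

print(conv_output_size(64, 3, p=0, s=1))  # 62: 3x3 kernel, stride 1, no padding
print(conv_output_size(64, 3, p=1, s=1))  # 64: zero-padding preserves the size
```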

Activation functions

ReLU

Neural networks are extremely prevalent in modern technology because they are so accurate! Today's highest-performing CNNs contain an absurd number of layers, which lets them learn more and more features. Part of the reason these groundbreaking CNNs achieve such tremendous accuracy is their non-linearity. ReLU applies much-needed non-linearity to the model. Non-linearity is necessary to produce non-linear decision boundaries, so that the output cannot be written as a linear combination of the inputs. Without a non-linear activation function, a deep CNN architecture would collapse into a single equivalent convolutional layer, which would perform nowhere near as well. The ReLU activation function is used specifically as the non-linear activation function, as opposed to other non-linear functions such as Sigmoid, because it has been empirically observed that CNNs using ReLU train faster than their counterparts.

The ReLU activation function is a one-to-one mathematical operation: ReLU(x) = max(0, x).

Figure 3. A plot of the ReLU activation function, which discards all negative values.

This activation function is applied elementwise to every value in the input tensor. For example, applying ReLU to the value 2.24 yields 2.24, since 2.24 is greater than zero. You can observe how this activation function is applied by clicking a ReLU neuron in the network above. The Rectified Linear Unit (ReLU) activation is performed after every convolutional layer in the network architecture outlined above. Notice this layer's effect on the activation maps of the various neurons throughout the network!
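
As a quick sketch of the behavior just described (the input values are made up, echoing the 2.24 example in the text):

```python
import numpy as np

# ReLU applied elementwise: positive values pass through unchanged,
# negative values are set to zero.
def relu(x):
    return np.maximum(0, x)

print(relu(np.array([2.24, -1.3, 0.0])))  # -> [2.24 0.   0.  ]
```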

Softmax

The main purpose of the softmax operation is to ensure that the CNN's outputs sum to 1. Because of this, softmax can be used to scale the model's outputs into probabilities. Clicking the last layer reveals the softmax operation in the network. Notice how the flattened logits are not scaled between zero and one. To visualize the impact of each logit (an unscaled scalar value), they are encoded with a light orange → dark orange color scale. After passing through the softmax function, each class now corresponds to a proper probability!

You might be wondering what the difference is between standard normalization and softmax; after all, both rescale the logits between 0 and 1. Remember that backpropagation is a key aspect of training neural networks: we want the correct answer to have the largest "signal". By using softmax, we effectively "approximate" argmax while remaining differentiable. Rescaling does not weigh the max significantly higher than the other logits, whereas softmax does. In short, softmax is a "softer" argmax; see what we did there?
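
Here is a small sketch contrasting the two rescalings on made-up logits; it shows why softmax, unlike standard normalization, approximates argmax while staying differentiable.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    return e / e.sum()

logits = np.array([1.0, 2.0, 5.0])
print(logits / logits.sum())  # [0.125 0.25  0.625] -- max barely emphasized
print(softmax(logits))        # [0.017 0.047 0.936] -- close to a one-hot argmax
```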

Pooling layer

There are many types of pooling layers across different CNN architectures, but their common purpose is to gradually shrink the network's spatial extent, which reduces the network's parameters and overall computation. The pooling type used in the Tiny VGG architecture above is max-pooling.

The max-pooling operation requires selecting a kernel size and a stride during architecture design. Once selected, the operation slides the kernel over the input with the specified stride, taking only the maximum value in each kernel slice of the input to produce an output value. You can view this process by clicking a pooling neuron in the network above.

In the Tiny VGG architecture above, the pooling layers use a 2×2 kernel with a stride of 2. With these specifications, 75% of the activations are discarded. By throwing away so many values, Tiny VGG computes more efficiently and avoids overfitting.
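
A minimal sketch of max-pooling with a 2×2 kernel and stride 2, as described above; the input values are made up for illustration:

```python
import numpy as np

# Each output value keeps only the maximum of its 2x2 input slice,
# discarding the other 75% of the activations.
def max_pool(x, k=2, s=2):
    h, w = x.shape
    out = np.zeros((h // s, w // s))
    for i in range(0, h - k + 1, s):
        for j in range(0, w - k + 1, s):
            out[i // s, j // s] = x[i:i+k, j:j+k].max()
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool(x))  # [[ 5.  7.]
                    #  [13. 15.]]
```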

Flatten layer

This layer converts a three-dimensional layer of the network into a one-dimensional vector that fits the input of a fully connected layer for classification. For example, a 5×5×2 tensor is converted into a vector of size 50. The network's previous convolutional layers extracted features from the input image, and now it is time to classify those features. We use the softmax function to classify them, and it requires one-dimensional input. This is why the flatten layer is needed. You can view this layer by clicking any output class.
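
A one-line sketch of the flatten operation on the 5×5×2 example from the text:

```python
import numpy as np

# A 5x5x2 tensor becomes a length-50 vector, ready to feed the
# classification (softmax) layer.
t = np.arange(5 * 5 * 2).reshape(5, 5, 2)
v = t.flatten()
print(v.shape)  # (50,)
```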

Interactive features

Upload your own image by selecting the upload image icon to see how it is classified into the 10 classes. By analyzing the neurons throughout the network, you can examine the activation maps and the extracted features.

Adjust the color scale of the activation maps to better understand the impact of activations at different levels of abstraction.

Click the network details icon to learn about network details, such as layer sizes and color scales.

Interact with layer slices in the Interactive Formula view by clicking the play icon to simulate the network's operations, or by hovering over parts of the input or output to understand the mappings and underlying operations.

Learn more about each layer's functionality in the article by clicking the info icon in the Interactive Formula view.

Video tutorial

Introduction to CNN Explainer (0:00-0:22)

Overview (0:27-0:37)

Convolution Elastic Explanation view (0:37-0:46)

Convolution, ReLU, and Pooling Interactive Formula views (0:46-1:21)

Flatten Elastic Explanation view (1:22-1:41)

Softmax Interactive Formula view (1:41-2:02)

Engaging learning experience: Understanding Classification (2:06-2:28)

Interactive tutorial article (2:29-2:54)

How is CNN Explainer implemented?

CNN Explainer uses TensorFlow.js, an in-browser GPU-accelerated deep learning library, to load the pretrained model for visualization. The entire interactive system is written in JavaScript, using Svelte as the framework and D3.js for the visualizations. You only need a web browser to start learning about CNNs right away!

Who developed CNN Explainer?

CNN Explainer was created by Jay Wang, Robert Turko, Omar Shaikh, Haekyu Park, Nilaksh Das, Fred Hohman, Minsuk Kahng, and Polo Chau as the result of a research collaboration between Georgia Tech and Oregon State University. We thank Anmol Chhabria, Kaan Sancak, Kantwon Rogers, and the Georgia Tech Visualization Lab for their support and constructive feedback. This work was supported in part by NSF grants IIS-1563816 and CNS-1704701, a NASA NSTRF, DARPA GARD, and gifts from Intel, NVIDIA, Google, and Amazon.

CNN Explainer learning notes

CNN visualization tool 2: Netscope

This online tool provides visualizations of more than 10 classic networks, such as AlexNet, GoogLeNet, YOLO, the ResNet series, and the Inception series. You can clearly see the parameters of each layer.

https://dgschwend.github.io/netscope/quickstart.html

The above is how to achieve CNN visualization in the browser, as shared by the editor. If you happen to have similar doubts, you may refer to the analysis above. If you want to learn more, you are welcome to follow the industry information channel.
