What is the principle of PointCNN

2025-02-14 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

This article introduces the principle behind PointCNN. The explanation is detailed and easy to follow, and you should come away with a working understanding of how PointCNN operates. Let's take a look.

The motivation for PointCNN is as follows:

In the figure, a lowercase f denotes the feature of a point, and f's with the same subscript denote identical features at the corresponding points. The author points out two problems illustrated by the figure:

First, suppose figures (ii) and (iii) show two different objects. If we arrange their points in some order, both happen to produce the same feature sequence (fa, fb, fc, fd), so these two different objects are judged to be the same object. Second, figures (iii) and (iv) show the same object in terms of shape, but because the point orderings differ, figure (iii) yields the sequence (fa, fb, fc, fd) while figure (iv) yields (fc, fa, fb, fd), so the same object is judged to be two different ones.
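These two failure modes are easy to reproduce numerically. The feature values below are hypothetical, purely for illustration: the same four features shared by two shapes fool symmetric pooling, while order-sensitive concatenation is fooled by a reordering.

```python
import numpy as np

# Hypothetical per-point features fa, fb, fc, fd from the figure.
f = {'a': 1.0, 'b': 2.0, 'c': 3.0, 'd': 4.0}

seq_ii  = np.array([f['a'], f['b'], f['c'], f['d']])  # object (ii)
seq_iii = np.array([f['a'], f['b'], f['c'], f['d']])  # object (iii), same features, different shape
seq_iv  = np.array([f['c'], f['a'], f['b'], f['d']])  # object (iii) again, but points reordered

# Problem 2: concatenating features in input order makes the SAME object
# look different under a different point ordering.
assert not np.array_equal(seq_iii, seq_iv)

# Symmetric pooling (e.g. the max in PointNet) fixes problem 2...
assert seq_iii.max() == seq_iv.max()

# ...but then DIFFERENT objects sharing the same feature multiset
# (problem 1) also become indistinguishable.
assert seq_ii.max() == seq_iii.max()
```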

The author argues that point cloud recognition cannot work well unless both problems are solved. His idea is to learn a transformation such that, after applying it, the features of (ii) and (iii) become different while the features of (iii) and (iv) become the same. How is this transformation obtained? By training a deep network. The author names it the X-transformation.

The author then observes that convolution in image processing operates on a local neighborhood, and the same idea can be applied to point clouds, as long as a neighborhood is found first. That is how PointCNN came about.
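Before walking through the real code, the resulting X-Conv operator can be sketched at the shape level in plain numpy. This is a simplified illustration, not the author's TensorFlow implementation: the `dense` helper below is a hypothetical stand-in for `pf.dense` with random weights, and only the tensor shapes and data flow matter.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P, K, C_in, C_pts_fts, C_out = 2, 8, 4, 3, 16, 32

nn_pts_local = rng.normal(size=(N, P, K, 3))     # centered neighbor coordinates
prev_fts     = rng.normal(size=(N, P, K, C_in))  # features gathered from the previous layer

def dense(x, c_out):
    """Random-weight stand-in for pf.dense (linear map + ReLU on the last axis)."""
    w = rng.normal(size=(x.shape[-1], c_out))
    return np.maximum(x @ w, 0.0)

# 1) Lift coordinates to features F_delta and concatenate with previous features.
f_delta = dense(dense(nn_pts_local, C_pts_fts), C_pts_fts)
nn_fts_input = np.concatenate([f_delta, prev_fts], axis=-1)  # (N, P, K, C_pts_fts + C_in)

# 2) Predict a K x K matrix X per neighborhood from the coordinates alone.
X = dense(nn_pts_local.reshape(N, P, K * 3), K * K).reshape(N, P, K, K)

# 3) Weight-and-permute the neighborhood features: fts_X = X @ F.
fts_X = X @ nn_fts_input                                     # (N, P, K, C_pts_fts + C_in)

# 4) Fuse the K neighbors into one point with a final linear map
#    (stand-in for the (1, K) separable convolution in the real code).
fts = dense(fts_X.reshape(N, P, K * fts_X.shape[-1]), C_out)
assert fts.shape == (N, P, C_out)
```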

Code interpretation

Here is the code section:

P.S. The author's GitHub repository is well maintained, updated frequently, and reader questions are answered promptly.

Anyone who has read the code knows that the core of the author's X-transformation idea lies in the xconv function of pointcnn.py. Following the algorithm flow, this part of the code can be divided into two parts, feature extraction and X-matrix training. Let's go through them in turn.

Only two dense layers (also known as fc layers / MLPs) are used to extract neighborhood features; they simply lift the neighborhood structure from shape (P, K, C) to (P, K, C').
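Shape-wise, this lift is just two linear maps applied along the channel axis. A minimal numpy sketch with random stand-in weights (not the author's pf.dense):

```python
import numpy as np

rng = np.random.default_rng(0)
P, K, C, C_new = 128, 8, 3, 16        # C: input channels, C_new: lifted channels C'

nbhd = rng.normal(size=(P, K, C))     # K-point neighborhoods of P representative points
W1 = rng.normal(size=(C, C_new))
W2 = rng.normal(size=(C_new, C_new))

# Two "dense" layers acting only on the last axis; P and K are untouched.
lifted = np.maximum(nbhd @ W1, 0.0) @ W2
assert lifted.shape == (P, K, C_new)  # (P, K, C) -> (P, K, C')
```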

# Prepare features to be transformed
nn_fts_from_pts_0 = pf.dense(nn_pts_local_bn, C_pts_fts, tag + 'nn_fts_from_pts_0', is_training)  # fc1, (N, P, K, C_pts_fts)
nn_fts_from_pts = pf.dense(nn_fts_from_pts_0, C_pts_fts, tag + 'nn_fts_from_pts', is_training)  # fc2, features F_delta
if fts is None:
    nn_fts_input = nn_fts_from_pts
else:
    nn_fts_from_prev = tf.gather_nd(fts, indices, name=tag + 'nn_fts_from_prev')
    nn_fts_input = tf.concat([nn_fts_from_pts, nn_fts_from_prev], axis=-1, name=tag + 'nn_fts_input')

Note that the author does not use the max-pooling of PointNet here, which is also mentioned in the paper: the author believes that training an X-transformation achieves better results.
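To see why a learned X can do more than max-pooling, consider the ideal case where X simply inverts the input permutation, recovering a canonical ordering of the neighborhood. This is a toy numpy example; the real X is learned end-to-end and is not restricted to permutation matrices.

```python
import numpy as np

F = np.array([[1., 0.], [2., 1.], [3., 2.], [4., 3.]])  # features of K=4 neighbors, shape (K, C)
perm = np.array([2, 0, 1, 3])                           # a reordering, as between (iii) and (iv)
F_perm = F[perm]

# Max-pooling is permutation-invariant, but it collapses per-point identity.
assert np.array_equal(F.max(axis=0), F_perm.max(axis=0))

# An ideal X for the permuted input is the inverse permutation matrix:
P_mat = np.eye(4)[perm]                  # P_mat @ F == F_perm
assert np.array_equal(P_mat @ F, F_perm)
X = P_mat.T                              # a permutation matrix's inverse is its transpose
assert np.array_equal(X @ F_perm, F)     # canonicalized: same features as the original order
```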

As for the X-transformation, there are actually three layers. The first layer is, perhaps surprisingly, a convolution layer. Its kernel is 1 × K: in the neighborhood dimension, the K neighbor points are directly aggregated into one, while K × K output channels raise the feature dimension to K × K, so the shape changes from (P, K, C) to (P, 1, K × K).

if with_X_transformation:
    ######################## X-transformation #########################
    X_0 = pf.conv2d(nn_pts_local, K * K, tag + 'X_0', is_training, (1, K))
    X_0_KK = tf.reshape(X_0, (N, P, K, K), name=tag + 'X_0_KK')
    X_1 = pf.depthwise_conv2d(X_0_KK, K, tag + 'X_1', is_training, (1, K))
    X_1_KK = tf.reshape(X_1, (N, P, K, K), name=tag + 'X_1_KK')
    X_2 = pf.depthwise_conv2d(X_1_KK, K, tag + 'X_2', is_training, (1, K), activation=None)
    X_2_KK = tf.reshape(X_2, (N, P, K, K), name=tag + 'X_2_KK')
    fts_X = tf.matmul(X_2_KK, nn_fts_input, name=tag + 'fts_X')

Then, the second and third layers maintain this structure. In an earlier revision of the code these were two plain dense layers (newer revisions replace them with the depthwise convolutions shown above):

X_1 = pf.dense(X_0, K * K, tag + 'X_1', is_training, with_bn=False)  # (N, P, 1, K*K); the neighbor dimension K was already collapsed to 1

X_2 = pf.dense(X_1, K * K, tag + 'X_2', is_training, with_bn=False, activation=None)  # (N, P, 1, K*K)

Then a reshape to (N, P, K, K) yields the X-transform matrix:

X = tf.reshape(X_2, (N, P, K, K), name=tag + 'X')

Then multiply X with the feature map obtained above:

fts_X = tf.matmul(X, nn_fts_input, name=tag + 'fts_X')

Finally, there is the convolution operation: a kernel of scale (1, K) with C output channels fuses the K-neighborhood into a single point, which is different from the pooling operation in PointNet++.

fts = pf.separable_conv2d(fts_X, C, tag + 'fts', is_training, (1, K), depth_multiplier=depth_multiplier)  # (N, P, 1, C)

return tf.squeeze(fts, axis=2, name=tag + 'fts_3d')  # output (N, P, C)

Note that this is not an ordinary convolution but a separable convolution, separable_conv2d.
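A quick parameter count suggests why a separable convolution is the cheaper choice for fusing the neighborhood. The numbers below are illustrative; K, C, and depth_multiplier vary per layer in the real network.

```python
# Fusing K neighbors with C_in input channels into C_out output channels:
K, C_in, C_out, dm = 12, 64, 128, 1   # dm = depth_multiplier

# Ordinary (1, K) convolution: every output channel sees all K*C_in inputs.
regular = (1 * K) * C_in * C_out

# Separable convolution: a depthwise (1, K) stage, then a 1x1 pointwise stage.
separable = (1 * K) * C_in * dm + (C_in * dm) * C_out

assert separable < regular  # 8,960 vs 98,304 weights in this example
```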

This concludes the article on the principle of PointCNN. Thank you for reading! Hopefully you now have a working understanding of how PointCNN operates.





© 2024 shulou.com SLNews company. All rights reserved.
