ArcSoft Face Recognition 3.0: Introduction to the Image Data Structure (Android)

2025-03-28 Update

Since ArcSoft opened up version 2.0 of its SDK for free, offline use, our company has used the ArcSoft SDK in face recognition access control applications with good recognition results, so we keep a close eye on ArcSoft's official releases. Recently the ArcFace 3.0 SDK was launched with a substantial update. This article introduces the algorithm-related updates on the Android platform.

The main updates are:

- Feature comparison supports model selection, and the recognition rate and anti-spoofing performance of both the life-photo comparison model and the ID-photo comparison model are significantly improved (after upgrading, the face database must be re-registered).
- The Android platform adds a 64-bit SDK.
- Image processing tools are provided.
- Face detection supports both all-angle and single-angle modes.

The new image data structure can be confusing in actual development, so this article introduces the image data structure and its usage in detail, covering the following points:

SDK interface change

ArcSoftImageInfo class parsing

Parsing related codes of SDK

The role of step size

Convert Image returned from Camera2 to ArcSoftImageInfo

I. Changes to the SDK interface

When integrating SDK version 3.0, we found that the FaceEngine methods that take image data, such as detectFaces, process, and extractFaceFeature, now have overloads, and each overload takes an ArcSoftImageInfo object as the input image data. Taking face detection as an example, the specific APIs are as follows:

Original interface:

```java
public int detectFaces(byte[] data, int width, int height, int format, List<FaceInfo> faceInfoList)
```

New API:

```java
public int detectFaces(ArcSoftImageInfo arcSoftImageInfo, List<FaceInfo> faceInfoList)
```

As you can see, the overload takes an ArcSoftImageInfo object as the image data for detection; arcSoftImageInfo replaces the original data, width, height, and format parameters.

II. Analysis of the ArcSoftImageInfo class

In actual use I found that ArcSoftImageInfo is not just a simple wrapper: it changes the one-dimensional array data into a two-dimensional array planes, and adds a stride array strides corresponding to planes.

Introduction to the concept of stride:

The stride can be understood as the number of bytes occupied by one row of pixels.
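As a concrete illustration, here is a plain-Java sketch with no SDK dependency (`alignTo` and `pixelIndex` are hypothetical helper names, not SDK functions) of how alignment padding makes the stride larger than the width, and how the stride is used to address a pixel:

```java
// Minimal sketch: stride vs. width under byte alignment.
public class StrideDemo {
    // Round width up to the next multiple of `alignment` bytes.
    static int alignTo(int width, int alignment) {
        return ((width + alignment - 1) / alignment) * alignment;
    }

    // Byte offset of pixel (x, y) in a single-channel (GRAY) buffer with the given stride.
    static int pixelIndex(int x, int y, int stride) {
        return y * stride + x;
    }

    public static void main(String[] args) {
        System.out.println(alignTo(1520, 16)); // 1520 is already a multiple of 16 -> 1520
        int stride = alignTo(1000, 16);        // 1000 is not -> padded to 1008
        System.out.println(stride);            // 1008
        // The first pixel of row 1 starts stride bytes in, not width bytes in.
        System.out.println(pixelIndex(0, 1, stride)); // 1008
    }
}
```

When the stride equals the width the two addressing schemes coincide, which is why tightly packed buffers hide this distinction.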

The class structure is as follows:

```java
public class ArcSoftImageInfo {
    private int width;
    private int height;
    private int imageFormat;
    private byte[][] planes;
    private int[] strides;
    ...
}
```

An introduction to this class in the official documentation:

Member description:

| Type | Variable name | Description |
| --- | --- | --- |
| int | width | image width |
| int | height | image height |
| int | imageFormat | image format |
| byte[][] | planes | image channels |
| int[] | strides | stride of each image channel |

Composition examples:

```java
// Example ArcSoftImageInfo composition:

// NV21 data has two channels.
// The Y channel stride is generally the image width; if the image has been
// 8-byte or 16-byte aligned, pass the aligned stride instead.
// The VU channel stride is generally the image width; the same alignment note applies.
ArcSoftImageInfo arcSoftImageInfo = new ArcSoftImageInfo(width, height, FaceEngine.CP_PAF_NV21,
        new byte[][]{planeY, planeVU}, new int[]{yStride, vuStride});

// GRAY has only one channel.
// The stride is generally the image width (pass the aligned stride if padded).
arcSoftImageInfo = new ArcSoftImageInfo(width, height, FaceEngine.CP_PAF_GRAY,
        new byte[][]{gray}, new int[]{grayStride});

// BGR24 has only one channel.
// The stride is generally 3x the image width (pass the aligned stride if padded).
arcSoftImageInfo = new ArcSoftImageInfo(width, height, FaceEngine.CP_PAF_BGR24,
        new byte[][]{bgr24}, new int[]{bgr24Stride});

// DEPTH_U16 has only one channel.
// The stride is generally 2x the image width (pass the aligned stride if padded).
arcSoftImageInfo = new ArcSoftImageInfo(width, height, FaceEngine.CP_PAF_DEPTH_U16,
        new byte[][]{depthU16}, new int[]{depthU16Stride});
```

As you can see, ArcSoftImageInfo stores image data split by channel. Take NV21 data as an example: NV21 data has two channels, and the two-dimensional array planes stores two arrays, the Y array and the VU array. Here is how NV21 data is arranged:

The NV21 image format belongs to the YUV420SP family in the YUV color space: every four Y components share one pair of U and V components; the Y values are stored contiguously, followed by interleaved V and U values.

The arrangement is as follows (taking an 8x4 image as an example):

```
Y Y Y Y Y Y Y Y
Y Y Y Y Y Y Y Y
Y Y Y Y Y Y Y Y
Y Y Y Y Y Y Y Y
V U V U V U V U
V U V U V U V U
```
This data is divided into two channels: first the contiguous Y data, then the interleaved V and U data. If we use the Camera API we basically don't need the ArcSoftImageInfo class, because the NV21 data returned by the Camera API is contiguous and can be used directly with the old interface. With other APIs the data we get may not be contiguous: for example, the image data of the android.media.Image objects produced by the Camera2 API and MediaCodec is split into planes. We can take the Y channel data and the VU channel data from those planes and build an ArcSoftImageInfo object in NV21 format for processing.
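To make the plane split concrete, here is a minimal plain-Java sketch with no SDK dependency (`splitNv21` is a hypothetical helper, not an SDK function) that cuts a contiguous, tightly packed NV21 buffer into the Y and VU planes that an ArcSoftImageInfo would hold:

```java
// Sketch: splitting a contiguous NV21 buffer (stride == width) into two planes.
public class Nv21Split {
    // Returns {yPlane, vuPlane} for a tightly packed NV21 buffer.
    static byte[][] splitNv21(byte[] nv21, int width, int height) {
        byte[] yPlane = new byte[width * height];          // contiguous Y data
        byte[] vuPlane = new byte[width * height / 2];     // interleaved VU data
        System.arraycopy(nv21, 0, yPlane, 0, yPlane.length);
        System.arraycopy(nv21, yPlane.length, vuPlane, 0, vuPlane.length);
        return new byte[][]{yPlane, vuPlane};
    }

    public static void main(String[] args) {
        int width = 8, height = 4;
        byte[] nv21 = new byte[width * height * 3 / 2];    // 48 bytes: 32 Y + 16 VU
        for (int i = 0; i < nv21.length; i++) nv21[i] = (byte) i;
        byte[][] planes = splitNv21(nv21, width, height);
        System.out.println(planes[0].length + " " + planes[1].length); // 32 16
        System.out.println(planes[1][0]);                              // 32 (first VU byte)
    }
}
```

These two planes, together with strides {width, width}, are exactly what the NV21 composition example above passes to the ArcSoftImageInfo constructor.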

III. Parsing of the SDK-related code

Let's take a look at the check code in the SDK that determines whether the image data is valid:

Note: the original code was transformed by the compiler, which makes it hard to read, so the code below has been modified: constant values are replaced with their constant names for readability.

Checking non-separated (contiguous) image data

```java
private static boolean isImageDataValid(byte[] data, int width, int height, int format) {
    return (format == CP_PAF_NV21 && (height & 1) == 0 && data.length == width * height * 3 / 2)
            || (format == CP_PAF_BGR24 && data.length == width * height * 3)
            || (format == CP_PAF_GRAY && data.length == width * height)
            || (format == CP_PAF_DEPTH_U16 && data.length == width * height * 2);
}
```

Interpretation:

The requirements for each image format are as follows:

- NV21: the height must be even, and the data size is width x height x 3/2
- BGR24: the data size is width x height x 3
- GRAY: the data size is width x height
- DEPTH_U16: the data size is width x height x 2
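These size rules can be sketched as a small plain-Java helper (`expectedSize` is an illustrative name, not part of the SDK; string tags stand in for the SDK's CP_PAF_* constants):

```java
// Sketch: expected buffer size per image format.
public class SizeRules {
    static int expectedSize(String format, int width, int height) {
        switch (format) {
            case "NV21":      return width * height * 3 / 2; // height must be even
            case "BGR24":     return width * height * 3;
            case "GRAY":      return width * height;
            case "DEPTH_U16": return width * height * 2;
            default: throw new IllegalArgumentException("unknown format: " + format);
        }
    }

    public static void main(String[] args) {
        System.out.println(expectedSize("NV21", 1520, 760)); // 1732800
        System.out.println(expectedSize("GRAY", 1520, 760)); // 1155200
    }
}
```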

Check ArcSoftImageInfo object

```java
private static boolean isImageDataValid(ArcSoftImageInfo arcSoftImageInfo) {
    byte[][] planes = arcSoftImageInfo.getPlanes();
    int[] strides = arcSoftImageInfo.getStrides();
    if (planes != null && strides != null) {
        if (planes.length != strides.length) {
            return false;
        }
        for (byte[] plane : planes) {
            if (plane == null || plane.length == 0) {
                return false;
            }
        }
        switch (arcSoftImageInfo.getImageFormat()) {
            case CP_PAF_BGR24:
            case CP_PAF_GRAY:
            case CP_PAF_DEPTH_U16:
                return planes.length == 1
                        && planes[0].length == strides[0] * arcSoftImageInfo.getHeight();
            case CP_PAF_NV21:
                return (arcSoftImageInfo.getHeight() & 1) == 0
                        && planes.length == 2
                        && planes[0].length == planes[1].length * 2
                        && planes[0].length == strides[0] * arcSoftImageInfo.getHeight()
                        && planes[1].length == strides[1] * arcSoftImageInfo.getHeight() / 2;
            default:
                return false;
        }
    } else {
        return false;
    }
}
```

Interpretation:

The size of each channel's data is: stride x channel height.

BGR24, GRAY and DEPTH_U16 images have only one channel, but their typical strides differ:

- BGR24: the stride is generally 3 x width
- GRAY: the stride is generally width
- DEPTH_U16: the stride is generally 2 x width

NV21 images must have an even height and two channels, and the data size of channel 0 is twice that of channel 1.

IV. The role of stride

A concrete pitfall

On one phone, using the Camera2 API with a requested preview resolution of 1520x760, the preview data actually came back at 1536x760. Analyzing the saved image data showed that the 16 columns of padding on the right were all 0. If we extract this YUV data at 1520x760, convert it to NV21, and pass width 1520 to face detection, the SDK cannot detect a face. If instead we parse it at 1536x760 and pass the resulting NV21 to the SDK with width 1536, the SDK detects faces.

The importance of stride

With only a few pixels of difference, why can't the face be detected? As mentioned earlier, the stride can be understood as the number of bytes in one row of pixels. If the reading of the first row of pixels is off, the reading of every subsequent row is shifted as well.

The following shows the result of parsing the same 1000x554 NV21 image with different strides:

Parsed with the correct stride vs. parsed with the wrong stride

As you can see, if we parse an image with the wrong stride, we may not recover the correct image content at all.

Conclusion: introducing the image stride effectively avoids problems caused by byte alignment.
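Here is a minimal plain-Java sketch of that failure mode: a 4x2 GRAY image padded to an 8-byte stride, read back once with the correct stride and once assuming stride == width (`makePaddedGray` is a hypothetical helper):

```java
// Sketch: why the wrong stride scrambles an image.
public class StridePitfall {
    // Build a width x height GRAY image padded to the given stride;
    // each pixel value encodes its position as 10 * row + x, padding stays 0.
    static byte[] makePaddedGray(int width, int height, int stride) {
        byte[] buffer = new byte[stride * height];
        for (int row = 0; row < height; row++) {
            for (int x = 0; x < width; x++) {
                buffer[row * stride + x] = (byte) (10 * row + x);
            }
        }
        return buffer;
    }

    public static void main(String[] args) {
        byte[] buffer = makePaddedGray(4, 2, 8);
        // Correct: row 1, pixel 0 is at 1 * stride = byte 8.
        System.out.println(buffer[1 * 8]); // 10
        // Wrong: assuming stride == width reads byte 4, which is padding, not a pixel.
        System.out.println(buffer[1 * 4]); // 0
    }
}
```

From the second row onward, every pixel read with the wrong stride is shifted, which is why the whole image appears scrambled rather than just missing a few columns.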

V. Converting the Image returned by Camera2 to ArcSoftImageInfo

Processing the data returned by the Camera2 API

For the above scenario, we can extract the `Y`, `U` and `V` plane data of the `android.media.Image` object, build an `ArcSoftImageInfo` object in `NV21` format, and pass it to the SDK for processing. The sample code is as follows:

Extract the Y, U and V plane data from the data returned by the Camera2 API

```java
private class OnImageAvailableListenerImpl implements ImageReader.OnImageAvailableListener {
    private byte[] y;
    private byte[] u;
    private byte[] v;

    @Override
    public void onImageAvailable(ImageReader reader) {
        Image image = reader.acquireNextImage();
        // The actual plane ratio is generally Y:U:V == 4:2:2
        if (camera2Listener != null && image.getFormat() == ImageFormat.YUV_420_888) {
            Image.Plane[] planes = image.getPlanes();
            // Reuse the same byte arrays to reduce GC frequency
            if (y == null) {
                y = new byte[planes[0].getBuffer().limit() - planes[0].getBuffer().position()];
                u = new byte[planes[1].getBuffer().limit() - planes[1].getBuffer().position()];
                v = new byte[planes[2].getBuffer().limit() - planes[2].getBuffer().position()];
            }
            if (image.getPlanes()[0].getBuffer().remaining() == y.length) {
                planes[0].getBuffer().get(y);
                planes[1].getBuffer().get(u);
                planes[2].getBuffer().get(v);
                camera2Listener.onPreview(y, u, v, mPreviewSize, planes[0].getRowStride());
            }
        }
        image.close();
    }
}
```

Convert to an ArcSoftImageInfo object

Note: the YUV data you receive may be YUV422 or YUV420, so you need conversion functions that turn both into ArcSoftImageInfo objects in NV21 format.

```java
@Override
public void onPreview(final byte[] y, final byte[] u, final byte[] v, final Size previewSize, final int stride) {
    if (arcSoftImageInfo == null) {
        arcSoftImageInfo = new ArcSoftImageInfo(previewSize.getWidth(), previewSize.getHeight(), FaceEngine.CP_PAF_NV21);
    }
    // The returned data is YUV422
    if (y.length / u.length == 2) {
        ImageUtil.yuv422ToNv21ImageInfo(y, u, v, arcSoftImageInfo, stride, previewSize.getHeight());
    }
    // The returned data is YUV420
    else if (y.length / u.length == 4) {
        ImageUtil.yuv420ToNv21ImageInfo(y, u, v, arcSoftImageInfo, stride, previewSize.getHeight());
    }
    // The arcSoftImageInfo data can now be passed to the SDK
    if (faceEngine != null) {
        List<FaceInfo> faceInfoList = new ArrayList<>();
        int code = faceEngine.detectFaces(arcSoftImageInfo, faceInfoList);
        if (code == ErrorInfo.MOK) {
            Log.i(TAG, "onPreview: " + code + " " + faceInfoList.size());
        } else {
            Log.i(TAG, "onPreview: no face detected, code is: " + code);
        }
    } else {
        Log.e(TAG, "onPreview: faceEngine is null");
        return;
    }
    ...
}
```

The code above is the concrete implementation that converts the data returned by the Camera2 API into an ArcSoftImageInfo object and runs detection on it. The following is the concrete implementation that assembles the Y, U and V data into an ArcSoftImageInfo object.

Assemble the Y, U and V data into an ArcSoftImageInfo object

The Y plane can be copied directly. For the U and V planes, you need to consider whether this set of YUV data is YUV420 or YUV422, and extract the U and V data accordingly.

```java
/**
 * Convert YUV420 data to an NV21-format ArcSoftImageInfo.
 *
 * @param y                y component of the YUV420 data
 * @param u                u component of the YUV420 data
 * @param v                v component of the YUV420 data
 * @param arcSoftImageInfo ArcSoftImageInfo in NV21 format
 * @param stride           stride of the y component; since the YUV planes correspond,
 *                         fixing the Y stride generally determines the U and V strides too
 * @param height           image height
 */
public static void yuv420ToNv21ImageInfo(byte[] y, byte[] u, byte[] v, ArcSoftImageInfo arcSoftImageInfo, int stride, int height) {
    if (arcSoftImageInfo.getPlanes() == null) {
        arcSoftImageInfo.setPlanes(new byte[][]{new byte[stride * height], new byte[stride * height / 2]});
        arcSoftImageInfo.setStrides(new int[]{stride, stride});
    }
    System.arraycopy(y, 0, arcSoftImageInfo.getPlanes()[0], 0, y.length);
    // Note: vuLength cannot be computed directly from stride and height; in practice
    // the data returned by the Camera2 API may be short, so the real data length is needed.
    byte[] vu = arcSoftImageInfo.getPlanes()[1];
    int vuLength = u.length / 2 + v.length / 2;
    int uIndex = 0, vIndex = 0;
    for (int i = 0; i < vuLength; i += 2) {
        vu[i] = v[vIndex++];
        vu[i + 1] = u[uIndex++];
    }
}

/**
 * Convert YUV422 data to an NV21-format ArcSoftImageInfo.
 *
 * @param y                y component of the YUV422 data
 * @param u                u component of the YUV422 data
 * @param v                v component of the YUV422 data
 * @param arcSoftImageInfo ArcSoftImageInfo in NV21 format
 * @param stride           stride of the y component; since the YUV planes correspond,
 *                         fixing the Y stride generally determines the U and V strides too
 * @param height           image height
 */
public static void yuv422ToNv21ImageInfo(byte[] y, byte[] u, byte[] v, ArcSoftImageInfo arcSoftImageInfo, int stride, int height) {
    if (arcSoftImageInfo.getPlanes() == null) {
        arcSoftImageInfo.setPlanes(new byte[][]{new byte[stride * height], new byte[stride * height / 2]});
        arcSoftImageInfo.setStrides(new int[]{stride, stride});
    }
    System.arraycopy(y, 0, arcSoftImageInfo.getPlanes()[0], 0, y.length);
    byte[] vu = arcSoftImageInfo.getPlanes()[1];
    // Note: same as above, the real data length is needed.
    int vuLength = u.length / 2 + v.length / 2;
    int uIndex = 0, vIndex = 0;
    for (int i = 0; i < vuLength; i += 2) {
        vu[i] = v[vIndex];
        vu[i + 1] = u[uIndex];
        vIndex += 2;
        uIndex += 2;
    }
}
```

Summary of the advantages of ArcSoftImageInfo:

- When the image data source is already split into planes, passing the separated data in an ArcSoftImageInfo object avoids the extra memory consumption of stitching the planes together.
- The stride concept is introduced, with a stride per plane, giving developers a clearer understanding of the image data when using the SDK.
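The VU interleaving at the heart of these conversions can be exercised in isolation. Here is a minimal plain-Java sketch with no SDK types (`interleaveVu` is a hypothetical helper), assuming U and V buffers sampled with a pixel stride of 2, as in the yuv422 case above:

```java
// Sketch: interleaving V and U samples into NV21's VU plane.
public class VuInterleave {
    // u and v hold samples at even indices (pixel stride 2); output is V,U,V,U,...
    static byte[] interleaveVu(byte[] u, byte[] v) {
        int vuLength = u.length / 2 + v.length / 2;
        byte[] vu = new byte[vuLength];
        int uIndex = 0, vIndex = 0;
        for (int i = 0; i + 1 < vuLength; i += 2) {
            vu[i] = v[vIndex];      // V sample first (NV21 order)
            vu[i + 1] = u[uIndex];  // then the U sample
            vIndex += 2;
            uIndex += 2;
        }
        return vu;
    }

    public static void main(String[] args) {
        byte[] u = {1, 0, 2, 0};  // U samples at even indices
        byte[] v = {9, 0, 8, 0};  // V samples at even indices
        System.out.println(java.util.Arrays.toString(interleaveVu(u, v))); // [9, 1, 8, 2]
    }
}
```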

The Android demo can be downloaded from the ArcSoft face recognition open platform.
