Shulou (Shulou.com), SLTechnology News&Howtos, 06/03 report; updated 2025-01-29
Introduction to HiSilicon Hi3519AV100, a powerful low-power embedded AI solution
The HiSilicon Hi3519AV100 first caught our attention at the Beijing security exhibition in October 2018. In early November, a company in Beijing commissioned us to do some research, and we obtained the initial version of the SDK from the agent. I read the chip datasheet carefully and found that the chip was stronger than the Hi3519V101 + Intel Movidius Myriad 2 MA2450 combination that our company was mainly promoting at the time, so I decided to build a solution around it. To reduce risk, we first worked only on our own V1.0 development board. Only after our software and hardware engineers had the SDK software debugged did we agree to take on the Beijing customer's custom Hi3519AV100 project. When the prototype came back at the end of January 2019 and we could run the Hi3519AV100 SDK programs on it, we immediately started customizing Hi3519AV100 products for that customer. As a result, the development board we sell externally did not come out as quickly, because the customer project took priority. The company did not start selling the Hi3519AV100 development board until March 2019, and I have had little time to write about it. Last year I wrote "Image recognition VPU-- easy-to-use embedded AI support Deep Learning platform introduction", which covered the Hi3559A, a more powerful embedded AI platform, but since HiSilicon does not open its SDK resources to small companies, there is no need to cover it in detail here.
By June 2019, I had noticed that many moderately priced, low-power embedded AI solutions were launched in the first half of this year. The more representative ones are:
NVIDIA Jetson Nano
(A cut-down version of the Jetson TX1. The core board brings the price below 1000 yuan and the performance is quite good, but one look at that heat sink and we stopped considering it. Of course, if your product is used indoors, the Nano is a good choice, since the performance is there.)
Google Coral USB Accelerator (TPU)
(The TPU in Google's Coral USB is also quite powerful. We haven't tried it yet, so it's hard to comment. It works in the same way as the Intel Movidius NCS accelerator stick.)
Rockchip RK3399Pro and RK1808 accelerator stick
(The RK3399Pro's AI core is quite similar to the NNIE in the Hi3519AV100. The RK3399Pro's ARM performance is stronger than the Hi3519V100's, but its power consumption is also higher than the Hi3519AV100's, so we still don't dare to use it in outdoor products for our customers. The RK1808 accelerator stick also works like the Intel Movidius NCS accelerator stick.)
Baidu EdgeBoard, based on Xilinx Zynq (more expensive)
There is also a RISC-V+KPU low-cost solution (super cheap)
Plus the Intel Movidius Myriad X MA2485, last year's main launch (the OpenVINO development kit with Raspberry Pi support was only released this year. Only last month did I get the license plate recognition and face recognition examples running on the Raspberry Pi 3 and RK3288 platforms, based on the l_openvino_toolkit_raspbi_p_2019.1.094 package. Our company's VPU module board is in progress.)
To be honest, our small company has limited resources. When there is a ready customer project, we build the solution for the corresponding chip, and we have no time to tinker with the others. The Hi3519AV100 hardware and software materials and SDK are also relatively easy to obtain. Before introducing the Hi3519AV100 in more detail, one note:
HiSilicon's chip naming is a headache (we hear this from many customers). In the security surveillance IPC (network camera) field alone, the Hi3519AV100, Hi3519V101, and Hi3519V100 differ greatly in performance. As figure-2 shows, the Hi3519AV100 has not only a dual-core Cortex-A53 and IVE but also adds the NNIE and a DSP, both of which support deep learning. The Hi3519V101 is just a Cortex-A17 + Cortex-A7 + IVE combination with no deep learning acceleration engine, which is why the Hi3519V101 + VPU solution our company launched last year needs an additional Intel Movidius Myriad 2 MA2450 VPU chip to run Caffe deep learning algorithms (these must be C/C++ programs). The weakest of the three, the Hi3519V100, has been discontinued. The three Hi3519 models are built on different processes, so their performance differs. All are called Hi3519, yet the suffix letters and digits differ and so does the performance, which confuses many customers. For IPC (network camera) products, HiSilicon also has a very cost-effective family called Hi3516, which has at least 9 suffix variants with widely varying performance. See the HiSilicon website: http://www.hisilicon.com/en/Products/ProductList/Surveillance. Chips for NVRs, DVRs, set-top boxes, mobile phones, and so on are not covered here. We focus only on platforms for image recognition.
We can see the Hi3519AV100 application block diagram in figure-1 below.
Figure-1 Hi3519AV100 application block diagram
The Hi3519AV100 is a high-performance, low-power 4K Ultra HD mobile camera SoC (TSMC 12nm process) for surveillance IP cameras, action cameras, panoramic cameras, rearview mirrors, aerial drones, stereo-vision robots, and similar products.
In 4-lane mode, the Hi3519AV100 MIPI interface can connect 3 CMOS sensors at the same time. In 2-lane mode, it can connect 5. The built-in high-performance panoramic stitching engine can achieve 4K-level 2x4 real-time panoramic video stitching. The main camera on ISP0 supports 4K x 2K (3840 x 2160) @ 30fps encoding for SD card storage, plus a 1080p@30fps substream that is sent to the wireless transmission module over the network port or USB. ISP1/2 connect to a binocular (stereo) camera: the DPU module extracts the stereo depth map and sends it to the flight control MCU for obstacle avoidance. ISP3/4 connect to a downward-facing monocular or binocular camera: the SLAM algorithm runs on the vector DSP, and the results are sent to the flight control MCU for hovering.
NNIE (Cambrian AI kernel) is a powerful programmable neural network inference engine used to run face recognition / detection, target detection or gesture recognition algorithms.
Figure-2 Internal structure diagram of Hi3519AV100 chip
From figure-2 above, we can see that the Hi3519AV100 is quite capable:
1. 2 x ARM Cortex-A53 @ 1.5GHz, 32KB I-cache / 32KB D-cache / 256KB L2 cache
2. Supports NEON acceleration and integrates an FPU
3. Integrated Tensilica Vision P6 DSP @ 630MHz, 32KB I-Cache / 32KB I-RAM / 512KB Data RAM, 0.3 TOPS neural network computing performance, with Huawei LiteOS support
4. Supports DDR4
Even the weakest image recognition chips today need at least DDR3 memory, while the Hi3519AV100 uses external DDR4. Memory access speed is a key metric for deep learning, and DDR3 is comparatively dated.
5. NNIE
Supports AlexNet, VGG, ResNet, GoogLeNet, and other classification neural networks
Supports multiple object detection neural networks such as Faster R-CNN, SSD, and YoloV2
2.0 TOPS neural network computing performance
Provides a complete API and tool chain (compiler, simulator), making it easy to adapt customer-specific networks
(Anyone who has worked on deep learning algorithms will be very familiar with the networks listed above, hehe.)
The NNIE core integrated in the Hi3519AV100 supports deep learning:
The NNIE is a dedicated hardware accelerator for deep learning, in particular for CNN, RCNN, and similar neural network structures.
It is used in image classification, object detection, and other application scenarios.
NNIE acceleration engine features are as follows:
Support for N * N convolution
Support for Pooling (Max and Average)
Support for Stride
Support for Pad
Support for activation functions (Relu, Sigmoid, and TanH)
Support LRN operation
Support for BN (Batch Normalization)
Support vector and matrix multiplication and addition (Inner Product)
Support for Concat
Support for Eltwise
Support for 8-bit data and parameter mode
Support for configurable data and parameter bit widths
Support for parameter compression and parameter sparsity
Support for single-channel (grayscale) and three-channel (RGB format) input images
Support for image preprocessing (mean subtraction and pixel scaling)
Support for image batch processing
Support for reporting intermediate-layer results
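To make the stride, pad, and 2.0 TOPS figures above more concrete, here is a minimal Python sketch of the usual output-size arithmetic for convolution and pooling layers, plus a rough best-case time estimate for a single layer. It is generic layer arithmetic and not the NNIE API. The function names are hypothetical, the "1 MAC = 2 operations" convention is an assumption, and the time is only a theoretical lower bound at peak rate, not a measurement. Some frameworks (Caffe pooling among them) round the output size up, which is what `ceil_mode` models.

```python
import math

PEAK_OPS_PER_S = 2.0e12  # NNIE peak from the article: 2.0 TOPS


def out_size(in_size, kernel, stride=1, pad=0, ceil_mode=False):
    """Output width or height of a convolution or pooling layer."""
    span = in_size + 2 * pad - kernel
    steps = math.ceil(span / stride) if ceil_mode else span // stride
    return steps + 1


def conv_macs(in_ch, out_ch, kernel, out_h, out_w):
    """Multiply-accumulate count of one N*N convolution layer."""
    return in_ch * out_ch * kernel * kernel * out_h * out_w


def ideal_time_ms(macs, peak_ops_per_s=PEAK_OPS_PER_S):
    """Lower bound on layer time, assuming 1 MAC = 2 ops at full peak rate."""
    return 2 * macs / peak_ops_per_s * 1e3


# AlexNet-style first layer: 227x227x3 input, 96 filters of 11x11, stride 4
side = out_size(227, kernel=11, stride=4)   # 55
macs = conv_macs(3, 96, 11, side, side)     # 105415200
print(side, macs, round(ideal_time_ms(macs), 4))
```

Real throughput depends on memory bandwidth, bit width, and how well the layer maps onto the engine, so the estimate is only useful for order-of-magnitude comparisons.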
In the HiSilicon SDK package, under the path Hi3519AV100R001C02SPC010\ReleaseDoc\zh\01.software\board\SVP, there are "HiSVP Development Guide.pdf" and "HiSVP API reference.pdf".
(Note: these materials must not be released casually. They may be shared only inside a company that has purchased Huawei HiSilicon Hi3519AV100 chips through an authorized agent, so I also stick to that principle.)
Figure-3 Introduction to Hi3519AV100 NNIE (1)
Figure-4 Introduction to Hi3519AV100 NNIE (2)
Figure-5 Hi3519AV100 NNIE development process
The three figures above show some of the background on deep learning algorithms. Deep learning model training and development are all done on a PC or with the help of the cloud, while the final program runs on the Hi3519AV100 SoC board. There is a great deal of material, so I won't reproduce it here.
Vector DSP integrated on the Hi3519AV100 board:
The vector DSP (Tensilica Vision P6 DSP @ 630MHz, 0.3 TOPS neural network computing performance) is a programmable processor dedicated to vision processing. It can be used to develop a series of basic operators for intelligent analysis algorithms as well as to implement complex algorithms. The Hi3519AV100 has 1 vector DSP.
DSP has the following main specification points:
Support scalar fixed-point and floating-point operations
Support vector fixed-point and (single-precision) floating-point operations
Support histogram statistical acceleration
Support for Gather/Scatter operation
ICache size is 32KB, but data DCache is not supported.
Support for 32KB-sized IRAM
Support for on-chip DRAM: DRAM0 and DRAM1 are 256KB each, 512KB in total
Support for 18 level-triggered interrupts
Support for Input Queue and Output Queue
Support for built-in IDMA to exchange data between on-chip DRAM and DDR
JTAG debugging is supported.
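Because the DSP has no data cache and works out of two 256KB on-chip DRAM banks fed by the IDMA, image data normally has to be processed in tiles. The Python sketch below only illustrates the sizing arithmetic. The double-buffering (ping-pong) scheme, the function name, and its parameters are assumptions for illustration and are not taken from the SDK documentation.

```python
DRAM_BANK_BYTES = 256 * 1024  # one bank (DRAM0 or DRAM1), per the list above


def rows_per_tile(width_px, bytes_per_px=1, buffers=2,
                  bank_bytes=DRAM_BANK_BYTES):
    """Largest number of full image rows per tile when `buffers`
    equally sized tiles must fit in one on-chip DRAM bank."""
    return bank_bytes // (buffers * width_px * bytes_per_px)


# 8-bit grayscale rows, two buffers per bank (one filling, one in use)
print(rows_per_tile(1920))  # 1080p-wide rows
print(rows_per_tile(3840))  # 4K-wide rows
```

A real kernel would also need room for intermediate results and filter borders, so the usable tile height would be smaller than these figures.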
DPU is integrated on the Hi3519AV100 board:
The DPU (Depth Process Unit) computes a depth map by rectifying and matching the input left and right images.
DPU has the following main specification points:
Supports rectification and matching, which can be used together or separately
Supports simultaneous rectification of the left and right images
Supports a maximum resolution of 1080p
Supports a maximum search disparity of 224
Supports a configurable starting disparity in the range 0 to 64
Only single-component input is supported
Supports sub-pixel depth map output
Only the right image is supported as the reference image, with the left image as the search image
Matching supports a left image that is wider than the right image
Rectification supports left and right images of different resolutions, as well as different input and output resolutions
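The link between the disparity values in the list above and actual distance is the standard rectified-stereo relation Z = f * B / d. The Python sketch below shows it with made-up camera parameters: the 1000-pixel focal length and 0.1 m baseline are hypothetical, and the function is generic stereo geometry, not part of the HiDPU API.

```python
def depth_m(disparity_px, focal_px, baseline_m):
    """Depth of a point from its disparity in a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


FOCAL_PX = 1000.0   # hypothetical focal length in pixels
BASELINE_M = 0.10   # hypothetical distance between the two cameras

# The largest disparity the matcher can find sets the nearest measurable depth.
print(round(depth_m(224, FOCAL_PX, BASELINE_M), 3))
# Sub-pixel disparities matter most for far objects, where d is small.
print(round(depth_m(10.5, FOCAL_PX, BASELINE_M), 3))
```

With these assumed parameters, a disparity of 224 corresponds to a nearest depth of a bit under half a metre. A wider baseline or longer focal length pushes that limit further out.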
IVE is integrated on the Hi3519AV100 board:
The IVE (Intelligent Video Engine) module provides a series of basic operators used in intelligent analysis algorithms, plus some time-consuming special-purpose functions. It is the hardware acceleration module of an intelligent analysis system. It supports the IVE 2.1 operator set, with hardware acceleration for many operators such as feature point detection, optical flow, and morphological processing.
The IVE module supports the following features:
DMA: supports direct copy, interval copy, memory filling.
Filter: supports 5x5 template filtering.
CSC: support YUV2RGB, YUV2HSV, YUV2LAB, RGB2YUV color space conversion.
FilterAndCSC: supports the composite function of 5x5 template filtering and CSC.
Sobel: 5x5 template Sobel-like gradient calculation is supported.
MagAndAng / Canny: supports 5x5 template gradient magnitude and angle calculation, and Canny edge extraction.
Erode: supports 5x5 template erosion.
Dilate: supports 5x5 template dilation.
Thresh / Thresh_S16 / Thresh_U16: supports image thresholding.
And / Or / Xor: supports bitwise AND, OR, and XOR of two images.
Add / Sub: supports weighted addition and subtraction of two images.
Integ: supports integral image calculation.
Hist: supports histogram statistics.
Map: supports assigning values to an image through a 256-level lookup map.
16BitTo8Bit: supports linear conversion from 16-bit data to 8-bit data.
OrdStatFilter: supports order-statistic filtering (median, maximum, and minimum filtering).
NCC: supports calculating the normalized cross-correlation coefficient of two images of the same size.
CCL: supports connected-component labeling.
GMM: supports Gaussian mixture background modeling of grayscale and RGB images.
LBP: supports simple local binary pattern calculation.
NormGrad: supports normalized gradient calculation.
LKOpticalFlow: LK optical flow tracking is supported.
STCorner: ShiTomasi corner detection is supported.
GradFg: supports gradient foreground operation.
MatchBgModel / UpdateBgModel: supports background matching and background update.
ANN_MLP_Predict: supports ANN_MLP prediction.
SVM_Predict: supports SVM prediction.
SAD: supports block-wise calculation of the sum of absolute pixel differences between two images.
Resize: supports bilinear and area-based image scaling.
GMM2: supports fast Gaussian mixture background modeling of grayscale and RGB images.
CNN_Predict: supports convolution neural network computing.
A separate soft reset is supported.
Supports SP400, SP420 (semi-planar 420), SP422 (semi-planar 422), package, planar, and other input formats.
Supports SP400, SP420, SP422, package, planar, and other output formats.
Some operators support non-16-byte alignment of read and write addresses.
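For readers who have not used these operators, the Python sketch below is a plain software reference for what an integral image (the Integ operator in the list above) computes, and why it is useful: any rectangular sum then costs four lookups. It illustrates the math only. It is not the HiIVE API, and the hardware's actual output layout is not shown here.

```python
def integral_image(img):
    """Return an (h+1) x (w+1) table where ii[y][x] is the sum of
    img[0:y][0:x]; the extra zero row and column simplify lookups."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, x0, y0, x1, y1):
    """Sum of pixels with x0 <= x < x1 and y0 <= y < y1, in O(1)."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
print(box_sum(ii, 0, 0, 3, 3))  # whole image
print(box_sum(ii, 1, 1, 3, 3))  # bottom-right 2x2 block
```

A software reference like this is handy on the PC side for checking results produced on the board against known values.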
In the HiSilicon SDK package, under the path Hi3519AV100R001C02SPC010\ReleaseDoc\zh\01.software\board\SVP, there is a "HiIVE API reference.pdf". Decompressing HiIVE_PC_V2.1.0.7_64bit.tar.gz from SVP_PC.rar gives the "HiIVE tool usage Guide.pdf".
(Note: these materials must not be released casually. They may be shared only inside a company that has purchased Huawei HiSilicon Hi3519AV100 chips through an authorized agent, so I also stick to that principle.)
Introduction to Hi3519AV100 video input interfaces:
Supports 12-lane image sensor serial input over MIPI, sub-LVDS, HiSPI, and SLVS-EC interfaces
Supports up to 5 sensor serial inputs
Supports multiple lane combinations such as 12-lane, 8-lane + 4-lane, and 4-lane + 4 x 2-lane
Maximum input resolution: 7680x4320
Supports 10/12/14-bit Bayer RGB DC timing video input, as well as BT.656 and BT.1120 video input
Supports input of 1 to 4 YUV channels through MIPI virtual channels
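The lane figures above amount to a simple budget: 12 lanes in total, shared by at most 5 sensors. The Python sketch below checks a proposed sensor layout against that budget. It is a sanity check under those two limits only. The function name is hypothetical, and the datasheet may restrict which lane groupings are actually allowed, so passing this check does not prove a layout is supported.

```python
MAX_LANES = 12    # total serial input lanes, per the list above
MAX_SENSORS = 5   # maximum number of sensor inputs, per the list above


def fits_lane_budget(lanes_per_sensor):
    """True if the layout stays within the total lane and sensor counts."""
    return (0 < len(lanes_per_sensor) <= MAX_SENSORS
            and all(n > 0 for n in lanes_per_sensor)
            and sum(lanes_per_sensor) <= MAX_LANES)


print(fits_lane_budget([12]))             # one 12-lane sensor
print(fits_lane_budget([8, 4]))           # 8-lane + 4-lane
print(fits_lane_budget([4, 2, 2, 2, 2]))  # 4-lane + 4 x 2-lane
print(fits_lane_budget([2] * 6))          # six sensors: over the limit
```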
Introduction to Hi3519AV100 ISP and image processing:
ISP supports time division multiplexing and can handle multi-channel sensor input video.
Support 3A (AE/AWB/AF) function, and 3A parameters can be adjusted by users.
Support for fixed-pattern noise (FPN) removal
Supports two-frame exposure WDR and Local Tone Mapping, as well as strong light suppression and backlight compensation
Support for defective pixel correction and lens shading correction
Support for multi-level 3D denoising, giving excellent low-light image quality and removing motion trails and color noise
Support 3D-LUT color adjustment
Support image dynamic contrast enhancement and edge enhancement processing
Support for chromatic aberration correction (CAC) and purple fringing removal
Support defogging
Support for 6-DoF digital image stabilization and rolling-shutter correction
Support geometric correction of lens distortion and fisheye correction
Supports 90 degree / 270 degree rotation of the image
Support for image Mirror and Flip
Support for multiple scaled outputs, with a scaling factor of 1/15.5 to 16x
Support for OSD overlay before encoding, with up to 8 regions
Provide PC-side ISP adjustment tool.
Introduction to other functions of Hi3519AV100
Support for H.264/H.265 encoding and decoding
Support for HDMI2.0 output interface
Support for video stitching hardware acceleration engine
Support for an audio interface with integrated audio codec and 16-bit audio input and output
Support for audio encoding and decoding
Support for a Gigabit Ethernet port
Hi3519AV100 power consumption
TSMC 12nm process; (this 12nm process is really good)
Typical scenario (4K x 2K (3840 x 2160) @ 30fps encoding + neural network algorithm) power consumption: 1.9W
Support for multi-level power saving modes
Maximum power consumption scenario: ambient temperature 70 °C, junction temperature 110 °C, running 4K + neural network with all other modules turned on: the power consumption is 2.9W, but this rarely happens.
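The power and compute figures quoted in this article can be combined into a rough efficiency number. The Python sketch below does only that arithmetic. Note the assumptions: the wattage covers the whole SoC (encoding included), and the TOPS value is a peak figure, so the ratio is an optimistic ballpark and not a benchmark result. The function names are hypothetical.

```python
NNIE_TOPS = 2.0   # peak NNIE performance quoted in the article
TYPICAL_W = 1.9   # typical scenario: 4K30 encoding + neural network
MAX_W = 2.9       # worst-case scenario with all modules running


def tops_per_watt(tops, watts):
    """Peak compute divided by whole-chip power."""
    return tops / watts


def energy_wh(watts, hours):
    """Energy used at a constant power draw."""
    return watts * hours


print(round(tops_per_watt(NNIE_TOPS, TYPICAL_W), 2))  # typical case
print(round(tops_per_watt(NNIE_TOPS, MAX_W), 2))      # worst case
print(round(energy_wh(TYPICAL_W, 24), 1))             # Wh per day, typical
```

The same two functions make it easy to compare against other boards, as long as the figures being compared are measured the same way.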
Hi3519AV100 operating temperature
Temperature range: -25 °C to 70 °C, commercial grade, according to our company's own high- and low-temperature tests.
Hi3519AV100 SDK package
The Hi3519AV100 SDK package needs to be installed in a 64-bit Ubuntu 16.04 LTS environment. The SDK package includes a few simple instructions on how to install it, so I won't go into detail here. However, a reminder for anyone who has never installed a HiSilicon SDK package: on 64-bit Ubuntu 16.04 LTS, the following errors come up, and each one is fixed by the command shown below it:
Log in to Ubuntu as root.
1. ./sdk.unpack: run_command_progress_float: not found
# dpkg-reconfigure dash (select No, so that /bin/sh points to bash instead of dash)
2. /bin/sh: /opt/hisi-linux/x86-arm/arm-himix200-linux/bin/arm-himix200-linux-size: No such file or directory
# apt-get install lib32z1-dev
3. "mkimage" command not found - U-Boot images will not be built
# apt-get install u-boot-tools
4. /usr/bin/ld: cannot find -lncurses
# apt-get install libncurses5-dev
5. makeinfo: command not found
# apt-get install texinfo
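The error-to-fix pairs above are easy to turn into a small lookup that scans a build log line and suggests the matching command. The Python sketch below does that with the five pairs from this article. The helper name `suggest_fix` is hypothetical, the matching is a plain substring search, and the script only prints suggestions, it does not run anything.

```python
# (error text to look for, command suggested in the article)
FIXES = [
    ("run_command_progress_float: not found",
     "dpkg-reconfigure dash  # then select No"),
    ("arm-himix200-linux-size: No such file or directory",
     "apt-get install lib32z1-dev"),
    ('"mkimage" command not found',
     "apt-get install u-boot-tools"),
    ("cannot find -lncurses",
     "apt-get install libncurses5-dev"),
    ("makeinfo: command not found",
     "apt-get install texinfo"),
]


def suggest_fix(log_line):
    """Return the suggested command for a known error line, else None."""
    for needle, command in FIXES:
        if needle in log_line:
            return command
    return None


print(suggest_fix("/usr/bin/ld: cannot find -lncurses"))
print(suggest_fix("make: nothing to be done"))
```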
Finally, a word about SVP_PC in the Hi3519AV100 SDK package:
1) HiIVE_PC_Vx.x.x.x.rar
IVE PC-side component package, including the tool directory and the document "HiIVE tool usage Guide.pdf"
2) HiSVP_PC_Vx.x.x.x.rar
SVP PC-side component package, including the include, lib, sample, and tool directories
3) HiDPU_PC_Vx.x.x.x.rar
DPU PC-side component package, including the tool directory
Above are our company's own Hi3519AV100 core board and development board. They are sold on Taobao. Searching for Hi3519AV100 should find them, or use this link: https://shop472233692.taobao.com/?spm=2013.1.0.0.304c52653E7qVS. My contact is QQ: 2505133162, online Monday to Saturday. In short, working with the Hi3519AV100 takes some skill, and so does developing the classic Caffe deep learning algorithms on the PC side. The NNIE development guide material in figure-3, figure-4 and figure-5 touches on many knowledge points. As mentioned above, the Hi3519AV100 can be used not only for face recognition but also for gesture recognition products. I hope more algorithm software companies will build on these embedded AI platforms and bring more recognition algorithms to the edge to serve ordinary people. One example is combating the abduction and trafficking of children: combining face recognition, gesture recognition, behavior analysis, and cloud processing would leave criminals nowhere to hide. Embedded AI chips will only become more powerful, and chips delivering 10 TOPS of computing power at 1W should arrive soon.
These days (2019-05-25) the trade war is very heated, which hurts technology companies like ours badly, and survival is the main goal in the short term. Fortunately, HiSilicon has obtained permanent licenses for ARM's ARMv7 and ARMv8 architectures, and TSMC also continues to support these big customers, so there should be no impact on the Hi3519AV100. Of course, our company does not design embedded AI products only with HiSilicon chips. We also build products with Intel Movidius chips, and will build on other platforms later. Within our own capabilities, we choose whichever platform is easy to use, has accessible documentation, and offers the best price-performance ratio. From the introduction above, the Hi3519AV100's price-performance ratio is really good.