What is NanoDet?

2025-02-28 Update From: SLTechnology News&Howtos


This article introduces NanoDet: what the model is, how it performs, and how to run it with PyTorch.

NanoDet is an ultra-fast, lightweight, anchor-free object detection model for mobile devices.

Preface

Models such as YOLO, SSD, and Fast R-CNN offer good speed and accuracy for object detection, but they are relatively large and not well suited to porting to mobile or embedded devices. In the lightweight NanoDet-m model, all three modules of the single-stage detector (head, neck, backbone) are lightweight, so detection is very fast and the model file is only a few megabytes (less than 4 MB).

The NanoDet author's open-source code: https://github.com/RangiLyu/nanodet (salute)

A lightly trimmed version of the NanoDet project, limited to the Python and PyTorch code: https://github.com/guo-pu/NanoDet-PyTorch

It can be used directly after download and supports real-time object detection on images, video files, and a camera.

First, a look at NanoDet's detection results:

Detect multiple cars at the same time:

View the detection results of multiple targets, overlapping targets, and both small and large targets:

Introduction of NanoDet Model

NanoDet is an FCOS-style, single-stage, anchor-free object detection model. It uses ATSS for sample assignment and the Generalized Focal Loss for classification and bounding-box regression.
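To make "anchor-free" more concrete, here is a minimal sketch of how an FCOS-style detector can turn a feature-map location plus four predicted side distances into a box. The function names, the half-cell center offset, and the sample numbers are assumptions chosen for illustration; they are not taken from the NanoDet code.

```python
def grid_centers(feat_w, feat_h, stride):
    """Image-space centers of every cell on a feature map.

    Assumption: each cell's center sits at (index + 0.5) * stride.
    """
    return [((x + 0.5) * stride, (y + 0.5) * stride)
            for y in range(feat_h) for x in range(feat_w)]


def decode_box(cx, cy, left, top, right, bottom):
    """Convert a point and its distances to the four box sides into (x1, y1, x2, y2)."""
    return (cx - left, cy - top, cx + right, cy + bottom)


# Hypothetical 2x2 feature map with stride 8 and made-up distances in pixels.
centers = grid_centers(feat_w=2, feat_h=2, stride=8)
box = decode_box(*centers[0], left=3.0, top=2.0, right=5.0, bottom=4.0)
print(centers[0], box)
```

No anchor boxes appear anywhere: every location predicts its own box directly, which is what keeps the detection head small.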

1) NanoDet model performance

The NanoDet-m model is compared with YOLOv3-Tiny and YOLOv4-Tiny.

Note: these figures were measured with ncnn on a Kirin 980 (4xA76+4xA55) ARM CPU. COCO mAP is the evaluation metric, which accounts for both detection and localization accuracy; it was measured on the 5000 COCO val images without test-time augmentation.

The NanoDet author deployed the model to a phone with ncnn (Kirin 980, an ARM CPU with 4 A76 cores and 4 A55 cores) and ran a benchmark. The model's forward pass takes only about 10 milliseconds, whereas YOLOv3-Tiny and YOLOv4-Tiny both take on the order of 30 milliseconds. In the Android camera demo app, NanoDet easily reaches 40+ FPS, including the time spent on image preprocessing, post-processing of the detection boxes, and drawing them.

2) NanoDet model architecture

3) NanoDet loss function

NanoDet uses the Generalized Focal Loss proposed by Li Xiang et al. This loss makes it possible to remove the centerness branch of FCOS and the many convolutions on that branch, which reduces the computational cost of the detection head and suits lightweight deployment on mobile devices.

For more information, please refer to Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection
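As a rough illustration of the "distributed bounding boxes" idea in that paper, each box side can be represented as a discrete probability distribution over distance bins, and the predicted distance is the expected value of that distribution. The sketch below uses plain Python; the bin count of 8 and the logit values are assumptions for demonstration, not NanoDet's actual configuration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_distance(logits):
    """Expected bin index under softmax(logits): sum over i of P(i) * i."""
    return sum(i * p for i, p in enumerate(softmax(logits)))


# Hypothetical logits over 8 bins (indices 0..7) for one side of a box.
peaked = [0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0, 0.0]   # confident: mass near bin 3
uniform = [0.0] * 8                                   # uncertain: flat distribution
print(expected_distance(peaked), expected_distance(uniform))
```

A sharply peaked distribution yields a distance close to its peak bin, while a flat one falls back to the middle of the range, so the shape of the distribution carries information about how certain the localization is.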

4) NanoDet advantage

NanoDet is an ultra-fast, lightweight, anchor-free object detection model for mobile devices. It has the following advantages:

Ultra-lightweight: the model file is only a few megabytes (nanodet_m.pth is less than 4 MB)

Ultra-fast: 97 FPS (10.23 ms) on a mobile ARM CPU

Training-friendly: GPU memory cost is much lower than that of other models; a batch size of 80 runs on a GTX 1060 6G

Easy to deploy: a C++ implementation and an Android demo based on the ncnn inference framework are provided
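The speed figures quoted in this article are two views of the same quantity: latency per frame and frames per second. A small conversion sketch (the helper names are illustrative):

```python
def fps_from_latency(latency_ms):
    """Frames per second achievable if each frame takes latency_ms milliseconds."""
    return 1000.0 / latency_ms


def latency_from_fps(fps):
    """Per-frame time budget in milliseconds for a given frame rate."""
    return 1000.0 / fps


# 10.23 ms per forward pass, truncated to whole frames, gives the quoted 97 FPS.
print(int(fps_from_latency(10.23)))
# A 40 FPS camera demo leaves 25 ms per frame for inference plus pre/post-processing.
print(latency_from_fps(40))
```

This also shows why the Android demo figure (40+ FPS) is lower than the benchmark figure: the demo's per-frame time includes preprocessing, post-processing, and drawing in addition to the forward pass.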

Implementation of NanoDet based on PyTorch

This section uses the lightly trimmed Python/PyTorch version of the NanoDet project (the guo-pu/NanoDet-PyTorch repository linked in the preface).

1) NanoDet detection results

Four teenagers detected at the same time:

Pedestrians and cars detected on a busy street:

Testing shows that NanoDet is indeed very fast, but its recognition accuracy and overall results are much worse than YOLOv4's.

2) Environment

Test environment:

System: Windows
Programming language: Python 3.8
Environment manager: Anaconda
Deep learning framework: PyTorch 1.7.0+cu101 (torch >= 1.3 is sufficient)
IDE: PyCharm

The specific development environment requirements are as follows:

cython
termcolor
numpy
torch >= 1.3
torchvision
tensorboard
pycocotools
matplotlib
pyaml
opencv-python
tqdm
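Before running the project, it can help to check which of these packages are importable in the selected interpreter. The sketch below uses only the standard library; the mapping from package names to import names is an assumption based on common usage (for example, opencv-python is imported as cv2), and the script does not verify version constraints such as torch >= 1.3.

```python
import importlib.util

# Assumed mapping: pip package name -> top-level import name.
REQUIRED = {
    "cython": "Cython",
    "termcolor": "termcolor",
    "numpy": "numpy",
    "torch": "torch",
    "torchvision": "torchvision",
    "tensorboard": "tensorboard",
    "pycocotools": "pycocotools",
    "matplotlib": "matplotlib",
    "pyaml": "pyaml",
    "opencv-python": "cv2",
    "tqdm": "tqdm",
}


def missing_packages(required):
    """Return the package names whose import module cannot be found."""
    return [pkg for pkg, module in required.items()
            if importlib.util.find_spec(module) is None]


print("Missing:", missing_packages(REQUIRED))
```

An empty list means every module was found; anything listed still needs to be installed into the environment that PyCharm uses.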

In practice, the hardest parts to install are GPU acceleration (graphics card driver, cudatoolkit, cudnn), PyTorch, and pycocotools.
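Because the detect_main.py listing in this article passes device='cuda:0' unconditionally, a quick check of whether PyTorch can see a GPU is worthwhile once installation is done. This is a minimal sketch; pick_device is a hypothetical helper, not part of the project.

```python
def pick_device():
    """Return 'cuda:0' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch  # imported lazily so the check also runs before PyTorch is installed
    except ImportError:
        return "cpu"
    return "cuda:0" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

If this prints cpu on a machine that has an NVIDIA card, the driver, cudatoolkit, or the PyTorch build is the likely place to look.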

For installing the development environment on Windows, please refer to:

To install cudatoolkit 10.1 and cudnn 7.6, please refer to https://blog.csdn.net/qq_41204464/article/details/108807165

To install PyTorch, please refer to https://blog.csdn.net/u014723479/article/details/103001861

To install pycocotools, please refer to https://blog.csdn.net/weixin_41166529/article/details/109997105

3) Trying out NanoDet object detection

Download the code and open the project

First download the code from GitHub, extract it, and then open the project in PyCharm.

GitHub download address: https://github.com/guo-pu/NanoDet-PyTorch

Note: this code is a lightly trimmed version of the NanoDet project, limited to the Python and PyTorch code.

The NanoDet author's open-source code: https://github.com/RangiLyu/nanodet (salute)

Open the project in PyCharm

[Select the development environment]

File --> Settings --> Project --> Project Interpreter, then select the development environment

Click Apply, wait for loading to finish, and then click OK

Run object detection

The commands are as follows:

    '''Object detection - image'''
    python detect_main.py image --config ./config/nanodet-m.yml --model model/nanodet_m.pth --path street.png

    '''Object detection - video file'''
    python detect_main.py video --config ./config/nanodet-m.yml --model model/nanodet_m.pth --path test.mp4

    '''Object detection - camera'''
    python detect_main.py webcam --config ./config/nanodet-m.yml --model model/nanodet_m.pth --path 0
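To see how these command lines are interpreted, the sketch below rebuilds the argument parser from the detect_main.py listing in this article and parses the camera command without running any detection. The build_parser wrapper is added here for illustration only.

```python
import argparse


def build_parser():
    """Mirror of the arguments declared in detect_main.py's parse_args()."""
    parser = argparse.ArgumentParser()
    parser.add_argument('demo', default='image',
                        help='demo type, eg. image, video and webcam')
    parser.add_argument('--config', help='model config file path')
    parser.add_argument('--model', help='model file path')
    parser.add_argument('--path', default='./demo', help='path to images or video')
    parser.add_argument('--camid', type=int, default=0, help='webcam demo camera id')
    return parser


args = build_parser().parse_args(
    ['webcam', '--config', './config/nanodet-m.yml',
     '--model', 'model/nanodet_m.pth', '--path', '0'])
print(args.demo, repr(args.path), args.camid)
```

Note that --path is parsed as a string, while the listing opens the camera with args.camid in webcam mode. Going by that listing, selecting a different camera would mean passing --camid rather than changing --path.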

[Object detection - image]

[Object detection - video file]

The test input was 1080x1920. Detection ran smoothly without stuttering, but recognition accuracy is currently not very high.
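"Smooth" can be put into numbers by timing the per-frame processing loop. The following is a generic sketch using only the standard library; measure_fps and fake_inference are hypothetical names, and the sleep call merely stands in for a real call such as the predictor's inference method.

```python
import time


def measure_fps(process_frame, frames):
    """Run process_frame on every frame and return the average frames per second."""
    start = time.perf_counter()
    count = 0
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float('inf')


def fake_inference(frame):
    """Placeholder workload (assumption): pretend each frame takes about 1 ms."""
    time.sleep(0.001)


print("{:.1f} FPS".format(measure_fps(fake_inference, range(20))))
```

When timing a model that runs on a GPU, calling torch.cuda.synchronize() before reading the clock is generally advisable, since CUDA kernels execute asynchronously and the measured time can otherwise be too optimistic.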

4) Core code for calling the model

detect_main.py code:

    import cv2
    import os
    import time
    import torch
    import argparse
    from nanodet.util import cfg, load_config, Logger
    from nanodet.model.arch import build_model
    from nanodet.util import load_model_weight
    from nanodet.data.transform import Pipeline

    image_ext = ['.jpg', '.jpeg', '.webp', '.bmp', '.png']
    video_ext = ['mp4', 'mov', 'avi', 'mkv']

    '''Object detection - image'''
    # python detect_main.py image --config ./config/nanodet-m.yml --model model/nanodet_m.pth --path street.png

    '''Object detection - video file'''
    # python detect_main.py video --config ./config/nanodet-m.yml --model model/nanodet_m.pth --path test.mp4

    '''Object detection - camera'''
    # python detect_main.py webcam --config ./config/nanodet-m.yml --model model/nanodet_m.pth --path 0


    def parse_args():
        parser = argparse.ArgumentParser()
        parser.add_argument('demo', default='image', help='demo type, eg. image, video and webcam')
        parser.add_argument('--config', help='model config file path')
        parser.add_argument('--model', help='model file path')
        parser.add_argument('--path', default='./demo', help='path to images or video')
        parser.add_argument('--camid', type=int, default=0, help='webcam demo camera id')
        args = parser.parse_args()
        return args


    class Predictor(object):
        def __init__(self, cfg, model_path, logger, device='cuda:0'):
            self.cfg = cfg
            self.device = device
            # Build the network from the config, then load the trained weights.
            model = build_model(cfg.model)
            ckpt = torch.load(model_path, map_location=lambda storage, loc: storage)
            load_model_weight(model, ckpt, logger)
            self.model = model.to(device).eval()
            self.pipeline = Pipeline(cfg.data.val.pipeline, cfg.data.val.keep_ratio)

        def inference(self, img):
            # img is either a file path or an already decoded frame.
            img_info = {}
            if isinstance(img, str):
                img_info['file_name'] = os.path.basename(img)
                img = cv2.imread(img)
            else:
                img_info['file_name'] = None

            height, width = img.shape[:2]
            img_info['height'] = height
            img_info['width'] = width
            meta = dict(img_info=img_info, raw_img=img, img=img)
            meta = self.pipeline(meta, self.cfg.data.val.input_size)
            # HWC -> CHW, add a batch dimension, and move to the target device.
            meta['img'] = torch.from_numpy(meta['img'].transpose(2, 0, 1)).unsqueeze(0).to(self.device)
            with torch.no_grad():
                results = self.model.inference(meta)
            return meta, results

        def visualize(self, dets, meta, class_names, score_thres, wait=0):
            time1 = time.time()
            self.model.head.show_result(meta['raw_img'], dets, class_names, score_thres=score_thres, show=True)
            print('viz time: {:.3f}s'.format(time.time() - time1))


    def get_image_list(path):
        # Recursively collect files whose extension is in image_ext.
        image_names = []
        for maindir, subdir, file_name_list in os.walk(path):
            for filename in file_name_list:
                apath = os.path.join(maindir, filename)
                ext = os.path.splitext(apath)[1]
                if ext in image_ext:
                    image_names.append(apath)
        return image_names


    def main():
        args = parse_args()
        torch.backends.cudnn.enabled = True
        torch.backends.cudnn.benchmark = True

        load_config(cfg, args.config)
        logger = Logger(-1, use_tensorboard=False)
        predictor = Predictor(cfg, args.model, logger, device='cuda:0')
        logger.log('Press "Esc", "q" or "Q" to exit.')
        if args.demo == 'image':
            if os.path.isdir(args.path):
                files = get_image_list(args.path)
            else:
                files = [args.path]
            files.sort()
            for image_name in files:
                meta, res = predictor.inference(image_name)
                predictor.visualize(res, meta, cfg.class_names, 0.35)
                ch = cv2.waitKey(0)
                if ch == 27 or ch == ord('q') or ch == ord('Q'):
                    break
        elif args.demo == 'video' or args.demo == 'webcam':
            cap = cv2.VideoCapture(args.path if args.demo == 'video' else args.camid)
            while True:
                ret_val, frame = cap.read()
                # Added guard: stop when the video ends or no frame can be read.
                if not ret_val:
                    break
                meta, res = predictor.inference(frame)
                predictor.visualize(res, meta, cfg.class_names, 0.35)
                ch = cv2.waitKey(1)
                if ch == 27 or ch == ord('q') or ch == ord('Q'):
                    break


    if __name__ == '__main__':
        main()

That is everything in the article "What is NanoDet?". Thank you for reading!
