2025-04-10 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)11/24 Report--
Thanks to Mr. Air, a CTOnews.com reader, for the tip! CTOnews.com, March 14: SenseTime (Shangtang Technology) today released "Scholar (INTERN) 2.5", a multimodal, multi-task general-purpose large model with 3 billion parameters. SenseTime claims it is the largest and most accurate open-source model on ImageNet worldwide, and the only model to exceed 65.0 mAP on COCO, the benchmark dataset for object detection.
According to reports, the cross-modal image-text capability of "Scholar 2.5" for handling open-ended tasks can provide efficient and accurate perception and understanding for general-scene tasks such as autonomous driving and robotics. "Scholar" was first released jointly in November 2021 by SenseTime, Shanghai Artificial Intelligence Laboratory, Tsinghua University, the Chinese University of Hong Kong, and Shanghai Jiao Tong University, which have continued to develop it together.
In terms of improvements, "Scholar 2.5" defines tasks through text, so the task requirements of different scenes can be specified flexibly: given a visual image and a text prompt describing the task, the model returns the corresponding instructions or answers. This gives it advanced perception and complex problem-solving abilities in general scenes, including image captioning, visual question answering, visual reasoning, and text recognition.
In common scenarios such as autonomous driving and home robots, "Scholar 2.5" can assist with a variety of complex tasks.
In autonomous driving, for example, it can greatly improve scene perception and understanding, help vehicles accurately judge traffic light status, road signs, and other information, and provide useful input for vehicle decision-making and planning.
▲ Using the multimodal, multi-task general large model to complete a variety of complex tasks in autonomous driving scenes.
▲ Using the multimodal, multi-task general large model to assist with a variety of complex tasks in home robot scenes.
Beyond complex problems in fields such as autonomous driving and home robots, the "Scholar 2.5" general large model can also handle the many varied tasks of everyday life and meet a range of needs.
Beyond generating text from a whole image, the "Scholar 2.5" general large model can also target task requirements more precisely based on object bounding boxes.
"Scholar 2.5" also has the AIGC capability of generating images from text. Given a user's text prompt, it can use a diffusion model generation algorithm to produce high-quality, natural, realistic images.
For example, "Scholar 2.5" can support autonomous driving research and development by generating all kinds of realistic road traffic scenes, such as busy urban streets, crowded lanes on rainy days, and dogs running across the road. These serve as realistic corner-case training data, which can then be used to raise the upper limit of an autonomous driving system's perception in corner-case scenes.
"Scholar 2.5" can also quickly retrieve visual content based on text.
For example, it can return the images in an album that match a text description, or retrieve the video frame most relevant to a text description, making temporal localization tasks in video more efficient. It also supports object bounding boxes, returning the objects most relevant to a given text, which enables open-world object detection and visual grounding in videos or images.
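The text-to-frame retrieval described above can be illustrated with a minimal sketch. It assumes that the text query and each video frame have already been mapped into a shared embedding space and that relevance is measured by cosine similarity. This is a common approach to cross-modal retrieval, not a confirmed description of how "Scholar 2.5" works. The function names and the toy vectors are hypothetical, and the encoders that would produce real embeddings are not shown.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_frames(text_embedding, frame_embeddings):
    """Return (frame_index, score) pairs sorted from most to least similar."""
    scored = [
        (index, cosine_similarity(text_embedding, frame))
        for index, frame in enumerate(frame_embeddings)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy embeddings; a real system would obtain these from
# a text encoder and an image encoder trained into a shared space.
text_embedding = [0.9, 0.1, 0.0]
frame_embeddings = [
    [0.0, 1.0, 0.0],  # frame 0
    [1.0, 0.0, 0.0],  # frame 1
    [0.5, 0.5, 0.0],  # frame 2
]

ranking = rank_frames(text_embedding, frame_embeddings)
best_frame_index = ranking[0][0]
print(best_frame_index)
```

With these toy vectors, frame 1 ranks first because its direction is closest to that of the text embedding. The same ranking step would apply to images in an album instead of video frames.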
The "Scholar 2.5" multimodal general large model is now open source on OpenGVLab, a general vision open-source platform in which SenseTime participates. CTOnews.com has attached a link to the GitHub repository.