2025-02-22 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 11/24 Report --
Original title: "UCSD, MIT and Other Chinese Teams Teach a Robot Dog to Perceive the 3D World! With an M1 Chip, Climbing Stairs and Crossing Obstacles Is No Problem | CVPR 2023"
Have you ever seen a robot dog that thinks with an Apple M1 chip and walks on its own?
Recently, researchers from UCSD, IAIFI, and MIT used a new Neural Volumetric Memory (NVM) architecture to teach a robot dog to perceive the three-dimensional world.
With this technology, the robot dog can climb stairs, cross gaps, clamber over obstacles, and more, all through a single neural network, fully autonomously and with no remote control.
Did you notice the white box on the dog's back?
It houses an Apple M1 chip, which handles the robot dog's visual processing. What's more, the team pulled the chip out of a Mac.
As you can see, MIT's robot dog climbs over the branch in front of it with (essentially) no effort at all.
A MacBook with four legs?
It is well known that robot dogs and other legged robots struggle to traverse uneven terrain.
The more complex the terrain, the more obstacles are hidden from view.
To cope with this "partially observable environment," current SOTA vision-based locomotion techniques simply stack consecutive frames along the image channel dimension (frame-stacking).
However, this naive approach lags far behind today's computer vision techniques, which can explicitly model optical flow and 3D geometry.
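Frame-stacking can be sketched in a few lines: the last k frames are simply concatenated along the channel axis before being fed to the policy network. This is a generic sketch of the technique, not the authors' exact code.

```python
import numpy as np

def stack_frames(frames):
    """Naive frame-stacking: concatenate the last k RGB frames
    along the channel axis, as in prior vision-locomotion pipelines."""
    # frames: list of (H, W, 3) arrays -> single (H, W, 3*k) array
    return np.concatenate(frames, axis=-1)

# Example: three 64x64 RGB frames become one 9-channel input.
obs = [np.zeros((64, 64, 3), dtype=np.float32) for _ in range(3)]
stacked = stack_frames(obs)
print(stacked.shape)  # (64, 64, 9)
```

Note that nothing here tells the network how the frames relate geometrically; each pixel channel is just "what the camera saw k steps ago," which is exactly the limitation NVM targets.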
Inspired by this, the team proposed the Neural Volumetric Memory (NVM) architecture, which fully accounts for the SE(3) equivariance of the three-dimensional world.
Project address: https://rchalyang.github.io/NVM/
Unlike previous methods, NVM is volumetric: it aggregates feature volumes from multiple camera views into the robot's egocentric frame, allowing the robot to better understand its surroundings.
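To illustrate the aggregation idea (not the paper's implementation), here is a toy numpy sketch in which integer voxel shifts stand in for the full SE(3) camera transforms and simple averaging stands in for learned fusion; all names and shapes are illustrative.

```python
import numpy as np

def aggregate_volumes(volumes, offsets):
    """Toy sketch of NVM-style aggregation: shift each past frame's
    feature volume into the current egocentric frame (integer voxel
    translations stand in for full SE(3) transforms), then fuse by
    averaging. Real NVM uses learned features and differentiable
    resampling."""
    aligned = []
    for vol, (dx, dy, dz) in zip(volumes, offsets):
        aligned.append(np.roll(vol, shift=(dx, dy, dz), axis=(0, 1, 2)))
    return np.mean(aligned, axis=0)

# Three 8x8x8 "feature volumes" from past frames, each with a toy offset.
vols = [np.ones((8, 8, 8)) * i for i in range(3)]
fused = aggregate_volumes(vols, [(0, 0, 0), (1, 0, 0), (2, 0, 0)])
print(fused.shape)  # (8, 8, 8)
```

The key design point the sketch preserves is that every past volume is first expressed in the robot's current frame before fusion, so the fused memory stays egocentric.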
Test results show that when the locomotion policy is trained with Neural Volumetric Memory (NVM), the robot's performance on complex terrain is significantly better than with previous techniques.
In addition, ablation experiments show that the content stored in the neural volumetric memory captures enough geometric information to reconstruct the 3D scene.
Real-world experiments
To verify the approach in real-world scenarios beyond simulation, the team ran experiments in both indoor and outdoor scenes.
When the robot dog finds an obstacle suddenly appearing in front of it, it simply chooses to walk around it.
Walking on rocky ground is no problem either, though it takes more effort than on flat ground.
Even obstacles that are large relative to its own body can be climbed over with some hard work.
With the previous perception-and-control technique, the dog's hind legs clearly misjudged the distance: one foot caught in the ditch and it fell, failing the attempt.
With the NVM proposed by MIT, the dog crosses the ditch steadily and succeeds.
With the previous technique, the dog's first step missed, its head hit the ground, and it failed.
With NVM, the dog walks smoothly through the obstacle matrix.
Volumetric memory for legged locomotion
Legged locomotion from an egocentric camera view is essentially a partially observed (Partially-Observed) control problem.
To make the control problem well-posed, the robot needs to aggregate information from previous frames and correctly infer the occluded terrain.
During locomotion, the camera mounted directly on the robot's chassis undergoes drastic, abrupt changes in position.
This makes it crucial to place each individual frame correctly when representing the sequence of images.
To this end, the team proposed Neural Volumetric Memory (NVM), which transforms a sequence of input visual observations into a 3D feature description of the scene, then outputs it.
Although the behavior-cloning objective alone is sufficient to produce a good policy, the translation and rotation equivariance of NVM automatically provides a separate, self-supervised learning objective for the neural volumetric memory.
Self-supervised learning: the research team trained a separate decoder that predicts the visual observation at a different frame from a single visual observation and the estimated transformation between the two frames.
As shown in the image above, the 3D scene surrounding the robot can be assumed to remain the same between frames. Because the camera looks forward, the feature volume of the previous frame can be transformed and used to predict subsequent images.
Visual reconstruction from the decoder: the first image shows the robot moving in the environment, the second is the input visual observation, and the third is the visual observation synthesized from the 3D feature volume and the estimated transformation.
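The self-supervised objective described above can be sketched as follows; the integer-shift warp, the sum-over-depth "decoder," and all shapes are stand-ins for the learned components, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def warp_volume(vol, shift):
    """Apply a (toy, integer) relative camera transform to the
    previous frame's feature volume."""
    return np.roll(vol, shift=shift, axis=(0, 1, 2))

def decode(vol):
    """Stand-in decoder: project the 3D feature volume to a 2D 'image'
    by summing over depth. The real decoder is a learned network."""
    return vol.sum(axis=0)

# Self-supervised objective: warp frame t-1's volume into frame t,
# decode it, and penalize the difference from the observation at t.
vol_prev = rng.random((4, 16, 16))   # (depth, H, W) feature volume
obs_t = rng.random((16, 16))         # observation at the next frame
pred = decode(warp_volume(vol_prev, (0, 1, 0)))
loss = float(np.mean((pred - obs_t) ** 2))
print(pred.shape)  # (16, 16)
```

In training, gradients from this reconstruction loss would flow back into the volume encoder, which is what forces the stored features to capture real 3D geometry.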
For the input visual observations, the research team applied heavy data augmentation to the images to improve the model's robustness.
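Augmentations of the kind commonly used for visual-policy robustness can be sketched like this (random shift plus brightness jitter); the paper's exact augmentation recipe may differ, so treat this as a generic example.

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(img):
    """Generic image augmentation: random pixel shift + brightness
    jitter, clipped back to the valid [0, 1] range."""
    # Random translation of up to 4 pixels in each direction.
    dx, dy = rng.integers(-4, 5, size=2)
    shifted = np.roll(img, shift=(dy, dx), axis=(0, 1))
    # Random brightness scaling in [0.8, 1.2].
    scale = rng.uniform(0.8, 1.2)
    return np.clip(shifted * scale, 0.0, 1.0)

img = rng.random((64, 64, 3))
aug = augment(img)
print(aug.shape)  # (64, 64, 3)
```

Applying a fresh random augmentation to every training observation prevents the policy from overfitting to exact pixel values.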
About the authors
Ruihan Yan
Ruihan Yan is a second-year doctoral student at the University of California, San Diego. Before that, he received a bachelor's degree in software engineering from Nankai University in 2019.
His research interests include reinforcement learning, machine learning, and robotics. Specifically, he wants to build agents that use information from different sources to make decisions.
Ge Yang
Ge Yang received a bachelor's degree in physics and mathematics from Yale University and a doctorate in physics from the University of Chicago. He is currently a postdoctoral fellow at the NSF Institute for Artificial Intelligence and Fundamental Interactions (IAIFI).
Ge Yang's research involves two related sets of problems. The first is improving learning by re-examining how we represent knowledge in neural networks and how knowledge transfers across distributions. The second is examining reinforcement learning through theoretical tools such as neural tangent kernels, non-Euclidean geometry, and Hamiltonian dynamics.
Xiaolong Wang
Xiaolong Wang is an assistant professor in the ECE department at the University of California, San Diego. He is a member of the robotics group at the NSF TILOS AI Institute.
He earned a doctorate in robotics from Carnegie Mellon University and did postdoctoral research at the University of California, Berkeley.
Reference:
https://rchalyang.github.io/NVM/
This article comes from the WeChat official account: Xin Zhiyuan (ID: AI_era).