Shulou(Shulou.com)11/24 Report--
Can the world a person is looking at be reconstructed from nothing but the reflections in their eyes? This sci-fi scenario has become reality in a new paper. By coincidence, the sixth season of Black Mirror was released on the same day.
"The only real journey of exploration is not to visit strange lands, but to observe the universe through other people's eyes." - Marcel Proust
Seeing the world through other people's eyes: this science-fiction, poetic (and scary) idea has come true!
Black Mirror, season 1: "The Entire History of You"
Now, the reflections in a person's eyes are all we need to reconstruct, in three dimensions, what that person is looking at.
Yes, this is straight out of Black Mirror.
Recently, a team from the University of Maryland proposed a new approach that uses portraits containing eye reflections to reconstruct, in 3D, scenes that were never directly captured by the camera.
Paper address: https://arxiv.org/abs/2306.09348
Project address: https://world-from-eyes.github.io/
Are the scenes from classic science fiction all coming true?
Reconstructing a radiance field from eye reflections sounds crazy, but it has a solid theoretical basis.
According to the authors, because the human eye is highly reflective, the reflections in the eyes alone are enough to reconstruct and render the 3D scene a person is looking at, given a sequence of frames that capture head movement.
Given that the concept is very "Black Mirror" and that the new season of Black Mirror went online just hours after the paper was published, one has to wonder whether the show's director noticed the paper too. (doge)
Black Mirror season 6 is now online.
As soon as the study was released, the internet erupted.
So, we've come this far?
Isn't this the scene from Ghost in the Shell in the 2000s? All these fictions have come true!
This is 100% Blade Runner. Give me a copy now.
Jules Verne's The Kip Brothers has come true!
Of course, some people find it creepy and say this technology must not be used for investigations, evidence collection, and the like.
Today we already have Varjo's eye-tracking cameras, Apple's Vision Pro, and other headsets. These devices can capture a great deal of footage, and combined with this new technique, countless more sci-fi scenes may soon come true.
Using the faint reflections of light on the human eye, the team developed a way to reconstruct the observed scene, which lies outside the camera's direct view, from a sequence of monocular images taken from a fixed camera position.
However, training a radiance field on the observed reflections alone is not enough, for several reasons: 1) inherent noise in corneal localization, 2) the complexity of the iris texture, and 3) the low resolution of the reflection captured in each image.
To address these challenges, the team introduces cornea pose optimization and iris texture decomposition into training, together with a radial texture regularization loss based on the structure of the human iris.
Unlike conventional neural field training, which requires a moving camera, their method keeps the camera at a fixed viewpoint and relies entirely on the movement of the subject.
Reconstructing a scene from eye reflections
The task is challenging because it is very difficult to estimate the eye pose accurately, and because the iris texture and the scene reflection are entangled.
To solve this problem, the authors jointly optimize the eye pose, the radiance field that represents the scene, and the observer's iris texture.
Specifically, there are three main contributions:
1. New 3D reconstruction
The paper proposes a new method for reconstructing the observer's 3D world from eye images, combining earlier foundational work with recent advances in neural rendering.
2. Radial prior on the iris
A radial prior for iris texture decomposition is introduced, which significantly improves the quality of the reconstructed radiance field.
3. Cornea pose optimization
A cornea pose optimization procedure is developed to reduce noise in the eye pose estimates and to overcome the particular difficulty of extracting features from human eyes.
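One plausible reading of the radial prior can be sketched as a smoothness penalty on the learned iris texture. The sketch below assumes the texture is sampled on a polar grid and penalizes change along the radial direction at a fixed angle. The grid layout, the function name `radial_prior_loss`, and the squared-difference form are illustrative assumptions, not the paper's actual loss.

```python
def radial_prior_loss(texture):
    """Mean squared difference between radially adjacent texture samples.

    texture[i][j] is a hypothetical grayscale intensity at radius bin i
    and angle bin j (a polar sampling of the 2D iris texture).
    A texture that does not change along the radius gives a loss of 0.
    """
    n_r = len(texture)
    if n_r < 2 or not texture[0]:
        return 0.0
    n_theta = len(texture[0])
    total = 0.0
    for i in range(n_r - 1):
        for j in range(n_theta):
            # Compare each sample with its neighbor one step further out
            # along the same angle.
            diff = texture[i + 1][j] - texture[i][j]
            total += diff * diff
    return total / ((n_r - 1) * n_theta)
```

In a training loop, a term like this would be weighted and added to the rendering loss, so that scene detail is explained by the radiance field rather than absorbed into the iris texture.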
The results show that with this new method, the subject's movement across the frame yields multiple views of the scene from the eye reflections, which ultimately allows a complete scene reconstruction.
Going a step further, the team also tried to reconstruct the scenes reflected in the eyes of Miley Cyrus and Lady Gaga from their music videos.
The authors say they successfully reconstructed an object that appears in Miley's eyes, and that they seem to see a person's upper body in Lady Gaga's eyes.
However, because the quality of these videos is not high enough, the accuracy of the reconstruction results cannot be determined.
Lady Gaga
Miley Cyrus
How is it done?
It is well established that corneal geometry is nearly identical across healthy adults.
Therefore, once the size of a person's cornea in pixels has been measured in the image, the position of their eye can be calculated.
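The reasoning here can be illustrated with a pinhole camera model: if the physical size of the cornea is roughly known, its apparent size in pixels determines its distance from the camera. The following is a minimal sketch under that assumption. The 5.5 mm radius, the function names, and the parameters are illustrative choices that do not come from the article.

```python
# Hypothetical average radius of the cornea's visible boundary, in mm.
# This value is an illustrative assumption, not a figure from the article.
ASSUMED_CORNEA_RADIUS_MM = 5.5


def cornea_depth_mm(focal_length_px, radius_px,
                    radius_mm=ASSUMED_CORNEA_RADIUS_MM):
    """Pinhole estimate of the camera-to-cornea distance in mm."""
    if radius_px <= 0:
        raise ValueError("radius_px must be positive")
    # Similar triangles: radius_px / focal_length_px = radius_mm / depth.
    return focal_length_px * radius_mm / radius_px


def cornea_center_mm(cx_px, cy_px, principal_x_px, principal_y_px,
                     focal_length_px, radius_px,
                     radius_mm=ASSUMED_CORNEA_RADIUS_MM):
    """Back-project the cornea center (in pixels) to camera coordinates."""
    z = cornea_depth_mm(focal_length_px, radius_px, radius_mm)
    x = (cx_px - principal_x_px) * z / focal_length_px
    y = (cy_px - principal_y_px) * z / focal_length_px
    return (x, y, z)
```

For example, with a focal length of 5000 pixels and a cornea radius of 55 pixels, the sketch places the cornea 500 mm from the camera.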
Next, the authors train a radiance field on the eye reflections by shooting rays from the camera and reflecting them off the approximated eye geometry.
To keep the iris from being baked into the reconstruction, the authors also train a 2D texture map that learns the iris texture, which performs the texture decomposition.
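The ray reflection step can be sketched geometrically. The example below assumes the cornea is approximated by a sphere, which is a simplification chosen for illustration, and computes where a camera ray hits it and in which direction it bounces. The helper names are hypothetical, and the radiance field that would be queried along the reflected ray is not shown.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _scale(a, s):
    return tuple(x * s for x in a)


def _normalize(a):
    length = math.sqrt(_dot(a, a))
    return _scale(a, 1.0 / length)


def reflect_off_sphere(origin, direction, center, radius):
    """Reflect a ray off a sphere standing in for the cornea.

    Returns (hit_point, reflected_direction), or None if the ray misses.
    Assumes the ray origin lies outside the sphere.
    """
    d = _normalize(direction)
    oc = _sub(origin, center)
    # Solve |origin + t*d - center|^2 = radius^2 for the nearest t > 0.
    b = _dot(oc, d)
    c = _dot(oc, oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None
    t = -b - math.sqrt(disc)
    if t <= 0:
        return None
    hit = tuple(o + t * x for o, x in zip(origin, d))
    normal = _normalize(_sub(hit, center))
    # Mirror reflection about the surface normal: r = d - 2 (d . n) n.
    reflected = _sub(d, _scale(normal, 2.0 * _dot(d, normal)))
    return hit, reflected
```

The reflected ray, starting at the hit point, is the ray along which the scene would be sampled, so the camera effectively observes the scene from the position of the eye.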
Experimental evaluation
Synthetic data evaluation
First, the authors evaluate the method on synthetic data by placing a human eye model in a Blender scene.
The following image shows a scene reconstructed using only eye reflection.
Since the cornea cannot be estimated perfectly in real life, the authors evaluate how robust the cornea pose optimization is to noise in the estimated corneal radius.
To simulate the depth estimation errors that may occur in real data, the authors corrupt the observed corneal radius r_img in each image by scaling it with different levels of noise.
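The corruption of r_img can be sketched as a random multiplicative scaling. The uniform noise distribution, the function names, and the pinhole relation used to turn a radius error into a depth error are assumptions made for illustration. The article does not specify how the noise was drawn.

```python
import random


def perturb_radius(r_img, noise_level, rng):
    """Scale the observed corneal radius by a random factor.

    The factor is drawn uniformly from [1 - noise_level, 1 + noise_level].
    The uniform distribution is an assumption made for this sketch.
    """
    factor = 1.0 + rng.uniform(-noise_level, noise_level)
    return r_img * factor


def relative_depth_error(r_img, r_noisy):
    """Relative depth error caused by using r_noisy instead of r_img.

    Under a pinhole model, depth is proportional to 1 / radius, so a
    radius that is too small makes the eye appear too far away.
    """
    return r_img / r_noisy - 1.0
```

For instance, underestimating a 50-pixel radius as 40 pixels makes the estimated depth 25% too large, which is the kind of error the pose optimization has to absorb.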
The following figure shows the performance changes at different noise levels.
Notably, as the noise increases, reconstruction with the proposed pose optimization remains more robust in both geometry and color than reconstruction without it.
This shows that pose optimization is very important for real scenes, because the fit between the projected cornea and the initial ellipse in the image is never perfect.
In addition, a quantitative comparison with and without texture decomposition shows that the method scores better on SSIM and LPIPS when texture decomposition is used.
Note that the authors do not report PSNR, because in this setup the lighting differs greatly between the reflection and the scene itself.
To ensure a realistic field of view, the authors shot with a Sony RX IV camera and post-processed the images in Adobe Lightroom to reduce noise in the corneal reflections. They also placed a light source on each side of the subject to illuminate the target object.
During capture, the subject moves within the camera's field of view so that the team can take 5-15 images per scene.
Because the scene lighting has a high dynamic range, the authors use 16-bit images in all experiments to avoid losing information in the observed reflections.
On average, the cornea covers only about 0.1% of each image, while the target object occupies roughly 20x20 pixels and is entangled with the iris texture.
First, the authors estimate the initial position of the cornea from its center and radius in the image.
Then the depth is approximated using the camera's focal length, which gives the three-dimensional position of the cornea and its surface normal.
To automate this process, the authors use Grounding DINO to locate a bounding box around the eye and ELLSeg to fit an ellipse to the iris.
Because the cornea is often partly occluded and only the unoccluded region is needed, the authors use Segment Anything to obtain a segmentation mask of the iris.
Results on real data show that the method can reconstruct 3D scenes from real-world portraits despite inaccuracies in the estimated cornea position and geometry.
Because the boundary of the cornea is blurry, localizing it accurately in the image is very difficult.
In addition, 3D reconstruction is more difficult for some eye colors, such as green and blue, because the iris texture is brighter.
In addition, when the texture is not modeled explicitly, more "floaters" appear in the reconstruction.
Increasing the strength of the radial regularization mitigates these problems and improves reconstruction quality.
However, there are still two main limitations to this approach.
First, the current real-world results rely on a "laboratory setup", with zoomed-in shots of the face and extra light sources illuminating the scene. Less constrained settings bring greater challenges, such as low sensor resolution, limited dynamic range, and motion blur.
Second, the current assumptions about iris texture (such as a constant texture and radially constant color) may be too simplistic, so the method may fail when the eye rotates substantially.
About the authors
Kevin Zhang, a co-author, is currently a doctoral student at the University of Maryland.
Brandon Y. Feng holds a doctorate in computer science from the University of Maryland, with research interests focused on computational imaging, mid-level vision, and computational photography. Feng has developed machine learning algorithms for image and 3D data processing, with applications ranging from mixed reality to the natural sciences.
Jia-Bin Huang is an associate professor at the University of Maryland and previously received his doctorate from UIUC. Huang's research interests lie at the intersection of computer vision, computer graphics, and machine learning.
Reference:
https://world-from-eyes.github.io
This article comes from the WeChat official account Xin Zhiyuan (ID: AI_era)