
One sentence for a 3D face swap: UC Berkeley's "Chat-NeRF" renders a blockbuster edit from a single instruction

2025-01-18 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)11/24 Report--

All it takes is one sentence! UC Berkeley's proposed Instruct-NeRF2NeRF upgrades one-click image editing to one-click 3D scene editing.

Thanks to advances in neural 3D reconstruction techniques, capturing faithful representations of real-world 3D scenes has never been easier.

However, editing those reconstructed 3D scenes has so far lacked a simple and effective solution.

Recently, researchers from UC Berkeley proposed Instruct-NeRF2NeRF, a method for editing NeRF scenes with text instructions that builds on their earlier work, InstructPix2Pix.

Paper address: https://arxiv.org/abs/2303.12789

With Instruct-NeRF2NeRF, large-scale real-world scenes can be edited with a single sentence, and the results are more realistic and targeted than previous work.

For example, ask for a beard, and a tuft of beard appears on the subject's face!

Or swap the head entirely and turn the subject into Einstein in seconds.

In addition, since the model continuously refreshes the dataset with newly edited images, the reconstruction of the scene gradually improves.

NeRF + InstructPix2Pix = Instruct-NeRF2NeRF

Specifically, the user provides an input image together with a written instruction telling the model what to do, and the model then edits the image according to that instruction.

The implementation steps are as follows:

1. Render an image from the scene at a training viewpoint.

2. Use the InstructPix2Pix model to edit that image according to the global text instruction.

3. Replace the original image in the training dataset with the edited image.

4. Continue training the NeRF model as usual.
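These steps form a loop: render, edit, swap into the dataset, train. The toy Python sketch below mimics that loop with hypothetical one-line stand-ins of my own (an "image" and the "scene" are each a single float, and the instruction's effect is modeled as a numeric target), purely to show how repeatedly editing rendered views and training on them pulls the scene toward what the instruction asks for; it is not the paper's implementation.

```python
import random

# Toy sketch of the Instruct-NeRF2NeRF training loop. Every component is a
# hypothetical stand-in: no real rendering, diffusion, or optimization.

TARGET = 10.0  # what the text instruction "means" in this toy model

def render(scene, viewpoint):
    """Stand-in for NeRF rendering: return the current scene estimate."""
    return scene

def instruct_pix2pix(image, rng):
    """Stand-in for the 2D editor: pull the image halfway toward the
    instruction target, plus per-view noise mimicking its inconsistency."""
    return image + 0.5 * (TARGET - image) + rng.uniform(-0.2, 0.2)

def nerf_train_step(scene, dataset):
    """Stand-in for NeRF optimization: move the scene estimate toward
    the mean of the (partially edited) training dataset."""
    mean = sum(dataset.values()) / len(dataset)
    return scene + 0.5 * (mean - scene)

def iterative_dataset_update(n_views=8, n_iters=64, seed=0):
    rng = random.Random(seed)
    scene = 0.0
    dataset = {v: 0.0 for v in range(n_views)}  # original captured views
    for it in range(n_iters):
        v = it % n_views                         # 1. pick a training viewpoint
        image = render(scene, v)                 #    and render it
        edited = instruct_pix2pix(image, rng)    # 2. edit the render in 2D
        dataset[v] = edited                      # 3. replace the dataset image
        scene = nerf_train_step(scene, dataset)  # 4. keep training the NeRF
    return scene
```

Running `iterative_dataset_update()` drives the scene value close to `TARGET`: the toy analogue of the edited 3D scene converging on the instruction.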

Compared with traditional 3D editing, Instruct-NeRF2NeRF is a new approach to 3D scene editing whose biggest highlight is its "iterative dataset update" technique.

Although the editing is performed on 3D scenes, the paper uses 2D rather than 3D diffusion models to extract shape and appearance priors, because the data available for training 3D generative models is very limited.

The 2D diffusion model is InstructPix2Pix, a text-based 2D image-editing model recently developed by the same research team: given an input image and a text instruction, it outputs the edited image.

However, this 2D model produces inconsistent changes across different viewing angles of the scene. Hence "iterative dataset update", which alternates between modifying the NeRF's input image dataset and updating the underlying 3D representation.

This means the text-guided diffusion model (InstructPix2Pix) generates new image variants according to the instruction, and these new images serve as inputs for training the NeRF model. The reconstructed 3D scene is therefore edited to match the new text guidance.

In the initial iterations, InstructPix2Pix generally cannot produce consistent edits across different viewpoints; however, as the NeRF is re-rendered and updated, the edits converge to a globally consistent scene.
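This convergence claim can be illustrated with another toy simulation of my own construction (not from the paper): model each view's edit as a blend of the NeRF's shared "consensus" and fresh per-view noise, and watch the disagreement between views shrink round after round.

```python
import random

def edit_view(image, consensus, rng):
    """Toy per-view edit: mostly pulled toward the shared NeRF consensus,
    with fresh noise standing in for InstructPix2Pix's inconsistency."""
    return 0.7 * consensus + 0.3 * image + rng.uniform(-0.3, 0.3)

def spread(views):
    """Cross-view disagreement: range of the per-view 'images' (floats)."""
    return max(views) - min(views)

def converge(n_views=8, n_rounds=20, seed=1):
    rng = random.Random(seed)
    # Round-one edits disagree wildly, like InstructPix2Pix across viewpoints.
    views = [rng.uniform(0.0, 10.0) for _ in range(n_views)]
    history = [spread(views)]
    for _ in range(n_rounds):
        consensus = sum(views) / len(views)  # NeRF blends all views into one scene
        views = [edit_view(v, consensus, rng) for v in views]
        history.append(spread(views))
    return history
```

In this sketch the initial spread is several units, while after twenty rounds it settles to around one unit or less: individually inconsistent 2D edits, repeatedly funneled through a single shared representation, end up globally consistent.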

In summary, Instruct-NeRF2NeRF improves the efficiency of 3D scene editing by iteratively updating the image content and integrating those updates into the 3D scene, while maintaining the scene's continuity and realism.

It can be said that this work by UC Berkeley's research team is an extension of the earlier InstructPix2Pix: by combining NeRF with InstructPix2Pix via "iterative dataset update", one-click editing now works on 3D scenes too!

There are still limitations, but the flaws do not overshadow the strengths. Since Instruct-NeRF2NeRF builds on InstructPix2Pix, it inherits many of the latter's limitations, such as the inability to perform large spatial manipulations.

Also, like DreamFusion, Instruct-NeRF2NeRF applies the diffusion model to only one view at a time, so it may suffer from similar artifacts.

The figure below shows two types of failures:

(1) InstructPix2Pix cannot perform the edit in 2D, so Instruct-NeRF2NeRF fails in 3D;

(2) InstructPix2Pix can perform the edit in 2D, but the results are highly inconsistent across views, so Instruct-NeRF2NeRF also fails in 3D.

For example, the "panda" below not only looks fierce (the statue used as the source is itself fierce), but its fur color is also somewhat odd, and its eyes visibly clip through the model as the camera moves.

Since ChatGPT, diffusion models, and NeRFs were pulled into the spotlight, this paper can be said to combine the strengths of all three, moving from "AI draws from one sentence" to "AI edits a 3D scene from one sentence".

Although the method still has some shortcomings, it provides a simple and feasible scheme for 3D scene editing, and it may well prove a milestone in the development of NeRF.

Finally, take a look at the results the authors released for this one-sentence 3D scene editor.

It is not hard to see that this one-click 3D scene-editing tool meets expectations in both instruction understanding and image realism. It may well become a "new favorite" of academics and netizens alike, spawning Chat-NeRFs in the wake of ChatGPT.

Even when you freely change the image's background, season, or weather, the new image remains fully consistent with real-world logic.

Original:

Autumn:

Snow:

Desert:

Storm:

References:

https://instruct-nerf2nerf.github.io

This article comes from the WeChat official account Xinzhiyuan (ID: AI_era).
