
"Turning the elephant around" with AI retouching, ready to use out of the box: HKU, NTU and Tsinghua University are the first to open-source a "replica" of DragGAN

2025-04-06 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)11/24 Report--

An unofficial implementation of DragGAN is here. It faithfully reproduces the drag-to-retouch feature, and you can try it right away.

Remember the DragGAN released a few days ago?

Yes, it is the tool that fixes a picture in a second with just a couple of clicks.

Did you take a bad picture? Fix it! The face is not thin enough? Fix it! The face is facing the camera at the wrong angle? Fix it!

The old Photoshop joke "turn the elephant around" may be about to come true. As soon as the demo video of this AI retouching tool was released, it became a huge hit at home and abroad.

Many netizens exclaimed, "Photoshop is finished."

Within a few days, an unofficial implementation of DragGAN was ready to try. The feature has been integrated into InternGPT.

Demo address: https://igpt.opengvlab.com/

Unexpectedly, the demo was overwhelmed by visitors as soon as it opened.

Official demonstration

Judging from the official demo video, the reproduction of the DragGAN effect is impressive.

Grin

First, how to make an unsmiling person grin: just select the two corners of the mouth and drag them.

As you can see, the final result does not look out of place, because the facial muscles change along with the mouth instead of the mouth simply being stretched into a grin.
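The interaction described above comes down to pairs of points: where each selected point starts and where it should end up. The following is a minimal sketch of that idea only, assuming 2D pixel coordinates. The names `DragPair`, `step_toward` and `run_drag` and the coordinates are hypothetical, and the code does not reproduce how DragGAN actually edits the image, which this article does not describe.

```python
from dataclasses import dataclass
import math


@dataclass
class DragPair:
    """One drag instruction: move `handle` toward `target` (pixel coordinates)."""
    handle: tuple[float, float]
    target: tuple[float, float]


def step_toward(pair: DragPair, step: float = 1.0) -> DragPair:
    """Move the handle at most `step` pixels along the straight line to the target."""
    hx, hy = pair.handle
    tx, ty = pair.target
    dist = math.hypot(tx - hx, ty - hy)
    if dist <= step:  # close enough: snap onto the target
        return DragPair(pair.target, pair.target)
    scale = step / dist
    return DragPair((hx + (tx - hx) * scale, hy + (ty - hy) * scale), pair.target)


def run_drag(pairs: list[DragPair], step: float = 1.0, max_iters: int = 1000) -> list[DragPair]:
    """Repeat small steps until every handle has reached its target."""
    for _ in range(max_iters):
        if all(p.handle == p.target for p in pairs):
            break
        pairs = [step_toward(p, step) for p in pairs]
    return pairs


# Two mouth corners dragged outward and slightly upward (made-up coordinates).
mouth = [
    DragPair((100.0, 200.0), (94.0, 192.0)),
    DragPair((140.0, 200.0), (146.0, 192.0)),
]
result = run_drag(mouth, step=2.0)
```

Moving in small steps rather than jumping straight to the target mirrors how the demo shows the face changing gradually as the points travel.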

Close your mouth.

Face editor

Everyone is familiar with face slimming. Select the two sides of the face and drag them inward, and the output looks very natural.

Slimming a man's face. This one goes a little too far, though: the result looks fake and the chin is too pointed.

This one deserves a recommendation: hair transplants! Good news for many balding people.

However, judging from the output, even when only the forehead is selected, the hair grows proportionally everywhere, and the final result looks a bit like the Monkey King.

Turn your face

Rotating the face is also a very practical feature, and the filled-in regions look very natural.

Beyond small-scale image retouching, InternGPT itself offers many other eye-catching operations.

Remove masked object

Click on the part of the picture you want to operate on, and type "remove" in the prompt.

Image generation

This feature is interesting: first upload a picture and type a prompt to have it segmented, then enter another prompt to generate the desired picture.

Did you show your black feet? (no)

Video highlight commentary

You can also edit videos in one step with a prompt.

Interactive visual question and answer

After identifying the information in the picture, it can even look it up directly on the Internet.

Interactive image generation

Any doodle can be turned into a polished picture with one click.

After going through these features, the editor is genuinely impressed. All of them share two traits: foolproof operation and extreme ease of use.

Who wouldn't love this?

Technical implementation

Having seen so many cool features, what exactly is InternGPT?

InternGPT (iGPT) / InternChat (iChat) is a visual interaction system driven by pointing language. Users can interact with ChatGPT by clicking, dragging and drawing.

Unlike existing interactive systems that rely on language alone, iGPT integrates pointing instructions. This significantly improves the efficiency of communication between users and chatbots, as well as the accuracy of chatbots on vision-centric tasks, especially in complex visual scenes.

Paper address: https://arxiv.org/pdf/2305.05662.pdf

The following figure shows the overall architecture of InternGPT.

We can see that InternGPT can process not only images and videos, but also voice and text.

For image or video input, InternGPT uses SAM (an image segmentation model), OCR (a text recognition model) and so on.

After identifying locations, objects, or lines, there is a whole toolbox for further processing, made up of tools that are already familiar to us.

Examples include BLIP (image captioning), Stable Diffusion (image generation), Pix2Pix (image translation) and so on.

Similarly, for text or voice input, InternGPT will call GPT-4, LLaMA and other models or tools for processing, followed by a full toolkit.

The overall architecture of InternGPT
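The routing described above can be pictured as a dispatch table from input modality to a set of tools. The sketch below is only an illustration under that assumption: the tool names come from this article, but the `TOOLBOX` layout, the `make_tool` and `route` functions, and the string-returning stand-ins are hypothetical and are not InternGPT's actual code.

```python
from typing import Callable


def make_tool(name: str) -> Callable[[str], str]:
    """Stand-in tool: tags its input so the routing is visible in the output."""
    return lambda payload: f"{name}({payload})"


# Hypothetical mapping from input modality to first-stage tools,
# following the article's description. "LLM" stands in for GPT-4, LLaMA, etc.
TOOLBOX: dict[str, list[Callable[[str], str]]] = {
    "image": [make_tool("SAM"), make_tool("OCR")],
    "video": [make_tool("SAM"), make_tool("OCR")],
    "text": [make_tool("LLM")],
    "voice": [make_tool("LLM")],
}


def route(modality: str, payload: str) -> list[str]:
    """Run every tool registered for the modality and collect the outputs."""
    if modality not in TOOLBOX:
        raise ValueError(f"unsupported modality: {modality}")
    return [tool(payload) for tool in TOOLBOX[modality]]


outputs = route("image", "photo.png")
```

In a real system the outputs of this first stage would be handed to the downstream toolbox (captioning, generation, image translation); the sketch stops at the routing step.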

Usage tips

The whole workflow is also very convenient in practice.

After the image is uploaded successfully, users can send messages such as the following to iGPT for a multi-modal conversation:

"what is it in the image?" or "what is the background color of image?"

Similarly, users can interactively manipulate, edit, or generate pictures as follows:

Click anywhere on the picture, then press the Pick button to preview the segmented region. You can also press the OCR button to recognize all the text at a specific location.

To delete the masked region in the image, send the following message:

"remove the masked region"

To replace the masked object in the image with another object, send the following message:

"replace the masked region with {your prompt}"

To generate a new image, send the following message:

"generate a new image based on its segmentation describing {your prompt}"

To create a new image by doodling, press Whiteboard and draw on the whiteboard. When the drawing is complete, press the Save button and send the following message:

"generate a new image based on this scribble describing {your prompt}"

Netizens commented that an unofficial version of the astonishing DragGAN is already available. The official version will be released in June, so this is just a preview of what is to come.

DragGAN has already been integrated into InternGPT. That was quick for such a powerful retouching tool.
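The chat messages quoted in the usage tips follow fixed templates with a `{your prompt}` slot. As a convenience sketch, the helper below fills in those templates. The wording of each template is taken from this article, while the `TEMPLATES` keys and the `build_message` function are hypothetical names and not part of InternGPT.

```python
# Message templates quoted in the article; the dictionary keys are made up here.
TEMPLATES = {
    "remove": "remove the masked region",
    "replace": "replace the masked region with {prompt}",
    "from_segmentation": "generate a new image based on its segmentation describing {prompt}",
    "from_scribble": "generate a new image based on this scribble describing {prompt}",
}


def build_message(action: str, prompt: str = "") -> str:
    """Fill the template for `action`, insisting on a prompt where one is needed."""
    template = TEMPLATES[action]
    if "{prompt}" in template and not prompt.strip():
        raise ValueError(f"action {action!r} needs a prompt")
    return template.format(prompt=prompt.strip())


message = build_message("replace", "a red sports car")
```

The resulting string is what a user would type into the chat box after selecting a region with Pick or saving a whiteboard drawing.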

Reference:

https://igpt.opengvlab.com/

This article comes from the WeChat official account Xin Zhiyuan (ID: AI_era)
