Generative AI takes off: Silicon Valley bets on turning simple text into images and even videos

2025-01-16 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)11/24 Report--

Oct. 9 news: So-called "generative artificial intelligence" (generative AI), which has sprung up in recent years and can generate matching images in seconds from just a few words, is attracting interest from Silicon Valley technology giants and venture capital firms. Analysts expect the technology to be widely used across industries and to generate trillions of dollars in economic value.

The images these programs generate are not perfect: hands may have extra fingers, and limbs may bend unnaturally. The generators also struggle with text, often producing meaningless symbols. Even so, these image generation programs may mark the beginning of a technology boom. "In the past three months, the term 'generative artificial intelligence' has become a buzzword," said David Beisel, an investor at venture capital firm NextView Ventures.

Since 2021, generative AI technology has made great progress, even prompting many people to quit their jobs and start new companies in the hope that AI can power a new generation of technology giants.

The AI field has been booming over the past five years or so, but most of that progress has been about making sense of existing data. AI models have become efficient enough to tell whether a photo someone has just taken with a phone contains a cat, and reliable enough to power billions of Google search results every day. Generative AI models, by contrast, can produce something completely new that did not exist before. In other words, they create rather than just analyze data.

"The most impressive thing is that generative AI can also create new things," said Boris Dayma, founder of the AI and machine learning platform Craiyon. "They can not only recreate images similar to existing ones, but also create new things that are completely different from anything before."

Sequoia Capital, a prominent Silicon Valley venture capital firm, wrote on its website: "From games to advertising to law, generative AI may change every field that requires human creativity. This technology has the potential to generate trillions of dollars of economic value." More interestingly, Sequoia Capital also noted in its post that part of the article was written by GPT-3, which is itself a generative AI that produces text.

How generative AI works

The techniques used in image generation come from a subset of machine learning called deep learning. Deep learning has driven most of the progress in AI since 2012, when a landmark paper on image classification rekindled interest in the technology. Deep learning trains a model on a large data set until the program learns the relationships in the data. The model can then be used in applications, such as identifying whether there is a dog in a picture or translating text.
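As a rough illustration of "training until the program learns the relationships in the data", the sketch below fits a tiny classifier by gradient descent. The two-number "photo features", the labels and the function names are all made up for this example; real image models are deep networks trained on far larger data sets.

```python
import math
import random


def sigmoid(z):
    # Clamp z so math.exp never overflows.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, labels, epochs=100, lr=0.1):
    """Fit weights and a bias by stochastic gradient descent on the log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = pred - y  # gradient of the log loss with respect to z
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, x):
    """Return 1 ("dog") if the model's score is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


random.seed(0)
# Hypothetical data: two invented features per photo; label 1 means "dog".
dogs = [[random.gauss(2.0, 0.5), random.gauss(2.0, 0.5)] for _ in range(50)]
others = [[random.gauss(-2.0, 0.5), random.gauss(-2.0, 0.5)] for _ in range(50)]
samples = dogs + others
labels = [1] * 50 + [0] * 50

weights, bias = train_logistic(samples, labels)
accuracy = sum(
    predict(weights, bias, x) == y for x, y in zip(samples, labels)
) / len(samples)
print(f"training accuracy: {accuracy:.2f}")
```

The loop only ever sees examples and their labels; the decision rule comes out of repeated small corrections to the weights, which is the sense in which the model "learns" from data.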

Image generators work by reversing this process. Instead of translating English into French, they convert English phrases into images. They usually have two main parts: one that processes the initial phrase, and another that converts the resulting data into an image.

The first wave of generative AI was based on a method called generative adversarial networks (GANs). GANs were often used to generate photos of people who do not exist. In essence, they work by making two AI models compete with each other so that the images they create better match the intended goal.
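The "two models competing" idea can be sketched with numbers instead of images. In the toy below, a generator turns random noise into a single number and a discriminator scores how "real" a number looks. The class names, the one-dimensional data and the learning rate are assumptions chosen for brevity, not how any production image generator is built.

```python
import math
import random


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


class Discriminator:
    """Scores a number: near 1 means "looks real", near 0 means "looks generated"."""

    def __init__(self):
        self.w, self.c = 0.0, 0.0

    def score(self, x):
        return sigmoid(self.w * x + self.c)

    def step(self, real_x, fake_x, lr):
        # Gradient ascent on log D(real) + log(1 - D(fake)).
        d_real, d_fake = self.score(real_x), self.score(fake_x)
        self.w += lr * ((1 - d_real) * real_x - d_fake * fake_x)
        self.c += lr * ((1 - d_real) - d_fake)


class Generator:
    """Turns random noise z into a sample b + a * z."""

    def __init__(self):
        self.a, self.b = 1.0, 0.0

    def sample(self, z):
        return self.b + self.a * z

    def step(self, z, disc, lr):
        # Gradient ascent on log D(G(z)): move samples toward what D calls real.
        d_fake = disc.score(self.sample(z))
        grad_x = (1 - d_fake) * disc.w
        self.a += lr * grad_x * z
        self.b += lr * grad_x


def train(steps=2000, lr=0.01, real_mean=4.0, seed=0):
    """Alternate one discriminator update and one generator update per step."""
    rng = random.Random(seed)
    gen, disc = Generator(), Discriminator()
    for _ in range(steps):
        real_x = rng.gauss(real_mean, 1.0)  # hypothetical "real data"
        disc.step(real_x, gen.sample(rng.gauss(0.0, 1.0)), lr)
        gen.step(rng.gauss(0.0, 1.0), disc, lr)
    return gen, disc
```

Each update pushes in opposite directions: the discriminator is rewarded for telling real from generated samples, and the generator is rewarded for fooling it. Simple loops like this one are known to oscillate rather than settle neatly, which is one reason practical GAN training is considered delicate.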

Newer methods usually use transformers, a concept first proposed in a 2017 Google paper. This emerging technique can take advantage of larger data sets, although training such models can cost millions of dollars.
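The article does not go into how these models work internally. As background, the central operation of the architecture introduced in the 2017 paper is scaled dot-product attention, in which each query vector takes a weighted average of value vectors, with weights determined by how well the query matches each key. A minimal pure-Python sketch (the function names and list-of-lists representation are choices made for this example):

```python
import math


def softmax(scores):
    """Convert raw scores to positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    queries and keys must share one vector length; keys and values
    must contain the same number of vectors.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        outputs.append(
            [
                sum(w * v[j] for w, v in zip(weights, values))
                for j in range(len(values[0]))
            ]
        )
    return outputs
```

When a query matches one key far better than the others, the output is close to that key's value; when all keys match equally, the output is the plain average of the values.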

The first image generator to get a lot of attention was Dall-E, a project launched by Silicon Valley startup OpenAI in 2021. OpenAI released a more powerful update this year. "With Dall-E 2, this is really the moment we crossed the uncanny valley," said Christian Cantrell, a developer who specializes in generative AI.

Another widely used AI image generator is Craiyon, formerly known as Dall-E Mini, which is available online. After the user enters a phrase, the resulting images appear in the browser within a few minutes.

According to Dayma, Craiyon's founder, the service, launched in July 2021, now generates about 10 million images a day and has produced a total of 1 billion images that had never existed before. After a surge in usage earlier this year, Dayma began devoting all his energy to Craiyon. He says he is focused on using advertising to keep the site free for users, because its servers are expensive to run. Craiyon has a Twitter account dedicated to posting the weirdest and most creative images, which has more than 1 million followers.

But the project generating the most excitement is Stable Diffusion, which was released to the public in August. Its code is available on GitHub and can be run on a personal computer, in the cloud or through a programming interface. This allows users to adapt the code to their own purposes or build new programs on top of it.

For example, Stable Diffusion has been integrated into Adobe Photoshop through a plug-in that lets users generate backgrounds and other parts of an image, then work on them directly in the application with layers and other Photoshop tools. This turns generative AI from a technology for producing finished images into a tool that professionals can use.

Cantrell, the developer of the plug-in, worked at Adobe for 20 years before leaving this year to focus on generative AI. He said the plug-in had been downloaded tens of thousands of times. Artists have told him they use it in countless ways he did not expect, such as animating Godzilla or creating images of Spider-Man in any pose the artist can imagine.

A new craft emerging around generative AI is how to construct "prompts", the phrases used to generate images. A search engine called Lexica pairs Stable Diffusion images with the exact strings of words that can be used to generate them. On platforms such as Reddit and Discord, guides circulate on how to phrase prompts to get the image a user wants.

Startups, cloud service providers and chipmakers may benefit most if generative AI proves to be a transformative platform shift, like the smartphone or the early days of the Internet. Shifts of this kind greatly expand the potential market of people who can use the technology.

Cantrell believes that generative AI is similar to a more basic technology, namely databases. "Generative AI is a bit like a database, which helps unlock the great potential of applications," he said. "Almost every application we use in our lives is built on a database, but no one cares how the database works, they just know how to use it."

Michael Dempsey, managing partner of Compound VC, said the moment when a technology previously confined to labs enters the mainstream is "very rare" and attracts a lot of attention from venture capitalists, who like to bet on areas with great potential. But he warns that generative AI is now in a "curiosity phase" close to the peak of the hype cycle. Companies at this stage may fail because they do not focus on specific uses that businesses or consumers are willing to pay for.

Others in the field believe that the startups that pioneered these technologies today could eventually challenge the software giants that currently dominate AI, including Google, Facebook parent company Meta and Microsoft, and pave the way for the rise of the next generation of technology giants.

"There will be a large number of trillion-dollar new companies that will be based on this new technology," said Clement Delangue, chief executive of Hugging Face. Hugging Face is a developer platform similar to GitHub that hosts pre-trained AI models, including Craiyon and Stable Diffusion. Its goal is to make it easier for programmers to build AI technologies.

Some companies have received a lot of investment. Hugging Face was valued at $2 billion after raising money from investors such as Lux Capital and Sequoia Capital earlier this year. OpenAI, the most famous start-up in the field, has raised more than $1 billion from Microsoft and Khosla Ventures. Meanwhile, Stable Diffusion developer Stability AI is in talks to raise venture capital at a valuation of up to $1 billion.

Cloud service providers such as Amazon, Microsoft and Google may also benefit, as generative AI can be computationally intensive. Meta and Google have hired a number of talented people in this field, hoping to integrate this advanced technology into the company's products. In September, Meta announced an AI project called "Make-A-Video" to take the technology to the next level by generating video rather than just images.

"This is an amazing step forward," Mark Zuckerberg, chief executive of Meta, said in a post on his Facebook page. "Generating videos is much more difficult than generating photos, because in addition to generating each pixel correctly, the system has to predict how they will change over time." Recently, Google also released code for a program called Phenaki, which converts text into a few minutes of video.

The craze could also give a boost to chipmakers such as Nvidia, AMD and Intel, whose GPUs are well suited to training and deploying AI models. At a conference last week, Nvidia CEO Jensen Huang stressed that generative AI was a key use of the company's latest chips, saying that such technologies could soon revolutionize communications.

However, the benefits of generative AI to end users are still limited. Many exciting things nowadays revolve around free or low-cost experiments. For example, some authors have tried to use image generators to illustrate articles. Nvidia is trying to use models to generate 3D images of new people, animals, vehicles or furniture that can be filled into the virtual game world.

Finally, everyone who develops a generative AI will have to work hard to solve the ethical problems brought about by the image generator.

First is the issue of employment. Although many of these programs require powerful GPUs, computer-generated content is still much cheaper than the time of professional illustrators, who can be paid hundreds of dollars an hour. Generative AI could cause serious trouble for artists, video producers and others who make a living by creating works. Dempsey of Compound VC said: "It turns out that machine learning models may become better, faster and cheaper than humans."

Generative AI also presents more complex challenges around originality and ownership. These AI models are trained on large numbers of existing images, and whether the creator of an original image holds any copyright over images generated in that original's style is still under debate. An artist recently won an art competition in Colorado with an image created mainly by a generative AI called MidJourney. He said in an interview afterward that he had picked one of the hundreds of images he generated and then adjusted and processed it in Photoshop.

Some images generated by Stable Diffusion appear to carry watermarks, indicating that part of the original dataset is copyrighted. Some prompt guides recommend that users include the names of specific living artists to better imitate those artists' styles. Last month, Getty Images banned users from uploading AI-generated images to its stock image database for fear of infringement disputes.

Image generators can also be used to create new images of trademarked characters or objects, such as the Minions, Marvel characters or the throne from Game of Thrones. As image generation software gets better and better, it could also be used to mislead people with false information, or to show images or videos of events that never happened.

Developers must also grapple with the possibility that AI models trained on large amounts of data may absorb gender, racial or cultural biases present in that data, and then show those biases in their output. Hugging Face has published materials on ethical issues and on developing AI models in a responsible manner.
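How a skew in the data becomes a skew in the output can be shown with a deliberately simple stand-in for a generative model: one that only counts which attribute appears with each prompt in its training set, then samples in proportion to those counts. The training pairs, the "group A" and "group B" labels and the function names are invented for illustration; real models are far more complex, but the same pressure applies.

```python
import random
from collections import Counter


def fit(training_pairs):
    """Count how often each attribute appears with each prompt."""
    counts = {}
    for prompt, attribute in training_pairs:
        counts.setdefault(prompt, Counter())[attribute] += 1
    return counts


def generate(counts, prompt, rng):
    """Sample an attribute with probability proportional to its training count."""
    attributes = list(counts[prompt].keys())
    weights = list(counts[prompt].values())
    return rng.choices(attributes, weights=weights, k=1)[0]


# Hypothetical, deliberately skewed training set: 90 of 100 examples
# pair the prompt with the same attribute.
training = [("software engineer", "group A")] * 90 + [
    ("software engineer", "group B")
] * 10

model = fit(training)
rng = random.Random(0)
outputs = Counter(generate(model, "software engineer", rng) for _ in range(1000))
print(outputs)
```

Nothing in the code prefers one group over the other; the imbalance in the generated samples comes entirely from the imbalance in the data the model was fitted to.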

"We see short-term and current challenges in these models because they are probabilistic models, and training on big data sets tends to absorb a lot of biases," said Clement Delangue, CEO of Hugging Face. He said, for example, that when generative AI was asked to draw portraits of "software engineers", it produced images of white men.