

Disputes over infringement and credit-grabbing continue as Stability AI is caught in a whirlpool


Tort, "grab merit" controversy, Stability AI into a whirlpool. Photo source | Stability AI numerous start-up products and amateur projects have emerged, and giants such as Google and Byte have joined in-it is needless to say how crowded the track has been recently created by AI.

Of all the contestants, the most popular is Stable Diffusion, and Stability AI, one of the companies behind the project, has become the industry's hottest new star. Claiming to "give 1 billion people access to open-source models," it recently closed a $100 million financing round at a valuation of $1 billion, joining the ranks of the unicorns.

Just last week, however, the Stable Diffusion project, and the high-profile company behind it, were suddenly embroiled in two disputes:

Not only was it "attacked" by artists,

it was also turned on by its partners and accused of "grabbing credit"...

/ Is stealing a style stealing?

Last week, the American outlet CNN interviewed several artists. The interviewees said angrily that they could not accept Stable Diffusion using their work while threatening their livelihoods.

The works of these artists, or more precisely the style embodied in those works, were used to train the Stable Diffusion model.

One of the interviewees is Erin Hanson, an oil painter well known in the art world. Her color palette is distinctive: diverse, high-saturation, visually striking colors that have become her personal signature in art circles.

After Stable Diffusion became popular some time ago, Hanson noticed that some of the pictures generated by the model carried the flavor of her own work.

On further investigation, she was even more surprised: users can type "Erin Hanson style" as part of the text prompt when generating pictures, and Stable Diffusion's output is almost exactly like Hanson's published work.

If you did not notice Hanson's signature and watermark in the painting, you might well believe that both were painted by Hanson:

Image source: Erin Hanson (left), Rachel Metz via Stable Diffusion (right)

In fact, the painting on the left with the signature watermark is Hanson's original work "Crystalline Maples"; the image on the right was generated by a CNN reporter with Stable Diffusion, using text prompts including: crystalline oil painting, light and shadow, backlit tree, strong outlines, stained glass, modern impressionism, Erin Hanson style, and so on.

"would it be all right if I hung it on my wall?" Hanson expressed surprise at Stable Diffusion's "creative ability".

But after a careful study of how Stable Diffusion works, she realized that the AI model had no creative ability at all.

Because its style really is "copied."

Stable Diffusion is a text-to-image/video generation model that can produce high-resolution, realistic and/or "artistic" visual results in a matter of seconds. Training the initial version took about a month on a cluster of roughly 4,000 A100 GPUs.
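For readers who want a feel for what "text in, image out in seconds" means in practice, here is a minimal sketch using the open-source weights via Hugging Face's diffusers library; the checkpoint name, prompt, and parameters are illustrative assumptions, not details from this article:

```python
# Minimal text-to-image sketch with the open-source Stable Diffusion weights.
# Assumes: pip install diffusers transformers torch, and a CUDA-capable GPU.
import torch
from diffusers import StableDiffusionPipeline

# Load the publicly released v1.5 checkpoint (illustrative choice).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# A short text prompt is all the model needs to produce an image.
prompt = "a backlit maple tree, modern impressionism, oil painting"
image = pipe(prompt, num_inference_steps=30).images[0]
image.save("output.png")
```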

Its training data comes from LAION (Large-scale Artificial Intelligence Open Network), a German AI non-profit. The dataset used for the original version contains nearly 6 billion image-text pairs.

Many artists as angry as Hanson have found that their own work, along with the corresponding paired text (such as their names), was included in the LAION datasets, causing their work and style to be "plagiarized" by this popular AI creation model.

Collecting data from the Internet at scale to train AI models is nothing new. Many of the AI-based technologies and products we use today, including but not limited to search engines, short-video recommendation algorithms, translation, and image recognition, rely on models that leaned heavily on well-known datasets during training.

The vast majority of the content in these datasets carries no copyright or usage restrictions: anyone can use it, for commercial or non-commercial purposes, as long as they follow the appropriate attribution and usage rules.

Some commonly used image datasets. Image source: Triantafillou et al., "Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples"

However, as AI technology advances and its applications diversify, new use cases (such as the text-to-image generation discussed here) create enormous demand for ever-larger training datasets and for more public material in every form: text, images, audio, video, and so on.

When a dataset is "expanded" from tens or hundreds of thousands of images to hundreds of millions or even billions, rights loopholes inevitably creep into the process.

Today, these angry artists have become the victims of those rights loopholes, and of those exploiting the loopholes for commercial gain.

Their anger is not without reason.

After all, AIGC (AI-generated content), a new field that may well cost artists their jobs, has become one of the hottest areas of technology entrepreneurship, with entrepreneurs and investors pouring in. And the very thing that may take their jobs was trained on their own creative styles.

Artists are already, as a group, chronically short of money. Now these AI models eat from the artists' bowl while smashing their pot; how could they not be angry...

Photo: Erin Hanson

Thanks to the appeals and efforts of artists and copyright advocates, some tools have been developed to help rights holders search large datasets for their works.

For example, LAION built a web tool that converts a text query into a CLIP embedding and searches for content that matches or resembles it.
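The mechanism behind such a search can be sketched in a few lines: embed the text query and the candidate images in the same CLIP vector space, then rank by cosine similarity. Below is a minimal illustration using OpenAI's open-source clip package; the model choice, file names, and query are assumptions for the example, not details of LAION's actual tool:

```python
# Sketch of CLIP-based text-to-image retrieval, the idea behind tools
# like LAION's search. Assumes: pip install torch pillow
# and pip install git+https://github.com/openai/CLIP.git
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Embed a text query (e.g., an artist's name or style description).
text = clip.tokenize(["Erin Hanson style oil painting"]).to(device)
with torch.no_grad():
    text_emb = model.encode_text(text)
    text_emb /= text_emb.norm(dim=-1, keepdim=True)

# Embed candidate images (file paths are illustrative).
paths = ["candidate1.jpg", "candidate2.jpg"]
images = torch.stack([preprocess(Image.open(p)) for p in paths]).to(device)
with torch.no_grad():
    img_emb = model.encode_image(images)
    img_emb /= img_emb.norm(dim=-1, keepdim=True)

# Cosine similarity: higher scores mean a closer match to the text.
scores = (img_emb @ text_emb.T).squeeze(1)
for p, s in zip(paths, scores.tolist()):
    print(f"{p}: {s:.3f}")
```

At LAION's scale the same idea is served from a precomputed index of billions of image embeddings rather than embedding images on the fly, but the ranking principle is identical.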

There is also a website with a clever name, "Have I Been Trained," that helps creators search the LAION datasets to see whether their work has been used for training.

Source: Have I Been Trained

Behind "Have I Been Trained" is a pair of artist-developers living in Germany. They say that, beyond helping artists confirm whether their work appears in large datasets, they will soon release a set of tools that let artists choose whether their work is included in a dataset. To that end, the two developers have approached a number of institutions and companies working on large-scale AI models.

Stability AI and LAION also agree that artists should have control over whether a work is included.

Even so, Stable Diffusion is just one of many AIGC models. Far more models, products, and projects are not open source, and the obstacles artists and rights holders face in properly defending their rights and interests will only grow over time.

Until the whole industry resolves this problem through norms or regulation, Stable Diffusion, along with mainstream models such as DALL·E 2 and Midjourney, will remain the target of "AI infringement" accusations for a long time to come.

/ The painstaking work of many, claimed by one?

Because Stability AI has long promoted itself, directly and indirectly, as the hero behind the Stable Diffusion project, the company has also taken the blame for this kind of infringement.

Infringement, however, is only one of Stability AI's current troubles. Not long ago, just as the company completed its $100 million financing and was officially promoted to unicorn status, it suddenly discovered that:

because it had claimed too much of the credit, the partners it once worked well with on the project now held plenty of grievances against it.

The story begins last Thursday, when a company called Runway ML announced on its Twitter account that it had released version 1.5 of Stable Diffusion.

All of a sudden, netizens were confused:

Wait a minute, is this the official version? Why doesn't Stability AI have any public announcement or support?

Photo source: @ScottieFoxTTV

"Is this made by Stability AI?"

Source: @buZztiaan

On the very day of the release, the hosting site Hugging Face revealed that it had received a takedown request from Stability AI:

Stability AI said the version was a "leak of its intellectual property" and asked Hugging Face to remove the release.

Even Hugging Face itself was taken aback; it had almost never received such a request before. It added a line to the takedown request: to keep the process transparent and open, it asked the owner of the repo (Runway) and Stability AI to provide more information.

Photo source: Hugging Face

So what on earth is going on?

First of all, we need to review the history of Stable Diffusion:

To be clear, the technology behind Stable Diffusion actually comes from the Machine Vision and Learning Group at the University of Munich and from Runway.

At this year's CVPR 2022 conference, these researchers jointly published a paper on latent diffusion models, "High-Resolution Image Synthesis with Latent Diffusion Models." The research in that paper became the theoretical and technical foundation of the Stable Diffusion model.

Source: Rombach et al.

As the paper's author list shows, with the exception of Esser, chief research scientist at Runway, all the authors were affiliated with the University of Munich; in other words, at least at the time of publication, none of them belonged to Stability AI.

If so, then how did Stability AI come to be involved at all?

Runway CEO Cristóbal Valenzuela set the record straight:

1) The basic version of the technology, i.e. the paper, was developed by the University of Munich and Runway.

2) Stable Diffusion, the official version released after retraining the basic version, was still developed mainly by Esser and Rombach (the paper's two lead authors).

3) The model was officially open-sourced as early as last year.

4) Stability AI's contribution throughout was limited to providing the compute for training the official version.

Photo source: cvalenzuila / Hugging Face

Combining fairly reliable industry chatter with statements by Stability AI founder and CEO Emad Mostaque, what we know is:

The "compute" in question was some 4,000 A100 GPUs, paid for personally by Mostaque.

Source: Nvidia

As for LAION-5B, the dataset on which Stable Diffusion's retraining depends, Stability AI is also one of the funders of the organization behind it.

In any case, broadly speaking, the participants, including Runway, Stability AI, and the University of Munich, contributed more or less equally to the release of Stable Diffusion. From the beginning there was not, and should not have been, any single dominant party.

Regrettably, however, in the subsequent marketing, publicity, and operations around the Stable Diffusion project, Stability AI and founder Mostaque more or less highlighted, and even exaggerated, their own contribution and value, creating a false impression among users inside and outside the industry, as well as among the media and the public.

Screenshot from the original Silicon Man article. In fact, Stability AI has built its own web application, DreamStudio Lite, on top of the open-source Stable Diffusion; in that respect it is no different from the other companies and teams that have done the same thing.

After Valenzuela stepped up to confront Stability AI head-on, the replies below basically sided with Runway.

Netizens praised the CEO as a "gigachad," one after another.

Soon, Stability AI also withdrew the deletion request.

But the company did not "show weakness." Dan Jeffries, its new chief information officer, wrote an article obliquely accusing its partner of irresponsibly "stealing" version 1.5, while throwing out a rather exaggerated set of claims to the effect that:

"We do not release version 1.5 because we have received comments from regulators and the public that our model is unsafe and will hurt others. So our next main task is to do a good job of safety."

Screenshot of the title of Dan Jeffries' article

The "safety issues" here mainly refer to the model being used to produce NSFW content, deepfakes, and the like. Netizens discussing the matter on Hugging Face shot back at the article: who are you kidding? The earlier versions had the same problems; why were they all released? If you really want to crack down on NSFW content, should Photoshop and video-editing software stop shipping new versions too?

Before Hugging Face's thread was closed, a small minority did side with Stability AI, arguing roughly that Runway's move was not decent, that a truly "stable" version should be discussed and released jointly, and that the very name Stable Diffusion shows how closely the project is tied to Stability AI.

For now, though, it is hard to say which of the two names, Stability AI or Stable Diffusion, owes its fame to the other.

This article comes from the WeChat official account Silicon Man (ID: guixingren123). Authors: Spectrum, Du Chen. Editor: VickyXiao
