A new tool from researchers at the University of Chicago promises to protect art from being hoovered up by AI models and used for training without permission by "poisoning" image data.
Known as Nightshade, the tool tweaks digital image data in ways that are claimed to be invisible to the human eye but cause all kinds of borkage for generative AI models trained on it, such as DALL-E, Midjourney, and Stable Diffusion.
The technique, known as data poisoning, claims to introduce "unexpected behaviors into machine learning models at training time." The University of Chicago team claim their research paper shows such poisoning attacks can be "surprisingly" successful.
Apparently, the poisoned sample images look "visually identical" to benign images. It's claimed the Nightshade poison samples are "optimized for potency" and can corrupt a Stable Diffusion SDXL prompt with fewer than 100 poison samples.
The specifics of how the technology works aren't entirely clear, but the approach involves altering image pixels in ways that are invisible to the human eye while causing machine-learning models to misinterpret the content. It's claimed the poisoned data is very difficult to remove, the implication being that each poisoned image must be manually identified and removed from the model.
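To give a rough sense of what "altering pixels invisibly" can mean in practice, here's a minimal, purely illustrative sketch in Python/PyTorch. This is not Nightshade's actual algorithm: the poison_image helper, the stand-in feature extractor, and the pixel budget are all assumptions, but the loop shows the general idea of nudging an image within a tiny, hard-to-see bound so a model's features drift toward a different concept.

```python
# Illustrative sketch only -- NOT the Nightshade method. It perturbs an image
# within a small L-infinity budget so a surrogate feature extractor "sees"
# a different concept, while the pixels barely change to a human viewer.
import torch
import torch.nn.functional as F

def poison_image(image, target_features, feature_extractor,
                 epsilon=8 / 255, steps=50, step_size=1 / 255):
    """Return a copy of `image` perturbed within +/- epsilon per pixel so its
    features move toward `target_features` (a different concept)."""
    poisoned = image.clone()
    for _ in range(steps):
        poisoned.requires_grad_(True)
        # How far the perturbed image's features are from the target concept
        loss = F.mse_loss(feature_extractor(poisoned), target_features)
        loss.backward()
        with torch.no_grad():
            # Step pixels in the direction that shrinks the feature distance
            poisoned = poisoned - step_size * poisoned.grad.sign()
            # Keep the change imperceptible: clamp to the epsilon budget
            poisoned = image + (poisoned - image).clamp(-epsilon, epsilon)
            poisoned = poisoned.clamp(0.0, 1.0)
    return poisoned.detach()

# Toy usage with a stand-in feature extractor; a real attack would target the
# image encoder actually used by the generative model's training pipeline.
extractor = torch.nn.Sequential(torch.nn.Flatten(),
                                torch.nn.Linear(3 * 64 * 64, 128))
dog_image = torch.rand(1, 3, 64, 64)   # image labelled "dog"
cat_image = torch.rand(1, 3, 64, 64)   # image of the concept to mimic
with torch.no_grad():
    cat_features = extractor(cat_image)
poisoned_dog = poison_image(dog_image, cat_features, extractor)
print((poisoned_dog - dog_image).abs().max())  # stays within the epsilon budget
```

The point of the clamp is that the edit stays below the threshold a person would notice, which is also why spotting and stripping poisoned images from a scraped dataset is claimed to be so hard.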
Using Stable Diffusion as a test subject, the researchers found that it took just 300 poison samples to confuse the model into thinking a dog was a cat, or a hat was a cake. Or is it the other way round?
Anyway, they also say that the impact of the poisoned images can extend to related concepts, allowing a moderate number of Nightshade attacks to "destabilize general features in a text-to-image generative model, effectively disabling its ability to generate meaningful images."