I find that studies criticizing current models are often cited long after the issue has been fixed, or without regard for what they actually showed. I wish technology reporting were more careful, as much of this misunderstanding seems to come from journalistic sources. Examples:
Hands in diffusion models
Text in diffusion models
Water usage
Model collapse: not an issue for actual commercial AI models; the original study was about synthetic data production, where model output was fed back as the exclusive training data
LLMs = autocorrect: chat models have RLHF post-training
Nightshade/Glaze: useless against modern training methods
AI understanding: yes, the individual weights are not understood, but the overall architecture is
It is surprising how often I hear these repeated, with false context.
There are indeed many, many silly claims out there, on either side of any debate. And yes, the people pretending that the AIs of 2025 have the limitations of those from 2020 are being silly, journalist or no.
I do want to clarify that I don’t think this is a (tech) journalist problem. Presumably, when you mention Nightshade dismissively, it’s for a combination of two reasons: 1) Nightshade artefacts are removable with small amounts of Gaussian blur, and 2) Nightshade can’t be deployed at scale on enough archetypal images to have a real effect? If you look at the Nightshade website, you’ll see that the authors lie about 1):
As with Glaze, Nightshade effects are robust to normal changes one might apply to an image. You can crop it, resample it, compress it, smooth out pixels, or add noise, and the effects of the poison will remain.
So (assuming my recollection that Nightshade is defeatable by Gaussian noise is correct) this isn’t an issue of journalists making stuff up or misunderstanding what the authors said, it’s the authors putting things in their press release that, at the very least, are not at all backed up by their paper.
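For concreteness, here is a minimal sketch (not from the Nightshade paper, purely illustrative) of the kind of cheap preprocessing the disputed claim is about: running a light Gaussian blur over scraped images with Pillow. The file names and radius are hypothetical, and whether this actually strips Nightshade’s perturbations is exactly the point in contention above.

```python
# Illustrative only: a light Gaussian blur pass of the sort a scraping
# pipeline could apply to every image before training. File names and
# the radius value are hypothetical examples, not taken from the paper.
from PIL import Image, ImageFilter

def light_blur(path_in: str, path_out: str, radius: float = 1.0) -> None:
    """Open an image, apply a small Gaussian blur, and save the result."""
    img = Image.open(path_in)
    img.filter(ImageFilter.GaussianBlur(radius=radius)).save(path_out)

# Hypothetical usage:
# light_blur("scraped_image.png", "preprocessed_image.png", radius=1.0)
```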
(Also, either way, Gary Marcus is not a tech journalist!)