I was going to reference The Queer Algorithm to show how counterfactuals, as a concept, appear elsewhere.
Chevillon discusses how AI implicitly needs to form counterfactuals when dealing with missing data. For instance, imagine that the training corpus has no representation of gay men. When generating outputs, the AI behaves as if the training data did contain those representations. Chevillon further argues that the way AI currently performs these implicit counterfactuals relies on interpolating dominant patterns in the data, which is insufficient for minoritized communities.
When generating outputs, the AI behaves as if the training data did contain those representations.
Did you mean “did not contain”? The AI can’t talk about things not in or implied by the training data, unless one leads it on by the questions one poses. It does not know about missing data, because it has nothing but the training data (by which term I’m including the RLHF phase and everything else that happens before the LLM is released). For example, none of the LLMs know anything about the European shadwell, because it’s a nonexistent bird I just made up. Depending on the LLM, it may just make stuff up, or do something more sensible, as here:
Me: Tell me about the migratory habits of the European shadwell.
ChatGPT: I’m checking whether “European shadwell” is the standard name of a species or a different term, then I’ll give you the migration details from reliable sources.
[Consults various web sites]
There doesn’t seem to be any European animal called the “European shadwell.” You may mean European shad — usually the allis shad (Alosa alosa) and twaite shad (Alosa fallax).
[Proceeds to tell me about those, which turn out to be fish in the herring family. The information checks out on Wikipedia.]
Of course, ChatGPT and all the others do have representations of gay men, so what work is “gay men” specifically doing in your (or Chevillon’s) counterfactual? I chose “European shadwell” deliberately to be absent, to obviate counterfactual speculation. All an LLM can do is search for it on the web and, coming up empty, make a guess at what I might have meant.
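To make the “nothing but the training data” point concrete, here is a toy sketch, nothing like a production LLM: a word-frequency model estimated from an invented two-sentence corpus (the corpus, the `probability` helper, and the numbers are all made up for illustration). A term that never occurs in the corpus simply gets zero probability, which is exactly the situation “European shadwell” is designed to create.

```python
# Toy sketch, not how any real LLM works: a model estimated purely from its
# training corpus has zero probability for a word it has never seen.
from collections import Counter

# Invented two-sentence "corpus" for the example.
training_corpus = (
    "the allis shad migrates up european rivers in spring "
    "the twaite shad is a fish in the herring family"
).split()

counts = Counter(training_corpus)
total = sum(counts.values())

def probability(word: str) -> float:
    """Relative frequency of `word` in the corpus; zero if the word never occurs."""
    return counts[word] / total

print(probability("shad"))      # > 0: the corpus has something to say about shad
print(probability("shadwell"))  # 0.0: the made-up bird simply is not there
```

A real LLM softens this with subword tokenization and, as in the transcript above, web search, but it cannot conjure facts about an entity its data never implied; it can only guess at a nearby entity it does know.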
Chevillon further argues that the way AI currently performs these implicit counterfactuals relies on interpolating dominant patterns in the data, which is insufficient for minoritized communities.
It works with what it has. It cannot step outside the cave of its training into the light of truth. Getting it to do that didn’t work out well for the image generator of a couple of years back that put black people into images of the American founding fathers. The designers were just leading it out of one cave into another more to their liking (something which, incidentally, should be suspected of anyone touting red pills).
I am not seeing a problem here. Chevillon’s writings and everything else in “queer studies” are likely already in the training data, or if not yet, they will be.
“Gay men” was just an example to illustrate the point (recall that I said “imagine that...”).
There will always be things in “reality” that are not in the training data. Of course, more and more people are getting represented in the corpus of data. But, as theorists like Spivak point out, there will always be people left out or misrepresented. And as the latest research shows, even with perfect fidelity in the training data, GenAI suffers from mode collapse, leading to undesired homogenization.
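To be clear about the kind of mechanism I have in mind, here is a minimal sketch with invented numbers (the 95/5 split, the pattern names, and the `sharpen` helper are mine, not anything from the research I mentioned): even when a minority pattern is faithfully present in the corpus, a decoder that sharpens the learned distribution, lower temperature, or greedy decoding in the limit, concentrates outputs on the dominant pattern and nearly erases the rare one.

```python
# Toy sketch with invented numbers: faithful corpus statistics (95% / 5%)
# still homogenize under a decoder that sharpens the distribution.
import random
from collections import Counter

corpus_probs = {"majority_pattern": 0.95, "minority_pattern": 0.05}

def sharpen(probs: dict, temperature: float) -> dict:
    """Rescale each probability as p**(1/T) and renormalize; T < 1 sharpens toward the mode."""
    scaled = {k: v ** (1.0 / temperature) for k, v in probs.items()}
    total = sum(scaled.values())
    return {k: v / total for k, v in scaled.items()}

random.seed(0)
for temperature in (1.0, 0.5, 0.2):
    probs = sharpen(corpus_probs, temperature)
    draws = random.choices(list(probs), weights=list(probs.values()), k=10_000)
    minority_share = Counter(draws)["minority_pattern"] / 10_000
    print(f"T={temperature}: minority share of outputs = {minority_share:.4%}")
# T=1.0 keeps roughly the corpus share (~5%); T=0.2 all but erases it.
```

The point of the sketch is only that faithful training statistics do not guarantee faithful output statistics; homogenization can enter at generation time even when nothing is missing from the data.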
I like to foreground the impact all of this has on the people in the margins. But this is a more general problem, one that has been identified as the main issue affecting the reliability of AI.
The point I am trying to make is that AI deals with these implicit counterfactuals in one way or another. As you pointed out, we do not want our AI to hallucinate, but we do want it to extrapolate and adapt outside its training if possible. Resolving this tension is not trivial.
Ooh, thanks, they were vestigial.