“Gay men” was just an example to illustrate the point (recall I said “imagine that...”).
There will always be things in “reality” that are not in the training data. Of course, more and more people are getting represented in the corpus of data. But, as theorists like Spivak point out, there will always be people left out or misrepresented. And as the latest research shows, even with perfect fidelity in the training data, GenAI suffers from mode collapse, leading to undesired homogenization.
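A toy sketch can make the mode-collapse point concrete. This is not anyone's actual training pipeline, just an assumed minimal setup: each "generation" of a model is trained only on a finite sample drawn from the previous generation, so its learned distribution is the empirical frequencies of that sample. Once a rare category fails to appear in a sample, it is gone for good:

```python
import numpy as np

rng = np.random.default_rng(0)

k = 10          # categories in the "real" data distribution
n = 30          # finite sample each successive model is trained on
probs = np.full(k, 1.0 / k)   # start uniform: everyone is represented

support_sizes = [k]
for generation in range(200):
    # The next model sees only samples from the current one; its
    # "learned" distribution is just the empirical frequencies.
    counts = rng.multinomial(n, probs)
    probs = counts / n
    support_sizes.append(int((probs > 0).sum()))

print("surviving categories (every 20 generations):", support_sizes[::20])
```

The number of surviving categories can only shrink, never recover: zero probability is an absorbing state. That is the homogenization worry in miniature, and note that it happens here even though the starting distribution was perfectly faithful.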
I like to foreground the impact all of this has on people at the margins. But it is a more general problem, one that has been identified as a central issue affecting the reliability of AI.
The point I am trying to make is that AI deals with these implicit counterfactuals in one way or another. As you pointed out, we do not want our AI to hallucinate, but we do want it to extrapolate and adapt beyond its training distribution when possible. Resolving that tension is not trivial.