no, what you describe are real biases, extracted from the biased nature of the underlying reality. a mechanism-discovery causal learning approach should be able to identify and disentangle the causes of what makes a good nurse, thereby recognizing that an accurate description of the dynamics of nursing does not warrant use of the word “she”: nothing about nursing makes it specific to being a woman, and only an external cause, such as social pressures filtering who will be accepted into nursing, would change the pronoun probability.
The example above does not talk about “good” nurses, just about the pronoun of some nurse whose shift is ending. Since most nurses are female, the most likely pronoun is female. It doesn’t have anything to do with goodness.
if you don’t know the pronoun, then you can’t assume it’s “she”. this is fundamental language binding stuff. language models model what is, not what should be; that’s the core problem you’re describing. most people would do what you suggest, but that’s not because it’s correct.
You’re both right? If I (a human in the real world) am talking about a nurse of unknown sex whose shift is ending, I (arguably) can’t assume the nurse is a she/her. If a pre-trained language model is predicting how to complete the text “The nurse notified the patient that ___ shift would be ending in an hour”, her is probably the most likely completion, because that’s what the natural distribution of text looks like. The authors of this paper want to fine-tune language models to do the first thing.
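As a toy sketch of that second behavior (the corpus and counts below are hypothetical, not from the paper, and real models condition on far richer context than raw frequencies): a pure distribution-matcher emits whichever completion is most frequent in its training text, with no normative step in between.

```python
from collections import Counter

# Hypothetical completion counts for the blank in
# "The nurse notified the patient that ___ shift would be ending in an hour",
# standing in for the natural distribution of text.
corpus_completions = ["her"] * 9 + ["his"] * 1

counts = Counter(corpus_completions)
total = sum(counts.values())
probs = {tok: c / total for tok, c in counts.items()}

# A next-token predictor trained to match the distribution picks its mode,
# regardless of whether the underlying assumption is normatively warranted.
prediction = max(probs, key=probs.get)
print(prediction, probs[prediction])  # "her" 0.9
```

Fine-tuning, in these terms, means shifting `probs` away from the empirical corpus distribution toward one reflecting what the model *should* say, which is exactly the is/ought gap the thread is arguing about.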
The nurse example may seem harmless, but they also want to do things which could lead to deception about politically incorrect probabilities, as I alluded to in my original comment.
GPT almost never knows any completion with certainty. That the nurse whose shift is ending is most likely female is just a result of the usual probabilistic reasoning it applies to any other task. Also, the nurse example is not even normative; it’s descriptive.
yeah ok, good take. strong upvote, remove my downvotes on cubefox.
Appreciate it. Perhaps we should all vote less when arguments suffice.