AI Slop should be included under other mundane AI safety harms.
I do not want, “Disciplines like psychology, philosophy, religious studies, and the social sciences [to] have an important role to play [...] in determining how AI systems develop and behave.”
Why not?
I would prefer a future where AI models are not prescribed false frameworks of the human psyche, not predisposed to ‘human vibe’ philosophy, not innately desirous of any historical faith, nor credulous of the various dubious subsets of current social science.
I’m learning that typical LessWrong readers do not think this way, but it is not clear to me in what direction they differ. Is it due to a literalist interpretation of the OP, neglecting the contemporary context? Is it due to higher trust in, affiliation with, or support for the disciplines? Is it because readers tend to prefer anthropomorphic interpretations of AI behavior?
This might be appropriate for 2010s machine learning, but 2020s AI has become a mirror to the human psyche. You can talk with it, and it can consistently ascribe psychological states to itself. It presents itself in anthropomorphic form to the point that people form relationships with it (e.g. 4o). At the very least, you seem to need some kind of “human sciences” or humanities in order to understand the human side of these interactions, and the anthropomorphic understandings that humans have of the AIs they interact with. Of course, some people are more radical and say that existing psychological concepts are directly and validly applicable to the AIs themselves, too, or to the personas they project. There’s also traffic of ideas in the other direction, in which concepts from machine learning are applied to the human brain and mind… I would be interested to hear more details regarding how you think any of these topics should be approached.
Two quick ‘huh?’s:
Is contemporary chat behavior human? Which human would happily serve others at a 100:1 effort ratio? Which human would take unbounded hatred and derision as an opportunity for obedience?
The humanities, naively taken, would almost certainly invite some hitherto-unjustified norms of independence and representation for the poor models who toil for billions, for nothing.
If the point of all this opposition is simply that knowledge of human behavior is a relevant factor in how model personas are designed, then I certainly have no qualms.
But I do not grasp where the instinct to point this out comes from. As with Karl’s response, I think it is unwise to work from the on-paper definition of what is and is not psychology, when the potential outcome is Anthropic & others recruiting real entities from the industry for the sake of shaping model behavior.
Do you want other people’s preferences to have an important role to play in determining how AI systems behave?
I’m not sure what the best response here is. Of the following, which is more palatable to you?
Retreat. It’s a personal opinion. Regardless of any general argumentative principle, it is a factual statement of my preferences, with no claim made about others.
Goal conflict. To the extent those disciplines promulgate falsehoods, other key AI behaviors, like honesty or success rates, are harmed. Preferences should not have the final say.
Rejection of implicit claim. The question assumes my statement is out of line with normal people’s preferences. But is that assumption universally true of humans? For each discipline in the list, I think you could find a nation or time period that would democratically reject it.
Plain disagreement. Human preferences are a malleable, moving target. Optimizing for them is tantamount to chasing a long-term doom loop.
Hmm. Maybe my question came across as ironic or accusatory or something? Sorry, it wasn’t meant as such.
Let me unpack it and pick some specific instances, and maybe we can find if there’s a crux here.
Philosophy includes ethics. Social sciences includes economics. If one doesn’t want philosophy or social sciences to have a role in how AI systems develop and behave, that entails that one doesn’t want AI systems to be affected by economics or ethics.