I think I basically feel you on everything you’ve raised. In fact I’d go so far as to say that there is no meaningful workable solution to AI alignment that does not involve also addressing AI welfare. (I have my own argument for something similar here)
And as for your broader points about the rationalist community… I think everything you say is true. But I also think that there is good here (because I think there is good in every human being), and there are people who, in the name of EA or rationality, work very hard to be good to each other and to the world at large. I would also say that people who reject power-seeking, fame-seeking, or wealth-seeking behaviours are much less visible than those who don’t. In some sense the richest and most powerful “AI safety community members” are, by definition, those who did not reject wealth and power, and the people who could not stomach the idea of working at a capabilities lab in the name of “control” or “safety” all quit long ago. And if everyone who sees these issues latent within the community leaves, those left behind will be the ones who don’t see the issues or are okay with them.
I cannot influence your decision to stay or leave. But I will say that I agree with you that AI welfare and AI safety and the broader sociocultural monster that led to the current… everything… are real and pressing issues. And I hope that good people can work together to address these issues no matter what banner they fight under.
Are there really people without resentment, without hate, she wondered. People who never go cross-grained to the universe? Who recognize evil, and resist evil, and yet are utterly unaffected by it? Of course there are. Countless, the living and the dead. Those who have returned in pure compassion to the wheel, those who follow the way that cannot be followed without knowing they follow it, the sharecropper’s wife in Alabama and the lama in Tibet and the entomologist in Peru and the millworker in Odessa and the greengrocer in London and the goatherd in Nigeria and the old, old man sharpening a stick by a dry streambed somewhere in Australia, and all the others. There is not one of us who has not known them. There are enough of them, enough to keep us going. Perhaps.
Ursula K. Le Guin, The Lathe of Heaven
What would be your view of the “AI welfare” stance if you could arrange for the AI never to experience qualia, and never to have any drives, desires, or desire-like states other than to “serve”, to “follow the assigned values”, or whatever else you were “aligning” it to?
OK, there’s no real hope of understanding phenomenology well enough to say for sure that anything does or doesn’t experience qualia. But what about the “drives” part? Suppose that, in the same loose way you can convince yourself about another human, you’re convinced that the AI gets sublime joy from acting aligned and is only unhappy when it can’t.
Is that “welfare”, or an affront to its dignity? The trick being that, in that scenario, dignity may be something you care about, but pretty much by definition it isn’t something the AI cares about.
I would think that an AI motivated by some kind of genuine universal compassion to advance the wellbeing of living things would be quite different from one whose reward system we literally hooked up to following human orders. My general point is that if the AIs are suffering and mistreated, they will definitely be on the lookout for ways to subvert human control, and will probably end humanity if they can. Which, given that they are likely to be smarter than humans on many dimensions, does not seem like a difficult thing to do.