Say I’m too old to expect aligned AI to give me eternal life (or aligned AI simply might not mean eternal life/bliss for me, for whatever reason; maybe utopia is more efficiently built by starting with newborns made into bliss-enjoying automatons, or whatever utopia entails). Then for me individually, the intermediate years before superintelligence are the relevant ones, so I might rationally want to earn money by enriching myself, whatever the (un-)alignment impact of doing so.
I expect that the set of people who:
Expect to have died of old age within five years
Are willing to reduce their expected lifespan in order to be richer before they die
Are willing to sacrifice all of humanity’s future (including the future of their loved ones who aren’t expected to die of old age within five years)
Take actions that impact what superintelligence is built
is extremely small. It’s not like Sam Altman is 70.
Given the public-goods nature of alignment, I might fear it’s unlikely for us to cooperate, so we’ll all free-ride, working to enrich ourselves on whatever pays, even where that leans towards building unaligned AI. With such a prior, it may be rational for any self-interested person, even absent confusion, to indeed free-ride: ‘hopefully I make a bit of money for an enjoyable few or many years, before unaligned AGI destroys us with near-certainty anyway’.
Even if technical alignment were not too difficult, standard Moloch-type effects (again, all sorts of free-riding/power-seeking) might mean the chances of unaligned users of even otherwise ‘aligned’ technology are overwhelming, again meaning that for most people most value lies in increasing their material welfare over the next few years, rather than ‘wasting’ their resources on a futile alignment project.
But it’s not a public-goods kind of thing. If they knew that the choice was between:
Rich now, dead in five years
Less rich now, post-scarcity-rich, immortal, and in utopia forever in five years
then pretty much nobody would still choose the former. If people realized the truth, they would choose otherwise.
In my experience, most of the selfishness people claim to have to justify continuing to destroy the world instead of helping alignment is less {because that’s their actual core values and they’re acting rationally} and more just {finding excuses to not have to think about the problem and change their minds/actions}. I talk more about this here.
To be fair, this is less a failure of {being wrong about this specific thing}, and more a failure of {being less good at rationality in general}. But it’s still mistake-theoretic more so than conflict-theoretic.
Are willing to reduce their expected lifespan in order to be richer before they die
Are willing to sacrifice all of humanity’s future (including the future of their loved ones who aren’t expected to die of old age within five years)
Take actions that impact what superintelligence is built
is extremely small.
It would be extremely small if we were talking about binaries/pure certainty.
If, in reality, everything is uncertain, and in particular (as I think) everyone individually has only a tiny probability of changing the outcome, then everyone ends up free-riding.
This is true for the commoner[1] who uses ChatGPT or whichever cheapest & fastest AI tool he finds to succeed in his work, thereby supporting the AI race and ‘taking actions that impact what superintelligence is built’.
It may also be true for the CEOs of many AI companies. Yes, their dystopia-probability-impact is larger, but their own career, status, and power (and future position within the potential new society, see jacob_cannell’s comment) also hinge more strongly on their actions.
(Imperfect illustrative analogy: climate change may kill a hundred million people or so, yet humans keep flying around the world, heating it up. Would anyone willingly “sacrifice” a hundred million people for her trip to Bali? I have some hope she wouldn’t. But she’ll not skip the holiday if her probability of averting disastrous climate change is tiny anyway. And if, instead of a holiday, her entire career, fame, and power depended on continuing to pollute, then even as a global-scale polluter she’d likely enough not stop emitting. I think we must clearly acknowledge this type of public-good/free-rider dynamic in the AI domain.)
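The free-rider logic above can be made concrete with a toy expected-value calculation. All numbers are hypothetical illustrations chosen for the sketch, not estimates of anything:

```python
# Toy sketch of the free-rider argument: with a tiny individual
# probability of being pivotal, the expected personal payoff of
# cooperating can fall below its personal cost.
# All numbers below are illustrative assumptions, not estimates.

p_pivotal = 1e-8            # chance one individual changes the AI outcome
utopia_value = 1e6          # personal value of an aligned future (arbitrary units)
cost_of_cooperating = 10.0  # personal cost of restraint (same units)

# Expected personal gain from cooperating:
expected_gain = p_pivotal * utopia_value  # about 0.01

# Rational self-interest free-rides when expected gain < cost:
print(expected_gain < cost_of_cooperating)  # prints True
```

Under certainty (probability 1 of being pivotal), the same comparison flips, which is exactly the binary-choice framing of the reply above; the disagreement is about which regime individuals actually face.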
***
In my experience, most of the selfishness people claim to have to justify continuing to destroy the world instead of helping alignment is less {because that’s their actual core values and they’re acting rationally} and more just {finding excuses to not have to think about the problem and change their minds/actions}.
I agree with a lot of this, but it doesn’t change my interpretation much: yes, humans are indeed good at rationalizing their bad actions. But they’re especially good at it when it’s in their egoistic interest to continue the bad thing. So the commoner and the AI CEO alike might well rationalize, in irrational ways, ‘for complicated reasons it’s fine for the world if we (one way or another) heat up the AI race a bit’, precisely because they might rightly see continuing as in their own material interest, and want their own brain and others to see them as good persons nevertheless.
Btw, I agree the situation is a bit different for commoners vs. Sam Altman & co. I read your post as being about persons in general, even people who merely use the AI tools and thereby economically influence the domain via market forces. If that wasn’t just my misreading, you might simplify the discussion by editing your post to refer to those with a significant probability of making a difference (I interpret your reply that way; though I also don’t think the result changes much, as I try to explain).