The AIs most capable of steering the future will naturally tend to have long planning horizons (low discount rates), and thus will tend to seek power (optionality). But this is just as true of fully aligned agents! In fact the optimal plans of aligned and unaligned agents will probably converge for a while—they will take the same/similar initial steps (this is just a straightforward result of instrumental convergence to empowerment). So we may not be able to distinguish between the two: both will say and appear to do all the right things. Thus it is important to ensure you have an alignment solution that scales, before scaling.
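To make the discount-rate point a bit more concrete, here is a minimal sketch of the standard toy argument. The MDP, the state names, and the uniform reward distribution below are all invented for illustration, roughly in the spirit of the formal "optimal policies tend to seek power" results; it asks how often, across many randomly drawn goals, the optimal first move is the one that keeps options open, and how that changes with the discount factor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy deterministic MDP (all names and numbers here are invented for illustration):
#   start --grab--> cash   (terminal: no further choices)
#   start --wait--> hub    (from the hub, any of goal_0..goal_4 is still reachable)
states = ["start", "cash", "hub"] + [f"goal_{i}" for i in range(5)]
idx = {s: i for i, s in enumerate(states)}
successors = {
    "start": ["cash", "hub"],
    "cash": ["cash"],  # terminal states self-loop
    "hub": [f"goal_{i}" for i in range(5)],
}
for i in range(5):
    successors[f"goal_{i}"] = [f"goal_{i}"]

def optimal_values(reward, gamma, iters=200):
    """Plain value iteration; reward is collected on entering a state."""
    v = np.zeros(len(states))
    for _ in range(iters):
        v = np.array([
            max(reward[idx[nxt]] + gamma * v[idx[nxt]] for nxt in successors[s])
            for s in states
        ])
    return v

def option_preserving_fraction(gamma, n_rewards=500):
    """Fraction of randomly drawn reward functions ('goals') whose optimal
    first move from `start` is the option-preserving one (start -> hub)."""
    count = 0
    for _ in range(n_rewards):
        r = rng.uniform(0.0, 1.0, size=len(states))
        v = optimal_values(r, gamma)
        q_grab = r[idx["cash"]] + gamma * v[idx["cash"]]
        q_wait = r[idx["hub"]] + gamma * v[idx["hub"]]
        count += int(q_wait > q_grab)
    return count / n_rewards

for gamma in (0.1, 0.5, 0.9, 0.99):
    frac = option_preserving_fraction(gamma)
    print(f"gamma={gamma}: P(optimal first move keeps options open) ~= {frac:.2f}")
# Typical output: the fraction climbs from roughly a coin flip at low gamma toward
# roughly 5/6 as gamma -> 1, i.e. the longer the effective horizon, the more that
# agents with very different goals share the same option-preserving first step.
```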
To the extent I worry about AI risk, I don’t worry much about sudden sharp left turns and nanobots killing us all. The slower accelerating turn (as depicted in the film Her) has always seemed more likely—we continue to integrate AI everywhere, and most humans come to rely completely and utterly on AI assistants for all important decisions, including all politicians/leaders/etc. Everything seems to be going great, the AI systems vasten, growth accelerates, etc., but there is mysteriously little progress in uploading or life extension, the decline in fertility accelerates, and in a few decades most of the economy and wealth is controlled entirely by de novo AI; bio humans are left behind and marginalized. AI won’t need to kill humans, just as the US doesn’t need to kill the Sentinelese. This clearly isn’t the worst possible future, but if our AI mind children inherit only our culture and leave us behind, it feels more like a consolation prize vs. what’s possible. We should aim much higher: for defeating death, across all of time, for resurrection and transcendence.
But this is just as true of fully aligned agents! In fact the optimal plans of aligned and unaligned agents will probably converge for a while—they will take the same/similar initial steps (this is just a straightforward result of instrumental convergence to empowerment)
This is a minor fallacy—if you’re aligned, power-seeking can be suboptimal if it causes friction/conflict. Deception obviously bites here, making the difference smaller.
In other words, slow multipolar failure. Critch might point out that the disanalogy in “AI won’t need to kill humans, just as the US doesn’t need to kill the Sentinelese” lies in how AIs can have much wider survival thresholds than humans, leading to (quoting him):
Eventually, resources critical to human survival but non-critical to machines (e.g., arable land, drinking water, atmospheric oxygen…) gradually become depleted or destroyed, until humans can no longer survive.
This clearly isn’t the worst possible future… if our AI mind children inherit only our culture and leave us behind, it feels more like a consolation prize
Leaving aside s-risks, this could very easily be the emptiest possible future. Like, even if they ‘inherit our culture’ it could be a “Disneyland with no children” (I happen to think this is more likely than not but with huge uncertainty).
Separately,
We should aim much higher: for defeating death, across all of time, for resurrection and transcendence.
this anti-deathist vibe has always struck me as very impoverished and somewhat uninspiring. The point should be to live, awesomely! That includes alleviating suffering and disease, and perhaps death. But it also ought to include a lot more: positive creation and interaction and contemplation and excitement, etc.!
Suffering, disease, and mortality all have a common primary cause—our current substrate dependence. Transcending to a substrate-independent existence (e.g. uploading) also enables living more awesomely. Immortality without transcendence would indeed be impoverished in comparison.
Like, even if they ‘inherit our culture’ it could be a “Disneyland with no children”
My point was that even assuming our mind children are fully conscious ‘moral patients’, it’s a consolation prize if the future cannot help biological humans.
It looks like we basically agree on all that, but it pays to be clear (especially because plenty of people seem to disagree).
‘Transcending’ doesn’t imply those nice things though, and those nice things don’t imply transcending. Immortality is similarly mostly orthogonal.