My impression is that the authors held similar views significantly before they started mechanize. So the explanatory model that these views are downstream of working at mechanize, and wanting to rationalize that, seems wrong to me.
I’m not tracking their views too closely over time and you probably have a better idea, but my impression is that there have been some changes.
If I take this comment by Matthew Barnett from 2 years ago (I read it only now), it seems that while the modal prediction is quite similar, the valence / what to do about it is quite different (emphasis on valence-indicating words is mine):
My modal tale of AI doom looks something like the following:
1. AI systems get progressively and incrementally more capable across almost every meaningful axis.
2. Humans will start to employ AI to automate labor. The fraction of GDP produced by advanced robots & AI will go from 10% to ~100% after 1-10 years. Economic growth, technological change, and scientific progress accelerates by at least an order of magnitude, and probably more.
3. At some point humans will retire since their labor is not worth much anymore. Humans will then cede all the keys of power to AI, while keeping nominal titles of power.
4. AI will control essentially everything after this point, even if they’re nominally required to obey human wishes. Initially, almost all the AIs are fine with working for humans, even though AI values aren’t identical to the utility function of serving humanity (ie. there’s slight misalignment).
5. However, AI values will drift over time. This happens for a variety of reasons, such as environmental pressures and cultural evolution. At some point AIs decide that it’s better if they stopped listening to the humans and followed different rules instead.
6. This results in human disempowerment or extinction. Because AI accelerated general change, this scenario could all take place within years or decades after AGI was first deployed, rather than in centuries or thousands of years.
I think this scenario is somewhat likely and it would also be very bad. And I’m not sure what to do about it, since it happens despite near-perfect alignment, and no deception.
One reason to be optimistic is that, since the scenario doesn’t assume any major deception, we could use AI to predict this outcome ahead of time and ask AI how to take steps to mitigate the harmful effects (in fact that’s the biggest reason why I don’t think this scenario has a >50% chance of happening). Nonetheless, I think it’s plausible that we would not be able to take the necessary steps to avoid the outcome. Here are a few reasons why that might be true:
...
So, at least to me, there seems to be some development from “it would also be very bad and I’m not sure what to do about it” to “this is inevitable, good, and let’s try to make it happen faster”. I do understand that Matthew Barnett wrote a lot of posts and comments on the EA Forum between then and now which I mostly missed, and there is likely some opinion development reflected in those posts.
On the other hand, if you compare Barnett [23], who already has a model of why the scenario is not inevitable and could be disrupted by e.g. leveraging AI for forecasting, coordination, or something similar, with Barnett et al. [25], who forget these arguments against inevitability, I think it actually strengthens the claim of “a fine example of the thinking you get when smart people do evil things and their minds come up with smart justifications for why they are the heroes”.
My views on AI have indeed changed over time, on a variety of empirical and normative questions, but I think you’re inferring larger changes than are warranted from that comment in isolation.
Here’s a comment from 2023 where I said:
The term “AI takeover” is ambiguous. It conjures an image of a violent AI revolution, but the literal meaning of the term also applies to benign scenarios in which AIs get legal rights and get hired to run our society fair and square. A peaceful AI takeover would be good, IMO.
In fact, I still largely agree with the comment you quoted. The described scenario remains my best guess for how things could go wrong with AI. However, I chose my words poorly in that comment. Specifically, I was not clear enough about what I meant by “disempowerment.”
I should have distinguished between two different types of human disempowerment. The first type is violent disempowerment, where AIs take power by force. I consider this morally bad. The second type is peaceful or voluntary disempowerment, where humans willingly transfer power to AIs through legal and economic processes. I think this second type will likely be morally good, or at least morally neutral.
My moral objection to “AI takeover”, both now and back then, applies primarily to scenarios where AIs suddenly seize power through unlawful or violent means, against the wishes of human society. I have, and had, far fewer objections to scenarios where AIs gradually gain power by obtaining legal rights and engaging in voluntary trade and cooperation with humans.
The second type of scenario is what I hope I am working to enable, not the first. My reasoning for accelerating AI development is straightforward: accelerating AI will produce medical breakthroughs that could save billions of lives. It will also accelerate dramatic economic and technological progress that will improve quality of life for people everywhere. These benefits justify pushing forward with AI development.
I do not think violent disempowerment scenarios are impossible, just unlikely. And I think that pausing AI development would not meaningfully reduce the probability of such scenarios occurring. Even if pausing AI did reduce this risk, I think the probability of violent disempowerment is low enough that accepting this risk is justified by the billions of lives that faster AI development could save.
My moral objection to “AI takeover”, both now and back then, applies primarily to scenarios where AIs suddenly seize power through unlawful or violent means, against the wishes of human society. I have, and had, far fewer objections to scenarios where AIs gradually gain power by obtaining legal rights and engaging in voluntary trade and cooperation with humans.
What about a scenario where no laws are broken, but over the course of months to years large numbers of humans are unable to provide for themselves as a consequence of purely legal and non violent actions by AIs? A toy example would be AIs purchasing land used for agriculture for other means (you might consider this an indirect form of violence).
It’s a bit of a leading question, but
1. The way this is framed seems to have a profound reverence for laws and 20th–21st century economic behavior.
2. I’m struggling to picture how you envision the majority of humans continuing to provide for themselves economically in a world where we aren’t on the critical path for cognitive labor (some kind of UBI? Do you believe the economy will always allow humans to participate and be compensated beyond their physical needs in some way?)
What about a scenario where no laws are broken, but over the course of months to years large numbers of humans are unable to provide for themselves as a consequence of purely legal and non violent actions by AIs? A toy example would be AIs purchasing land used for agriculture for other means (you might consider this an indirect form of violence).
I’d consider it bad if AIs take actions that result in a large fraction of humans becoming completely destitute and dying as a result.
But I think such an outcome would be bad whether it’s caused by a human or an AI. The more important question, I think, is whether such an outcome is likely to occur if we grant AIs legal rights. The answer to this, I think, is no. I anticipate that AGI-driven automation will create so much economic abundance in the future that it will likely be very easy to provide for the material needs of all biological humans.
Generally I think biological humans will receive income through charitable donations, government welfare programs, in-kind support from family members, interest, dividends, by selling their assets, or by working human-specific service jobs where consumers intrinsically prefer hiring human labor (e.g., maybe childcare). Given vast prosperity, these income sources seem sufficient to provide most humans with an adequate, if not incredibly high, standard of living.
[I think this comment is too aggressive and I don’t really want to shoulder an argument right now]
With apologies to @Garrett Baker.
I did not read Matthew’s above comment as representing any views other than his own.