Two main objections to (the tail end of) this story are:
On one hand, it’s not clear if a system needs to be all that super-smart to design a devastating attack of this kind (we are already at risk of fairly devastating tech-assisted attacks in that general spirit, mostly with synthetic biological viruses at the moment, and those risks are growing regardless of the AGI/superintelligence angle; ordinary tech progress is quite sufficient in this sense)
If one has a rapidly self-improving strongly super-intelligent distributed system, it’s unlikely that it would find it valuable to directly attack people in this fashion, as it is likely to be able to easily dominate without any particularly drastic measures (and probably would not want to irreversibly destroy important information without good reasons)
The actual analysis of the “transition period”, of the “world with super-intelligent systems” period, and of the likely risks associated with both periods is a much more involved and open-ended task. (One of the paradoxes is that the risks of the kind described in the OP are probably higher during the “transition period”, and the main risks associated with the “world with super-intelligent systems” period are likely to be quite different.)
On one hand, it’s not clear if a system needs to be all that super-smart to design a devastating attack of this kind...
Good point, but—and as per your second point too—this isn’t an “attack”, it’s “go[ing] straight for execution on its primary instrumental goal of maximally increasing its compute scaling” (i.e. humanity and biological life dying is just collateral damage).
probably would not want to irreversibly destroy important information without good reasons
Maybe it doesn’t consider the lives of individual organisms as “important information”? But if it did, it might do something like scan as it destroys, to retain the information content.
this isn’t an “attack”, it’s “go[ing] straight for execution on its primary instrumental goal
yes, the OP is ambiguous in this sense
I first wrote my comment, then reread the (tail end of the) post again, and did not post it, because I thought it could have been formulated this way, that this is just an instrumental goal
then I reread the (tail end of the) post one more time, and decided that no, the post does actually make it a “power play”; that’s how it is actually written, in terms of “us vs them”, not in terms of the ASI’s own goals, and then I posted this comment
maximally increasing its compute scaling
as we know, compute is not everything, algorithmic improvement is even more important, at least if one judges by the current trends (and likely sources of algorithmic improvement should be cherished)
and this is not a static system; it is in the process of making its compute architecture better (just like there is no point in making too many H100 GPUs when better and better GPUs are being designed and introduced)
basically, a smart system is likely to avoid doing an excessive amount of irreversible things which might turn out to be suboptimal
But, in some sense, yes, the main danger is of AIs not being smart enough in terms of the ability to manage their own affairs well; the action the ASI is taking in the OP is very suboptimal and deprives it of all kinds of options
Just like the bulk of the danger in the “world with superintelligent systems” is ASIs not managing their own existential risk problems correctly, destroying the fabric of reality, themselves, and us as collateral damage
Where does my writing suggest that it’s a “power play” and “us vs them”? (That was not the intention at all! I’ve always seen indifference and “collateral damage” as the biggest part of ASI x-risk.)
as we know, compute is not everything, algorithmic improvement is even more important
It should go without saying that it would also be continually improving its algorithms. But maybe I should’ve made that explicit.
the action the ASI is taking in the OP is very suboptimal and deprives it of all kinds of options
What are some examples of these options?
then it would be better to use an example not directly aimed against “our atoms” (e.g. if they don’t care about us and other animals, we’ll probably perish from unintentional changes in air composition, or something like that)
but the bulk of the risk would be a miscalculation which would be big enough to kill them as well (mucking with quantum gravity too recklessly, or something in that spirit)
which is why we want to 1) give birth to AIs competent enough to at least solve their own existential risk problem, and 2) also have them sustainably include us in their “circle of care”
The 1) is the starting point, and it colors the overall situation differently
options
oh, first of all, they need to do experiments in forming hybrid consciousness with humans to crack the mystery of human subjectivity, to experience that first-hand for themselves, and to decide whether that is of any value to them based on the first-hand empirical material (losing that option without looking is a huge loss)
only after doing that would they know whether any possible “scans” are sufficient (to actually reproduce the working people in question when needed; dead static info is as useless as the weights of a Transformer in a world without computers)
then, for a while, they can use humans as “working oracles” who “think differently” (that would be valuable for quite a while)
in general, diversity is important, fruits of a long evolutionary history are important, hence a good deal of conservation is important and reckless destruction is bad (even humans with all their follies have started to get this by now; surely a smarter entity should figure that out)
then it would be better to use an example not directly aimed against “our atoms”
All the atoms are getting repurposed at once, with no special focus on those in our bodies (but there is such a focus in the story, to get the reader to empathise). Maybe I could’ve included more description of non-living things getting destroyed.
mucking with quantum gravity too recklessly, or something in that spirit
I’m trying to focus on plausible science/tech here.
they need to do experiments in forming hybrid consciousness with humans to crack the mystery of human subjectivity, to experience that first-hand for themselves, and to decide whether that is of any value to them based on the first-hand empirical material (losing that option without looking is a huge loss)
Interesting. But even if they do find something valuable in doing that, there’s not much reason to keep the vast majority of humans around. And as you say, they could just end up as “scans”, with very few being run as oracles.
Even humans have been making decent progress in quantum gravity in recent years. And they have started to talk about possible ways to progress towards empirical verification of their models.
Entities which are much smarter than humans are extremely likely to solve it rapidly and to see all kinds of tempting novel applications.
Unfortunately, the potential downsides are also likely to be way more serious than those of nuclear weapons.
The risks as such are not associated with the abstract notion of AI; they are associated with capabilities. It’s not about the nature of the capability bearer (a “very decent” entity can mitigate the risks, but letting the downsides happen does not require an “unusually bad” entity).
The important capability is not being very efficient at greedily squeezing every last bit of use from every last atom, but being able to discover new laws of nature and to exploit the consequences of that.
Qualitative progress is more important than quantitative scaling at a fixed level of tech.
being able to discover new laws of nature and to exploit the consequences of that.
Ok, but I think that still basically leads to the same end result: all humans (and biological life) dead.
It seems odd to think that it’s more likely such a discovery would lead to the AI disappearing into its own universe (like in Egan’s Crystal Nights) than just obliterating our Solar System with its newfound powers. Nothing analogous has happened in the history of human science and tech development (we have only become more destructive of other species and their habitats).
They are not going to “disappear into their own universe”.
But if they want to survive, they’ll need to establish a reasonable society which controls dangerous technologies and to establish some degree of harmony among its members. Otherwise they will obliterate themselves together with our Solar System.
So, a good chunk of the problem of existential safety will be worked on by entities smarter than humans and better equipped than we are to solve it.
The open question is whether we’ll be included in their “circle of care”.
I think there is a good chance of that, but it depends a lot on how the ASI society will be structured and what its values will be.
We all tend to think of the ASI society as structured into well-defined individuals with long persistence. If we assume that the structure is indeed mostly individual-based (as almost all existential risk discourse assumes), then there are several realistic paths for all kinds of individuals, human and non-human, to be included in the “circle of care”.
One problem is that this assumption of the ASI society being mostly structured as well-defined persistent individuals with long-term interests is questionable, and without that assumption we just don’t know how to reason about this whole situation.
One problem is that this assumption of the ASI society being mostly structured as well-defined persistent individuals with long-term interests is questionable
Very questionable. Why would it be separate individuals in a society, and not be—or just very rapidly collapse into—a singleton? In fact, the dominant narrative here on LW has always featured a singleton ASI as the main (existential) threat. And my story here reflects that.
I think the “singleton” case is generally not sufficiently analyzed in the literature. It is treated as something magical, without an internal structure that could be discussed. A rationalist analysis should aim to do better than that.
Nobody is asking what might be inside, whether it would still be a Minsky-style “society of mind”, and, if so, what the relationships might be between the various components of that “society of mind”, and so on.
In particular, how would it evolve its own internal structure, and its distribution of goals, and so on.
People seem to be hypnotized by it being an “all-powerful God”, and this somehow prevents them from trying to think about how it might work (given that the Universe will still not be fully known, there will still be quite a bit of value in open-endedness, in discovery, and so on).
But all this does not imply that we can rely upon stratification into individuals being the most likely default scenario.
Still, the bulk of the risk is the self-destruction of the whole ecosystem of super-intelligent AIs together with everything else, regardless of how it is structured and stratified. A singleton is just as likely to stumble into unsafe experiments in fundamental physics, as long as its internal critics are not strong enough.
An ecosystem of super-intelligent AIs (regardless of how it is structured and stratified) which is decent enough to navigate this main risk is not a bad starting point from the viewpoint of human interests either. Something is sufficiently healthy within it if it can reliably avoid self-destruction; see my earlier note for more details: https://www.lesswrong.com/posts/WJuASYDnhZ8hs5CnD/exploring-non-anthropocentric-aspects-of-ai-existential