since the necessary superintelligent infrastructure would only take a fraction of the resources allocated to the future of humanity.
I’m not sure about that and the surrounding argument. I find Eliezer’s analogy compelling here: when constructing a Dyson sphere around the sun, leaving just a tiny sliver of light, enough for Earth, would correspond to a couple of dollars of the wealth of a contemporary billionaire. Yet you don’t get those couple of dollars.
(This analogy has caveats, like Jeff Bezos lifting the Apollo 11 rocket engines from the ocean floor and giving them to the Smithsonian, which should be worth something to you. Alas, it kind of means you don’t get to choose what you get. Maybe it is storage space for your brain scan, as in AI 2027.)
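A quick back-of-the-envelope check to make the analogy’s orders of magnitude concrete. The sketch below assumes standard values for Earth’s radius and the Earth-Sun distance, plus an illustrative $100B net worth of my own choosing; depending on which billionaire you pick, the answer lands anywhere from a few dollars to a few tens of dollars.

```python
# Rough check of the Dyson sphere analogy: the fraction of the Sun's output
# that Earth intercepts, expressed as a share of a billionaire's wealth.
import math

R_EARTH = 6.371e6   # Earth's radius in meters
AU = 1.496e11       # Earth-Sun distance in meters

# Earth's cross-sectional disk divided by the full sphere at 1 AU.
fraction = math.pi * R_EARTH**2 / (4 * math.pi * AU**2)   # ~4.5e-10

NET_WORTH = 100e9   # illustrative net worth in dollars (an assumption)
print(f"fraction of sunlight reaching Earth: {fraction:.2e}")
print(f"equivalent share of a ${NET_WORTH:,.0f} fortune: ${fraction * NET_WORTH:.2f}")
```

Either way the sliver of sunlight is trivially affordable for the sphere’s builder, which is the analogy’s point: the question is not cost but whether anyone bothers.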
Plus, spelling out the Dyson sphere thing: The superintelligent infrastructure should highly likely by default get in the way of humanity’s existence at some point. At that point the AIs will have to consciously decide to avoid that, at some cost to themselves. Humanity has a bad track record at doing that (not completely sure here, but thinking of e.g. Meta’s effect on the wellbeing of teenage girls). So why would AIs be more willing to do it?
He spells out possible reasons in the paragraph immediately following your quote: “Pretraining of LLMs on human data or weakly successful efforts at value alignment might plausibly seed a level of value alignment that’s comparable to how humans likely wouldn’t hypothetically want to let an already existing sapient octopus civilization go extinct”. If you disagree, you should respond to those. Most people on LW are already aware that ASIs would need some positive motivation to preserve human existence.
I think that AI will also preserve humans for utilitarian reasons, like trading with possible aliens, simulation owners, or even its own future versions – to demonstrate trustworthiness.
Yes, and my reply to that (above) is that humanity has a bad track record at that, so why would AIs trained on human data be better? Think also of indigenous peoples, extinct species humans didn’t care enough about, etc. The point of the Dyson sphere parable is also not just wanting something, it’s wanting it enough that it actually happens.
OK I see, didn’t get the connection there.
People do devote some effort to things like preserving endangered species, things of historical significance that are no longer immediately useful, etc. If AIs devoted a similar fraction of their resources to humans, that would be enough to preserve our existence.
Agree, but again, we don’t get to choose what existence means.
The text you quoted is about what happens within the resources already allocated to the future of humanity (for whatever reasons), the overhead of turning those resources into an enduring good place to live, and keeping the world at large safe from humanity’s foibles, so that it doesn’t end up more costly than just those resources. Plausibly there is no meaningful spatial segregation to where the future of humanity computes (or otherwise exists); it’s just another aspect of what is happening throughout the reachable universe, within its share of compute.
a tiny sliver of light, enough for Earth, would correspond to a couple of dollars of the wealth of a contemporary billionaire
Many issues fall into reference classes where solving all instances of them is not affordable to billionaires, governments, or medieval kingdoms. And there is enough philanthropy that the analogy doesn’t by itself seem compelling, given that humanity as a whole (rather than particular groups within it or individuals) is a sufficiently salient thing in the world, and the cost of preserving it actually is quite affordable this time, especially using the cheapest possible options, which still only need a modest tax (in terms of matter/compute) to additionally get the benefits of superintelligent governance.
superintelligent infrastructure should highly likely by default get in the way of humanity’s existence at some point
Yes, the intent to preserve the future of humanity needs to crystallize soon enough that there is still something left. The cheapest option might be to digitize everyone and either upload them or physically reconstruct them when more convenient (because immediately ramping the industrial explosion starting on Earth is valuable for capturing the cosmic endowment that’s running away due to the accelerating expansion of the universe, so that you irrevocably lose a galaxy in expectation every few years of delay). But again, in the quoted text “superintelligent infrastructure” refers to whatever specifically keeps the future of humanity in a good shape (as well as making it harmless), rather than to the rest of the colonized cosmic endowment doing other things.
Thanks for the reply, I have gripes with
analogy doesn’t by itself seem compelling, given that humanity as a whole (rather than particular groups within it or individuals) is a sufficiently salient thing in the world
etc., because don’t you think that humanity, from the point of view of an ASI at the ‘branch point’ of deciding on its continued existence, may well be on the order of importance of an individual to a billionaire?
Minimal alignment is a necessary premise; I’m not saying humanity’s salience as a philanthropic cause is universally compelling to AIs. There are a number of observations that make this case stronger: the language prior in LLMs, preference training for chatbots, the possibility that the first AGIs need nothing fundamentally different from this, and an AGI-driven Pause on superintelligence increasing the chances that the eventual superintelligences in charge are strongly value aligned with these first AGIs. Then, in addition to the premise of a minimally aligned superintelligence, there’s the essentially arbitrarily small cost of a permanently disempowered future of humanity.
So the overall argument indeed doesn’t work without humanity actually being sufficiently salient to the values of superintelligences that are likely to end up in charge, and the argument from low cost only helps up to a point.
I agree, I’m probably not as sure about sufficient alignment but yes.
I suppose this also assumes a kind of orderly world in which preserving humans actually is within the means of humanity, within the means of AGIs (within their Molochian frames), and within the trivial means of later superintelligences. (US office construction spending and data center spending are about to cross: https://x.com/LanceRoberts/status/1953042283709768078 .)
Is this the result you expect by default? Or is this just one of many unlikely scenarios (like Hanson’s ‘The Age of Em’) that are worth considering?
Yes, the future of humanity being a good place to live (within its resource constraints) follows from it being cheap for superintelligence to ensure (given that it’s decided to let it exist at all), while the constraint of permanent disempowerment (at some level significantly below all of cosmic endowment) is a result of not placing the future of humanity at the level of superintelligence’s own interests. Maybe there’s 2% for actually capturing a significant part of the cosmic endowment (the eutopia outcomes), and 20% for extinction. I’m not giving s-risks much credence, but maybe they still get 1% when broadly construed (any kind of warping in the future of humanity that’s meaningfully at odds with what humanity and even individual humans would’ve wanted to happen on reflection, given the resource constraints to work within).
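Reading those numbers together, and assuming the listed categories are meant to be roughly exhaustive (my assumption, not something stated above), the implied probability of the default outcome under discussion, a preserved but permanently disempowered humanity, is just the remainder:

```python
# Implied remainder of the stated estimates, assuming the listed outcome
# categories are meant to be roughly exhaustive (my assumption).
p_eutopia    = 0.02  # capturing a significant part of the cosmic endowment
p_extinction = 0.20
p_s_risk     = 0.01  # broadly construed
p_preserved_but_disempowered = 1 - (p_eutopia + p_extinction + p_s_risk)
print(p_preserved_but_disempowered)  # ~0.77
```

So on these numbers the disempowered-but-preserved future is indeed the modal outcome.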
I should also clarify that by “making it harmless” I simply mean the future of humanity being unable to actually do any harm in the end, perhaps through lacking direct access to the physical level of the world. The point is to avoid negative externalities for the hosting superintelligence, so that the necessary sliver of compute stays within budget. This doesn’t imply any sinister cognitive changes that make the future of humanity incapable of considering the idea or working in that direction.