> These are incredibly small peanuts compared to AGI omnicide.
The jailbreakability and other alignment failures of current AI systems are also incredibly small peanuts compared to AGI omnicide. Yet they’re still informative. Small-scale failures give us data about possible large-scale failures.
> You’re somehow leaving out all the people who are smarter than those people, and who were great for the people around them and humanity? You’ve got like 99% actual alignment or something.
Are you thinking of people such as Sam Altman, Demis Hassabis, Elon Musk, and Dario Amodei? If humans are 99% aligned, how is it that we ended up in a situation where major lab leaders look so unaligned? MIRI and friends had a fair amount of influence to shape this situation and align lab leaders, yet they appear to have failed by their own lights. Why?
When it comes to AI alignment, everyone on this site understands that if a “boxed” AI acts nice, that’s not a strong signal of actual friendliness. The true test of an AI’s alignment is what it does when it has lots of power and little accountability.
Maybe something similar is going on for humans. We’re nice when we’re powerless, because we have to be. But giving humans lots of power with little accountability doesn’t tend to go well.
Looking around you, you mostly see nice humans. That could be because humans are inherently nice. It could also be because most of the people around you haven’t been given lots of power with little accountability.
Dramatic genetic enhancement could give enhanced humans lots of power with little accountability, relative to the rest of us.
[Note also, the humans you see while looking around are strongly selected for, which becomes quite relevant if the enhancement technology is widespread. How do you think you’d feel about humanity if you lived in Ukraine right now?]
> Which, yes, we should think about this, and prepare and plan and prevent, but it’s just a totally different calculus from AGI.
I want to see actual, detailed calculations of p(doom) from supersmart humans vs supersmart AI, conditional on each technology being developed. Before charging ahead on this, I want a superforecaster-type person to sit down, spend a few hours, generate some probability estimates, publish a post, and request that others red-team their work. I don’t feel like that is a lot to ask.
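As a toy illustration of the kind of decomposition such an estimate might start from (all numbers below are hypothetical placeholders, not actual estimates by anyone in this thread):

```python
# Toy sketch of a p(doom) comparison. Every number here is a
# hypothetical placeholder that a superforecaster would replace
# with a real, argued-for estimate.

def p_doom(p_developed: float, p_doom_given_developed: float) -> float:
    """Unconditional doom probability from one technology:
    P(doom from tech) = P(tech developed) * P(doom | tech developed)."""
    return p_developed * p_doom_given_developed

# Placeholder inputs, purely for illustration:
p_doom_superintelligent_ai = p_doom(p_developed=0.9, p_doom_given_developed=0.3)
p_doom_enhanced_humans = p_doom(p_developed=0.5, p_doom_given_developed=0.02)

print(f"AI:     {p_doom_superintelligent_ai:.3f}")
print(f"Humans: {p_doom_enhanced_humans:.3f}")
```

The interesting arguments are of course about the conditional terms, not the multiplication; the point of writing it out is that the two estimates can be red-teamed term by term.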
> Small-scale failures give us data about possible large-scale failures.
But you don’t go from a 160 IQ person with a lot of disagreeability and ambition, who ends up being a big commercial player or whatnot, to 195 IQ and suddenly get someone who just sits in their room for a decade and then speaks gibberish into a YouTube livestream and everyone dies, or whatever. The large-scale failures aren’t feasible for humans acting alone. For humans acting very much not alone, like big AGI research companies, yeah that’s clearly a big problem. But I don’t think the problem is about any of the people you listed having too much brainpower.
(I feel we’re somewhat talking past each other, but I appreciate the conversation and still want to get where you’re coming from.)
> For humans acting very much not alone, like big AGI research companies, yeah that’s clearly a big problem.
How about a group of superbabies that find and befriend each other? Then they’re no longer acting alone.
> I don’t think the problem is about any of the people you listed having too much brainpower.
I don’t think problems caused by superbabies would look distinctively like “having too much brainpower”. They would look more like the ordinary problems humans have with each other. Brainpower would be a force multiplier.
> (I feel we’re somewhat talking past each other, but I appreciate the conversation and still want to get where you’re coming from.)
Thanks. I mostly just want people to pay attention to this problem. I don’t feel like I have unique insight. I’ll probably stop commenting soon, since I think I’m hitting the point of diminishing returns.
> I mostly just want people to pay attention to this problem.
Ok. To be clear, I strongly agree with this. I think I’ve been responding to a claim (maybe explicit, or maybe implicit / imagined by me) from you like: “There’s this risk, and therefore we should not do this.” Where I want to disagree with the implication, not the antecedent. (I hope to more gracefully agree with things like this. Also, someone should make a LW post with a really catchy term for this implication / antecedent discourse thing, or link me the one that’s already been written.)
But I do strongly disagree with the conclusion “…we should not do this”, to the point where I say “We should basically do this as fast as possible, within the bounds of safety and sanity.” The benefits are large, the risks look not that bad and largely ameliorable, and in particular the need regarding existential risk is great and urgent.
That said, more analysis is definitely needed. Though in defense of the pro-germline engineering position, there are few resources, and everyone has a different objection.