I don’t think the point of the detailed stories is that the authors strongly expect that particular thing to happen? It’s just useful to have a concrete possibility in mind.
I bet this is mostly a training data limitation.
Someone at Google allegedly said explicitly that there wasn’t any possible evidence which would cause them to investigate the sentience of the AI.
I don’t think human-level AIs are safe, but I also think it’s pretty clear they’re not so dangerous that it’s impossible to use them without destroying the world. We can probably prevent them from being able to modify themselves, if we are sufficiently careful.
“A human-level AI will recursively self-improve to superintelligence if we let it” isn’t really that solid an argument here, I think.
I don’t think it is completely inconceivable that Google could make an AI which is surprisingly close to a human in a lot of ways, but it’s pretty unlikely.
But I don’t think an AI claiming to be sentient is very much evidence: it can easily do that even if it is not.
Even if it takes years, the “make another AGI to fight them” step would… require solving the alignment problem? So it would just give us some more time, and probably not nearly enough time.
We could shut off the internet/all our computers during those years. That would work fine.
So you think that, since morals are subjective, there is no reason to try to make an effort to control what happens after the singularity? I really don’t see how that follows.
I don’t understand precisely what question you’re asking. I think it’s unlikely we will happen to solve alignment by any method in the time frame between an AGI going substantially superhuman and the AGI causing doom.
Eliezer’s argument from the recent post:
The reason why nobody in this community has successfully named a ‘pivotal weak act’ where you do something weak enough with an AGI to be passively safe, but powerful enough to prevent any other AGI from destroying the world a year later—and yet also we can’t just go do that right now and need to wait on AI—is that nothing like that exists. There’s no reason why it should exist. There is not some elaborate clever reason why it exists but nobody can see it. It takes a lot of power to do something to the current world that prevents any other AGI from coming into existence; nothing which can do that is passively safe in virtue of its weakness. If you can’t solve the problem right now (which you can’t, because you’re opposed to other actors who don’t want the problem solved and those actors are on roughly the same level as you) then you are resorting to some cognitive system that can do things you could not figure out how to do yourself, that you were not close to figuring out because you are not close to being able to, for example, burn all GPUs. Burning all GPUs would actually stop Facebook AI Research from destroying the world six months later; weaksauce Overton-abiding stuff about ‘improving public epistemology by setting GPT-4 loose on Twitter to provide scientifically literate arguments about everything’ will be cool but will not actually prevent Facebook AI Research from destroying the world six months later, or some eager open-source collaborative from destroying the world a year later if you manage to stop FAIR specifically. There are no pivotal weak acts.
So do you think that instead we should just be trying to not make an AGI at all?
I think it is very unlikely that they need so much time as to make it viable to solve AI Alignment by then.
Edit: Looking at the rest of the comments, it seems to me like you’re under the (false, I think) impression that people are confident a superintelligence wins instantly? Its plan will likely take time to execute. Just not any more time than necessary. Days or weeks, it’s pretty hard to say, but not years.
Even if a decisive victory is a lot harder than most suspect, I think internet access is sufficient to buy a superintelligence plenty of time to think and maneuver into a position where it can take a decisive action if it’s possible at all.
I think if we notice that the AGI has gone off the rails and kill the internet, things might be recoverable? But it feels possible for the AGI to hide that this happened.
The obvious option in this class is to try to destroy the world in a way that doesn’t send an AI out to eat the lightcone, which might contain aliens who could have a better shot.
I am really not a fan of this option.
I think the number of dath ilani who think of the concept of AGI is plausibly pretty substantial, but they just go talk to a Keeper about it.
Not being able to send messages too complex for humans to understand seems to me like it’s plausibly a benefit for many of the cases where you’d want to do this.
AIs are like genies: they’ll fulfill the literal words of your wish, but don’t care at all about the spirit.
Humans are pretty clever, but AI will eventually be even more clever. If you give a powerful enough AI a task, it can direct a level of ingenuity towards it far greater than that of history’s smartest scientists and inventors. But there are many cases of people accidentally giving an AI imperfect instructions.
If things go poorly, such an AI might notice that taking over the world would give it access to lots of resources helpful for accomplishing its task. If this ever happens, even once, with an AI smart enough to escape any precautions we set and succeed at taking over the world, then there will be nothing humanity can do to fix things.
I’d argue it’s a pun.
Was the “glowfic excerpts” link supposed to be Self Integrity and the Drowning Child?
Can this be partially fixed by using uBlock Origin or whatever to hide certain elements of the page? I’d expect it to help at least imperfectly, not sure if you’ve tried it.
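For what it’s worth, here’s a minimal sketch of the kind of cosmetic filters I have in mind, pasted into uBlock Origin’s “My filters” tab. The selectors are placeholders I made up; you’d use uBlock Origin’s element picker (or the browser inspector) to find the real class names of whatever elements you want hidden.

! Placeholder selectors — replace with the actual elements that bother you
www.lesswrong.com##.hypothetical-distracting-widget
www.lesswrong.com##.hypothetical-side-panel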