So: it seems as though the “default case” of a software company shipping an application would be that it crashes, or goes into an infinite loop—since that’s what happens unless steps are specifically taken to avoid it.
Not quite. The “default case” of a software company shipping an application is that there will definitely be bugs in the parts of the software they have not specifically and sufficiently tested… where “bugs” can mean anything from crashes or loops, to data corruption.
The analogy here—and it’s so direct and obvious a relationship that it’s a stretch to even call it an analogy! -- is that if you haven’t specifically tested your self-improving AGI for it, there are likely to be bugs in the “not killing us all” parts.
I repeat: we already know that untested scenarios nearly always have bugs, because human beings are bad at predicting what complex programs will do, outside of the specific scenarios they’ve envisioned.
And we are spectacularly bad at this, even for crap like accounting software. It is hubris verging on sheer insanity to assume that humans will be able to (by default) write a self-improving AGI that has to be bug-free from the moment it is first run.
The idea that a self-improving AGI has to be bug-free from the moment it is first run seems like part of the “syndrome” to me. Can the machine fix its own bugs? What about a “controlled ascent”? etc.
How do you plan to fix the bugs in its bug-fixing ability, before the bug-fixing ability is applied to fixing bugs in the “don’t kill everyone” routine? ;-)
More to the point, how do you know that you and the machine have the same definition of “bug”? That seems to me like the fundamental danger of self-improving AGI: if you don’t agree with it on what counts as a “bug”, then you’re screwed.
(Relevant SF example: a short story in which the AI ship—also the story’s narrator—explains how she corrected her creator’s all-too-human error: he said their goal was to reach the stars, and yet for some reason, he set their course to land on a planet. Silly human!)
What about a “controlled ascent”?
How would that be the default case, if you’re explicitly taking precautions?
It seems as though you don’t have any references for the supposed “hubris verging on sheer insanity”. Maybe people didn’t think that in the first place.
Computers regularly detect and fix bugs today—e.g. check out Eclipse.
I never claimed “controlled ascent” as being “the default case”. In fact I am here criticising “the default case” as weasel wording.
Not quite. The “default case” of a software company shipping an application is that there will definitely be bugs in the parts of the software they have not specifically and sufficiently tested… where “bugs” can mean anything from crashes or loops, to data corruption.
The analogy here—and it’s so direct and obvious a relationship that it’s a stretch to even call it an analogy! -- is that if you haven’t specifically tested your self-improving AGI for it, there are likely to be bugs in the “not killing us all” parts.
I repeat: we already know that untested scenarios nearly always have bugs, because human beings are bad at predicting what complex programs will do, outside of the specific scenarios they’ve envisioned.
And we are spectacularly bad at this, even for crap like accounting software. It is hubris verging on sheer insanity to assume that humans will be able to (by default) write a self-improving AGI that has to be bug-free from the moment it is first run.
The idea that a self-improving AGI has to be bug-free from the moment it is first run seems like part of the “syndrome” to me. Can the machine fix its own bugs? What about a “controlled ascent”? etc.
How do you plan to fix the bugs in its bug-fixing ability, before the bug-fixing ability is applied to fixing bugs in the “don’t kill everyone” routine? ;-)
More to the point, how do you know that you and the machine have the same definition of “bug”? That seems to me like the fundamental danger of self-improving AGI: if you don’t agree with it on what counts as a “bug”, then you’re screwed.
(Relevant SF example: a short story in which the AI ship—also the story’s narrator—explains how she corrected her creator’s all-too-human error: he said their goal was to reach the stars, and yet for some reason, he set their course to land on a planet. Silly human!)
How would that be the default case, if you’re explicitly taking precautions?
Controlled ascent isn’t the default case, but it certainly should be what provably friendly AI is weighed against.
It seems as though you don’t have any references for the supposed “hubris verging on sheer insanity”. Maybe people didn’t think that in the first place.
Computers regularly detect and fix bugs today—e.g. check out Eclipse.
I never claimed “controlled ascent” as being “the default case”. In fact I am here criticising “the default case” as weasel wording.
If it has a bug in its utility function, it won’t want to fix it.
If it has a bug in its bug-detection-and-fixing techniques, you can guess what happens.
So, no, you can’t rely on the AGI to fix itself, unless you’re certain that the bugs are localised in regions that will be fixed.
So: bug-free is not needed—and a controlled ascent is possible.
The unreferenced “hubris verging on sheer insanity” asumption seems like a straw man—nobody assumed that in the first place.