The default case of FOOM is an unFriendly AI, built by researchers with shallow insights. This AI becomes able to improve itself in a haphazard way, makes various changes that are net improvements but may introduce value drift, and then gets smart enough to do guaranteed self-improvement, at which point its values freeze (forever).
Yudkowsky calls it “The default case”—e.g. here:
...however, it is not terribly clear what being “the default case” is actually supposed to mean.
Seems plausible to interpret “default case” as meaning “the case that will most probably occur unless steps are specifically taken to avoid it”.
For example, the default case of knocking down a beehive is that you’ll get stung; you avoid that default case by specifically anticipating it and taking countermeasures (e.g. wearing a bee-keeping suit).
So: it seems as though the “default case” of a software company shipping an application would be that it crashes, or goes into an infinite loop—since that’s what happens unless steps are specifically taken to avoid it.
The term “the default case” seems to be a way of making the point without being specific enough to attract the attention of critics.
Not quite. The “default case” of a software company shipping an application is that there will definitely be bugs in the parts of the software they have not specifically and sufficiently tested… where “bugs” can mean anything from crashes or loops, to data corruption.
The analogy here (and it’s so direct and obvious a relationship that it’s a stretch to even call it an analogy!) is that if you haven’t specifically tested your self-improving AGI for it, there are likely to be bugs in the “not killing us all” parts.
I repeat: we already know that untested scenarios nearly always have bugs, because human beings are bad at predicting what complex programs will do, outside of the specific scenarios they’ve envisioned.
And we are spectacularly bad at this, even for crap like accounting software. It is hubris verging on sheer insanity to assume that humans will, by default, write a self-improving AGI that is bug-free from the moment it is first run.
The idea that a self-improving AGI has to be bug-free from the moment it is first run seems like part of the “syndrome” to me. Can the machine fix its own bugs? What about a “controlled ascent”? etc.
How do you plan to fix the bugs in its bug-fixing ability, before the bug-fixing ability is applied to fixing bugs in the “don’t kill everyone” routine? ;-)
More to the point, how do you know that you and the machine have the same definition of “bug”? That seems to me like the fundamental danger of self-improving AGI: if you don’t agree with it on what counts as a “bug”, then you’re screwed.
(Relevant SF example: a short story in which the AI ship—also the story’s narrator—explains how she corrected her creator’s all-too-human error: he said their goal was to reach the stars, and yet for some reason, he set their course to land on a planet. Silly human!)
How would that be the default case, if you’re explicitly taking precautions?
Controlled ascent isn’t the default case, but it certainly should be what provably friendly AI is weighed against.
It seems as though you don’t have any references for the supposed “hubris verging on sheer insanity”. Maybe people didn’t think that in the first place.
Computers regularly detect and fix bugs today—e.g. check out Eclipse.
I never claimed “controlled ascent” as being “the default case”. In fact I am here criticising “the default case” as weasel wording.
If it has a bug in its utility function, it won’t want to fix it.
If it has a bug in its bug-detection-and-fixing techniques, you can guess what happens.
So, no, you can’t rely on the AGI to fix itself, unless you’re certain that the bugs are localised in regions that will be fixed.
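The first of those two failure modes can be sketched in a few lines of toy code (all names here are invented for illustration; no real AGI architecture is implied): give an agent a utility function with a one-character sign bug, and note that it evaluates any proposed patch with that same buggy function, so staying buggy looks better to it than being fixed.

```python
# Toy illustration (hypothetical; no real AGI design implied): a
# one-character sign bug in the utility function.  The agent judges a
# proposed patch with its CURRENT utility function, so it prefers to
# stay buggy rather than accept the fix.

def choose(utility, outcomes):
    """Pick the outcome this utility function ranks highest."""
    return max(outcomes, key=utility)

outcomes = [-10, 0, 10]          # measured in "human welfare"
intended = lambda w: w           # what the designers meant
buggy    = lambda w: -w          # the shipped, sign-flipped version

# What each version of the agent would actually bring about:
assert choose(intended, outcomes) == 10
assert choose(buggy, outcomes) == -10

# The buggy agent scores "accept the patch" vs "stay as I am" using its
# own (buggy) utility function, and staying buggy wins:
score_if_patched = buggy(choose(intended, outcomes))   # -10
score_if_buggy   = buggy(choose(buggy, outcomes))      #  10
assert score_if_buggy > score_if_patched
```

The point of the sketch is only that self-repair is evaluated from inside the agent’s current preferences, so a bug in the preferences themselves is exactly the kind of bug that self-repair will not touch.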
So: bug-free is not needed—and a controlled ascent is possible.
The unreferenced “hubris verging on sheer insanity” assumption seems like a straw man—nobody assumed that in the first place.
I think your analogy is apt. It’s a similar argument for FAI; just as a software company should not ship a product without first running it through some basic tests to make sure it doesn’t crash, so an AI developer should not turn on their (edit: potentially-FOOMing) AI unless they’re first sure it is Friendly.
Well, I hope you see what I mean.
If the “default case” is that your next operating system upgrade will crash your computer or loop forever, then maybe you have something to worry about—and you should probably do an extensive backup, with this special backup software I am selling.
It would certainly be the default case for untested operating system upgrades. Whenever I write a program, even a small program, it usually doesn’t work the first time I run it; there’s some mistake I made and have to go back and fix. I would never ship software that I hadn’t at least run on my own to make sure it does what it’s supposed to.
The problem with that when it comes to AI research, according to singulitarians, is that there’s no safe way to do a test run of potentially-FOOMing software; mistakes that could lead to unFriendliness have to be found in some way that doesn’t involve running the code, even in a test environment.
That just sounds crazy to me :-( Are these people actual programmers? How did they miss out on having the importance of unit tests drilled into them?
The problem is that running the AI might cause it to FOOM, and that could happen even in a test environment.
How do you get from that observation to the idea that running a complete untested program in the wild is going to be safer than not testing it at all?
No, the proposed solution is to first formally validate the program against some FAI theory before doing any test runs.
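For flavour, here is what “validate without doing any test runs” can mean in ordinary software, scaled down to a toy (the three-state model and the property are invented for illustration): exhaustively check a model of the program against a safety property, without ever executing the program itself.

```python
# Toy model checking (invented 3-state model; a real system, let alone an
# AGI, has an astronomically larger state space, which is exactly the
# limitation critics of provable correctness point at).  We verify a
# safety property by exhaustive search over the MODEL, never executing
# the system being modelled.

TRANSITIONS = {
    "idle":     ["planning"],
    "planning": ["acting", "idle"],
    "acting":   ["idle"],
    # "self_modifying" exists in the design space but has no inbound edge.
}
UNSAFE = {"self_modifying"}

def reachable(start):
    """All states reachable from `start` via the transition relation."""
    seen, frontier = set(), [start]
    while frontier:
        state = frontier.pop()
        if state not in seen:
            seen.add(state)
            frontier.extend(TRANSITIONS.get(state, []))
    return seen

# Safety property: no unsafe state is reachable from the initial state.
assert reachable("idle").isdisjoint(UNSAFE)
```

The catch, as the thread goes on to argue, is that this only works when the model is both small enough to search and faithful enough to trust.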
This idea is proposed by people with little idea of the value of testing—and little knowledge of the limitations of provable correctness—I presume.
In fact, who has supposedly proposed this idea? What did they actually say?
Also, you are now talking about performing “test runs”. Is that doing testing, now?
The usefulness of testing is beside the point. The argument is that testing would be dangerous.
By “testing” I meant “running the code to see if it works”, which includes unit testing individual components, integration or functional testing on the program as a whole, or the simple measure of running the program and seeing if it does what it’s supposed to. By “doing test runs” I meant doing either of the latter two.
I would never ship, or trust in production, a program that had only been subjected to unit tests. This poses a problem for AI researchers, because while unit testing a potentially-FOOMing AI might well be safe (and would certainly be helpful in development), testing the whole thing at once would not be.
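The distinction can be made concrete with a toy example (the “planner” component and its functions are invented for illustration): a unit test exercises one component in isolation on inputs the author foresaw, while an integration or functional test runs the assembled program end to end—and the latter is the step the argument says is unsafe for a potentially-FOOMing AI.

```python
# A minimal sketch of unit vs. integration testing, using a hypothetical
# "planner" component as a stand-in for one module of a larger system.

def plan_route(grid_width, start, goal):
    """Toy component (hypothetical): steps between two cells on a 1-D grid."""
    if not (0 <= start < grid_width and 0 <= goal < grid_width):
        raise ValueError("position off the grid")
    return abs(goal - start)

def run_system(grid_width, waypoints):
    """Toy 'whole program': chains the component over several legs."""
    return sum(plan_route(grid_width, a, b)
               for a, b in zip(waypoints, waypoints[1:]))

# Unit test: one component, in isolation, on inputs the author foresaw.
assert plan_route(10, 2, 7) == 5

# Integration / functional test: run the assembled program end to end.
assert run_system(10, [0, 3, 1]) == 5
```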
I think EY’s the original person behind a lot of this, but now the main visible proponents seem to be SIAI. Here’s a link to the big ol’ document they wrote about FAI.
On the specific issue of having to formally prove friendliness before launching an AI, I can’t find anything specific in there at the moment. Perhaps that notion came from elsewhere? I’m not sure. But it seems to follow straightforwardly from the premises of the argument (AGI might FOOM; we want to make sure it FOOMs into something Friendly; we cannot risk running the AGI unless we know it will) that you’d have to have some way of showing that an AGI codebase is Friendly without running it, and the only other way I can think of would be to apply a rigorous proof.
Life is dangerous: the issue is surely whether testing is more dangerous than not testing.
It seems to me that a likely outcome of pursuing a strategy involving searching for a proof is that, while you are searching for it, some other team makes a machine intelligence that works, and suddenly whether your machine is “friendly” or not becomes totally irrelevant.
I think bashing testing makes no sense. People are interested in proving what they can about machines—in the hope of making them more reliable—but that is not the same as not doing testing.
The idea that we can make an intelligent machine—but are incapable of constructing a test harness capable of restraining it—seems like a fallacy to me.
Poke into these beliefs, and people will soon refer you to the AI-box experiment, which purports to show that restrained intelligent machines can trick human gatekeepers.
...but so what? You don’t imprison a super-intelligent agent—and then give the key to a single human and let them chat with the machine!
The “default case” occurs when not specifically avoided.
The company making the OS upgrade is going to do their best to avoid the computers it’s installed on crashing. In fact, they’ll probably hire quality control experts to make certain of it.
Why should AGI not have quality control?
It definitely should have quality control.
The whole point of the ‘Scary Idea’ is that there should be effective quality control for AGI; otherwise the risks are too big.
At the moment humanity has no idea how to perform effective quality control here—which would be some way to check whether an arbitrary AI-in-a-box is Friendly.
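One reason such a check is hard can be sketched as a toy (all function names invented; the general obstruction is Rice’s theorem, which says every non-trivial semantic property of arbitrary programs is undecidable): a checker that statically scans code for a forbidden behaviour is defeated by trivial indirection.

```python
# Toy static "Friendliness" check (hypothetical names).  It scans the AST
# for direct calls to a forbidden function, and is defeated by trivial
# indirection: the sneaky program below has the same behaviour as the
# honest one, yet passes the check.
import ast

FORBIDDEN = "launch_missiles"

def naive_static_check(source):
    """Pass (True) iff no direct call to FORBIDDEN appears in the code."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and getattr(node.func, "id", "") == FORBIDDEN:
            return False
    return True

honest = "launch_missiles()"
sneaky = "f = globals()['launch_' + 'missiles']\nf()"

assert naive_static_check(honest) is False  # caught
assert naive_static_check(sneaky) is True   # fooled: same behaviour, passes
```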
Ergo, if an AGI is launched before the Friendly AI problem has some solutions, it means that the AGI was launched without quality control being performed. Scary. At least to me.
The default case for a lot of shipped applications isn’t to do what they were designed to do, i.e. satisfy the target customer’s needs. Even when you ignore the bugs, often the target customer doesn’t understand how it works, or it’s missing a few key features, or its interface is clunky, or no-one actually needs it, or it’s made confusing by too many features nobody cares about, etc. A lot of applications (and websites) suck, or at least the first released version does.
We don’t always see that extent because the set of software we use is heavily biased towards the “actually usable” subset, for obvious reasons.
For example, see the debate tools that have been discussed here and are never used by anybody for real debate.