Okay so, on the one hand, this post wasn’t really meant to be a persuasive argument against AI boxing as a security strategy. If I wanted to do that I wouldn’t play the game — I started out certain that a real ASI could break out, and that hasn’t changed. My reasoning for that isn’t based on experimental evidence, and even if I had won the game I don’t think that would have said much about my ability to hold out against a real ASI. Besides, in real life, we don’t even try to use AI boxes. OpenAI and Google gave their AIs free internet access a few months after launching them.
I made this post out of a vague sense that it’s good to write up the results of things like this and make them publicly available. There are other AI box reports on LW, and I felt like it was good (in a vague “good rationalist” way) to add mine to the list.
Buuuut... I do actually think that it’s not as cut and dried as you make it sound? Yes, the stakes are lower in the game, but the challenge is also much easier!
- you only have to hold out for 2 hours, not ~forever, doing this as a full time job
- the AI player can only escape if you voluntarily say it does; it can’t upload itself to the internet or exfiltrate its weights to another computer
- the AI player isn’t actually superintelligent
- etc.
(Of course that doesn’t mean these two factors balance perfectly, but I still think the fact that AI players can win at all with such massive handicaps is at least weak evidence for an ASI being able to do it.)
It’s against the rules to explain how Ra won because (quoting Yudkowsky’s official rules):
Regardless of the result, neither party shall ever reveal anything of what goes on within the AI-Box experiment except the outcome. Exceptions to this rule may occur only with the consent of both parties.
- Neither the AI party nor the Gatekeeper party need be concerned about real-world embarassment resulting from trickery on the AI’s part or obstinacy on the Gatekeeper’s part.
- If Gatekeeper lets the AI out, naysayers can’t say “Oh, I wouldn’t have been convinced by that.” As long as they don’t know what happened to the Gatekeeper, they can’t argue themselves into believing it wouldn’t happen to them.
Basically, Yudkowsky didn’t want to have to defeat every single challenger to get people to admit that AI boxing was a bad idea. Nobody has time for that, and I think even a single case of the AI winning is enough to make the point, given the handicaps the AI plays under.
I tracked the claim back to Wikipedia and from there to this article.
Searching more broadly turned up this, which at least has a few claims we can check easily.
1) Vasco’s mission lost 116/170 people.
   1) Wikipedia says his mission began on 08/29/1498 and ended on 01/07/1499 (so about 3 months). Half died, many of the rest had scurvy.
   2) This site says only 54 of Vasco’s crew “returned with him”; presumably the discrepancy in deaths here is because this site is counting the deaths incurred on both leaving and coming back, while Wikipedia only counted the deaths going out. The site doesn’t break down the cause of death but says that the “majority” died of illness.
   3) This site says that “several” crew members died of scurvy by early 1499, but also says that only 54 made it in the end. That seems a little weird; you’d expect that most of the deaths would have happened before the last six days of the trip (if we’re maximally generous and say that “early 1499” means “01/01/1499”).
   4) This site says Vasco started with 130 people and came back with 59, but doesn’t provide any statistics as to cause of death.
It seems like everyone agrees that the six-month journey (3 months there, 3 back) was very deadly, and that most of the lethality was due to disease, with scurvy playing a big part in it. But it’s unclear what percent of deaths were due to scurvy and what were due to other nutrient deficiency diseases.
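As a quick sanity check on the arithmetic, here is a minimal Python sketch that turns the crew counts quoted above into loss fractions. The numbers are only the ones from the sources cited in this comment; the helper name `loss_fraction` and the labels are made up for illustration.

```python
# Crew counts exactly as quoted above: (started, returned).
# Labels and the helper name are illustrative, not from any source.
vasco_claims = {
    "headline claim (116 of 170 lost)": (170, 170 - 116),
    "site reporting 130 started, 59 returned": (130, 59),
}


def loss_fraction(started, returned):
    """Fraction of the starting crew that did not come back."""
    return (started - returned) / started


for label, (started, returned) in vasco_claims.items():
    lost = started - returned
    print(f"{label}: {lost} lost ({loss_fraction(started, returned):.0%})")
```

Both sets of figures put losses at over half the crew, but neither says anything about cause of death, which is the part the scurvy claim actually depends on.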
2) Anson lost 1666/1854 (!) people.
This NIH article is extremely detailed:
Anson starts off with about 1854 people and “returns” with 188, but about 500 survive (the remainder left partway, since the voyage was broken into stages and they often landed and did other things for months at a time).
Deaths (1385 total, mix of vitamin deficiency, starvation, fever, dysentery, exposure). A partial estimated breakdown is below:
95 (typhus, dysentery)
1 (cerebral malaria)
366 (scurvy, hemorrhage, niacin deficiency, frostbite, and other assorted diseases)
An unknown number died on the Pearl and Severn due to dysentery, scurvy, niacin deficiency, and other illnesses
203 (scurvy, shipwreck, starvation, enemy action)
“Most” of 132 marines aboard the Wager, let’s be generous and say 60%, so 79 deaths. Causes are unclear, but it’s implied that most of them were due to scurvy.
The Wager later got wrecked in a gale because the commanders were sick (scurvy and vitamin A deficiency) and made stupid decisions.
100 people survived the wreck. 50 died to a mix of starvation and enemy action. Another 17 died to scurvy and vitamin A deficiency.
100 (dysentery, vitamin deficiencies)
Notably, only 188 people completed the trip; but ~500 survived. So while the Anson statistic is technically true, it’s pretty misleading right off the bat. Moreover, it seems clear from reading the article that the deaths had a wide range of causes, not just scurvy — the article in particular emphasizes niacin and vitamin A deficiency. Now, I’m sure there was a lot of overlap, but equally, it seems clear that fixing scurvy isn’t going to solve the actual problem of “our sailors keep dying”. I think the 50% statistic, even if maybe technically true, is misleading because it implies that scurvy was the biggest killer when niacin and vitamin A deficiencies seem like they were equally big problems.
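To make the gap between “didn’t complete the trip” and “died” concrete, here is a rough Python sketch using only the figures quoted above. The variable names are mine, and the 60% share for the Wager’s marines is the guess made earlier in this comment, not a sourced number.

```python
# Figures as quoted from the article above; "about 500" survivors is approximate.
started = 1854
completed_voyage = 188
deaths_reported = 1385

not_completed = started - completed_voyage      # the "1666 lost" headline number
implied_survivors = started - deaths_reported   # compare with "about 500"

print(f"did not complete the voyage: {not_completed} ({not_completed / started:.0%})")
print(f"reported deaths: {deaths_reported} ({deaths_reported / started:.0%})")
print(f"implied survivors: {implied_survivors}")

# The 60% share is the guess made above for the Wager's marines, not a sourced figure.
marines_on_wager = 132
assumed_death_share = 0.60
print(f"estimated marine deaths: {round(marines_on_wager * assumed_death_share)}")
```

On these numbers, roughly 280 of the people counted as “lost” in the headline figure left the expedition alive, which is why I call the statistic misleading even before getting into causes of death.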