And yes, a lot of this is going to depend on reference classes. You point to guns as incredibly safe, which they are… as long as we exclude gun suicides or homicides, of course. Are those instances of human unreliability? If we are concerned with bad outcomes, seems like they ought to count. If you grab a gun in a moment of disinhibition while drinking and commit suicide, which you wouldn’t’ve if you hadn’t had a gun, in the same way that people blocked from jumping off a bridge turn out to not simply jump somewhere else later on, you’re as dead as if a fellow hunter shot you. If a pilot decides to commit suicide by flying into a mountain (which seems to have happened yet again in March, the China Eastern Airlines crash), you’re just as dead as if the wings iced up and it crashed that way. etc
I edited the MNIST bit to clarify, but a big point here is that there are tasks where 99.9% is “pretty much 100%” and tasks where it’s really really not (eg. operating heavy machinery); and right now, most models, datasets, systems and evaluation metrics are designed around the first scenario, rather than the second.
Intentional murder seems analogous to misalignment, not error. If you count random suicides as bugs, you get a big numerator but an even bigger denominator; the overall US suicide rate is ~1:7,000 per year, and that includes lots of people who have awful chronic health problems. If you assume a 1:20,000 random suicide rate and that 40% of people can kill themselves in a minute (roughly, the US gun ownership rate), then the rate of not doing it per decision is ~20,000 * 60 * 16 * 365 * 0.4 = 1:3,000,000,000, or ~99.99999997%.
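The back-of-envelope arithmetic above can be reproduced directly. All the inputs (the 1:20,000 "random" suicide rate, 16 waking hours, the 40% gun-access fraction) are the rough assumptions stated in the text, not measured quantities:

```python
# Per-decision reliability implied by an assumed annual suicide rate.
annual_suicide_rate = 1 / 20_000          # assumed "random" rate per person-year
waking_minutes_per_year = 60 * 16 * 365   # ~16 waking hours/day
gun_access = 0.4                          # assumed fraction who could act within a minute

# Treat every waking minute (for those with the means) as one "decision" not to act.
decisions_per_year = waking_minutes_per_year * gun_access
per_decision_failure = annual_suicide_rate / decisions_per_year
per_decision_reliability = 1 - per_decision_failure

print(f"1 in {1 / per_decision_failure:,.0f}")  # ≈ 1 in 2.8 billion
print(f"{per_decision_reliability:.8%}")        # ≈ 99.99999996%
```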
You say “yet again”, but random pilot suicides are incredibly rare! Wikipedia counts eight on commercial flights in the last fifty years, out of a billion or so total flights, and some of those cases are ambiguous: https://en.wikipedia.org/wiki/Suicide_by_pilot
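The same per-event framing applied to those Wikipedia numbers (eight suspected cases, roughly a billion flights; both figures are the rough counts cited above):

```python
# Per-flight reliability implied by ~8 suspected pilot-suicide crashes in ~1e9 flights.
suspected_cases = 8
total_flights = 1_000_000_000

per_flight_rate = suspected_cases / total_flights  # 8e-9
per_flight_reliability = 1 - per_flight_rate

print(f"1 in {total_flights // suspected_cases:,}")  # 1 in 125,000,000
print(f"{per_flight_reliability:.7%}")               # ≈ 99.9999992%
```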
IIUC, people aren’t deciding whether to kill themselves once a minute, every minute. The thought only comes up when things are really rough, and thinking about it can take hours or days. That’s probably a nitpick.
More importantly, an agent optimizing for not intentionally shooting itself in the face would probably be much more reliable at it than a human. It just has to sit still.
If you look at RL agents in simulated environments where death is possible (e.g. Atari games), the top agents outperform most humans at not dying in most games. E.g. MuZero’s average score in Space Invaders is several times higher than the average human baseline, which requires it to die less often on average.
So when an agent is trained to not die, it can be very efficient at it.
That’s pretty much 100%. Have you looked at the hopelessly ambiguous examples https://towardsdatascience.com/going-beyond-99-mnist-handwritten-digits-recognition-cfff96337392 or the mislabeled ones https://arxiv.org/pdf/1912.05283.pdf#page=8 https://cleanlab.ai/blog/label-errors-image-datasets/ ? I’m not sure how many of the remaining errors are ones where a human would look at it and agree that it’s obviously what the label is. (This is also a problem with ImageNet these days.)