And… You’d lose the bet. That program does have a mistake in it. See if you can find it.
(Answer: Gjb vf cevzr, ohg lbhe shapgvba pnyyf vg pbzcbfvgr.)
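(Spoiler for the rot13 answer above.) As an illustration, here is a hypothetical reconstruction, not the original program, of how a trial-division checker can pick up exactly this kind of edge-case bug:

```python
def is_prime(n):
    """Trial division -- correct for every input except one edge case."""
    # Bug: the evenness shortcut below also rejects n = 2 itself.
    if n < 2 or n % 2 == 0:
        return False
    for d in range(3, int(n ** 0.5) + 1, 2):
        if n % d == 0:
            return False
    return True

print(is_prime(2))  # False -- wrong: 2 is prime
print(is_prime(7))  # True
```

The check is correct for every other number, which is why a bug like this survives casual reading; it only shows up if the tests include the smallest cases.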
Good catch, but I bet MagnetoHydroDynamics would not lose the bet “after spending an hour or so, at [their] mental peak condition, proving it correct”.
True. It’s not a perfect illustration. But it’s a good, ironic example of how the point that MagnetoHydroDynamics was trying to make can have a pretty serious flaw.
Meh. It was a special case he already knew to call out for special treatment. Would have been caught VERY quickly in testing, just like the same sort of error occurring in the general case (hmm. the primes are 4, 6, 8, 9, 10, 12? nooo...)
99.99% confidence implies that he could write TEN THOUSAND programs of similar difficulty (with hours spent verifying each at peak mental condition) and make only ONE mistake.
As someone who’s done quite a bit of programming and reading about programming, I’d be impressed if he made it past a hundred without making two mistakes.
The industry average is 15-50 errors / 1000 lines of code. The amount of effort required to get below 0.1 errors/kLoC, like they do for the space shuttle, is very, very large. One person checking their own work for an hour doesn’t stand a chance.
Only make one mistake that makes it past testing.
That’s what I meant, though I can see that it’s not clear.
In this case a mistake would be writing the code, iterating and testing until you were satisfied, pronouncing it done, and then afterwards catching the error.
Hmm. I can definitely buy that a program of more complexity than this—less easily checked—would have that accuracy rate.
But prime checking is super simple to write and super simple to check. The only way you’d get an error through the obvious testing scheme given is to skip the testing.
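A minimal sketch of the kind of “obvious testing scheme” being alluded to here (my construction, not the original poster’s code): compare the function’s output on small inputs against an independently sourced table of known primes.

```python
def is_prime(n):
    """Straightforward trial division."""
    if n < 2:
        return False
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return False
    return True

# Primes below 30, taken from any published table --
# produced independently of the code being tested.
KNOWN_PRIMES = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29}

for n in range(30):
    assert is_prime(n) == (n in KNOWN_PRIMES), f"mismatch at {n}"
print("all checks passed")
```

The point of sourcing the reference list independently is that a mistake in the code is then very unlikely to be mirrored by the same mistake in the test data.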
You’re taking 100 bits of testing (which contain around 80 bits of information if not produced by means of the actual pattern) and treating them as around 13 bits of reliability.
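Reading those numbers literally (my arithmetic, assuming “99.99% confidence” means one failure in 10^4), the “around 13 bits” figure is just the base-2 log of the claimed odds:

```python
import math

# 99.99% reliability = 1 failure in 10^4,
# i.e. roughly log2(10^4) bits of claimed reliability.
reliability_bits = math.log2(10_000)
print(round(reliability_bits, 1))  # 13.3
```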
My experience with coding is that stupid obvious mistakes are way more likely than 1/10000. You write something slightly wrong, keep reading it as if it were right, and that’s that.
Determining if a number is prime is a bit of a nice case, I suppose, because it’s so amenable to testing. The structure of the mistakes you make is unlikely to match the structure of primes, so you’ll catch any mistakes more easily.
I’d still consider doing it 10000 times to be extremely difficult. Just adding 10000 six-digit numbers by hand, even with some cross-checking, is quite difficult.
Yes, stupid coding mistakes are more like 1 in 2 than 1 in 10^4; it is the testing that helps here.
Are the programs written for the space shuttle the same level of difficulty as this one?
Overall the space shuttle software is definitely more complicated than confirming a small prime, but on a line-by-line basis I don’t know. The space shuttle is a special-case example, not something you compare against.
I’d use the industry average rate until you showed me you could hit 0.1 errors/kLoC for a few years. For example, Microsoft hits 10-20 defects/kLoC and they put significant effort into testing.
The most likely reason that your bug rate on these programs would be anomalously low is that they’re small. Instead of writing a single hundred-thousand-line program, you’re writing 10,000 ten-line programs. The complexities won’t compound as much.
The probability of a mistake in a program is not the same as the probability of being wrong about 1159 being composite. At a high reliability level, you factor the number, then multiply the factors back up, and check that as well.
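A sketch of that cross-check in code (using 1159 as in the comment; the trial-division factoring here is my choice of method, not anything from the original discussion):

```python
from math import prod

def factor(n):
    """Trial-division factorization; returns the prime factors of n in order."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors

fs = factor(1159)
print(fs)                # [19, 61]
assert prod(fs) == 1159  # independent cross-check: multiply the factors back up
print("composite" if len(fs) > 1 else "prime")  # composite
```

Multiplying the factors back up is a genuinely independent check: a slip in the factoring loop is unlikely to produce wrong factors whose product still equals the input.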