To look at it one more time … Scott originally said: “Suppose you’re given an n-bit string, and you’re promised that exactly n/4 of the bits are 1, and they’re either all in the left half of the string or all in the right half.”
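For concreteness, here’s a quick Python sketch of that setup as I read it (my own illustration, not Scott’s code); it just picks the half, and the positions of the ones within it, uniformly at random:

```python
import random

def make_instance(n, rng=random):
    """An n-bit string satisfying the promise: exactly n/4 ones,
    all of them in one half (chosen here uniformly at random)."""
    assert n % 4 == 0
    bits = [0] * n
    start = 0 if rng.random() < 0.5 else n // 2      # left half or right half
    for p in rng.sample(range(start, start + n // 2), n // 4):
        bits[p] = 1
    return bits

# e.g. a 16-bit instance with 4 ones, all in one half:
print(make_instance(16))
```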
So we have a whole set of deterministic algorithms for solving the problem over here, and a whole set of randomized algorithms for solving the same problem. Take the best deterministic algorithm, and the best randomized algorithm.
Some people want to claim that the best randomized algorithm is “provably better”. Really? Better in what way?
Is it better in the worst case? No, in the very worst case, any algorithm (randomized or not) is going to need to look at n/4+1 bits to get the correct answer. Even worse: the randomized algorithms people were suggesting, with low average complexity, will, in the very worst case, need to look at more than n/4+1 bits, simply because there are 3n/4 zero bits, and the algorithm might get very, very unlucky.
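To make that concrete, here’s a hedged sketch of the two kinds of query procedures I have in mind (my own code, not anyone’s exact proposal): a deterministic scan of the left half, which never needs more than n/4+1 queries, and a random sampler that queries positions in a uniformly random order, which usually needs far fewer but, since 3n/4 of the bits are 0, can on an unlucky run examine more than n/4+1 bits:

```python
import random

def deterministic_scan(bits):
    """Query positions 0, 1, ..., n/4 of the left half in order.
    A 1 means the ones are on the left; if all n/4 + 1 queries are 0,
    the left half cannot contain n/4 ones, so they must be on the right.
    Worst case: exactly n/4 + 1 queries."""
    n = len(bits)
    for i in range(n // 4 + 1):
        if bits[i]:
            return "left", i + 1
    return "right", n // 4 + 1

def random_sampler(bits, rng=random):
    """Query uniformly random distinct positions until a 1 turns up.
    On average this takes about 4 queries, but since 3n/4 of the bits are 0,
    an unlucky run can examine well over n/4 + 1 bits."""
    n = len(bits)
    for queries, i in enumerate(rng.sample(range(n), n), start=1):
        if bits[i]:
            return ("left" if i < n // 2 else "right"), queries

bits = [0]*8 + [1, 0, 1, 1, 0, 0, 1, 0]   # n = 16, ones in the right half
print(deterministic_scan(bits))            # ('right', 5)
print(random_sampler(bits))
```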
OK, so randomized algorithms are clearly not better in the worst case. What about the average case? To begin with, nobody here has done any average case analysis. But I challenge any of you to prove that every deterministic algorithm for this problem is necessarily worse, on average, than some (one?) randomized algorithm. I don’t believe that is the case.
So what do we have left? You had to invent a bizarre scenario, supposing that the environment is an adversarial superintelligence who can perfectly read all of your mind except bits designated ‘random’, as Eliezer says, in order to find a situation where the randomized algorithm is provably better. OK, that proof works, but why is that scenario at all interesting to any real-world application? The real world is never actually in that situation, so it’s highly misleading to use it as a basis for concluding that randomized algorithms are “provably better”.
No, what you need to do is argue that the pattern of queries used by a (or any?) deterministic algorithm is more likely to be anti-correlated with where the 1 bits are in the environment’s inputs than the pattern used by the randomized algorithm. In other words, it seems you have some prior on the environment: the inputs are not uniformly distributed, nor chosen according to any reasonable distribution, but are in fact negatively correlated with the deterministic algorithm’s choices. And the conclusion, then, would be that the average case performance of the deterministic algorithm is actually worse than the average computed assuming a uniform distribution of inputs.
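Here’s a small illustration of what such a prior would do (again my own sketch, with an artificially skewed input distribution assumed purely for demonstration): if the environment puts the ones in the right half more often than a fair coin would, the fixed left-to-right scan hits its worst case that much more often, and its average cost climbs above the uniform-distribution average:

```python
import random

def scan_queries(bits):
    # Fixed deterministic order: bit #1, #2, ... of the left half.
    n = len(bits)
    for i in range(n // 4 + 1):
        if bits[i]:
            return i + 1
    return n // 4 + 1

def make_instance(n, rng, p_right):
    # The environment puts the ones in the right half with probability p_right.
    bits = [0] * n
    start = n // 2 if rng.random() < p_right else 0
    for p in rng.sample(range(start, start + n // 2), n // 4):
        bits[p] = 1
    return bits

rng = random.Random(1)
n, trials = 1000, 2000
for p_right in (0.5, 1.0):   # uniform prior vs. fully anti-correlated prior
    avg = sum(scan_queries(make_instance(n, rng, p_right)) for _ in range(trials)) / trials
    print(f"P(ones on the right) = {p_right}: scan averages {avg:.1f} queries")
# Roughly n/8 queries under the uniform prior, but pinned at n/4 + 1 = 251
# when the environment always chooses the half the scan looks at last.
```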
Now, this does happen sometimes. If you implement a bubble sort, it’s not unreasonable to guess that the algorithm might be given a list sorted in reverse order, much more often than picking from a random distribution would suggest.
And similarly, if the n-bit algorithm starts looking at bit #1, then #2, etc. … well, it isn’t at all unreasonable to suppose that around half the inputs will have all the 1 bits in the right half of the string, so the naive algorithm will be forced to exhibit worst-case performance (n/4+1 bits examined) far more often than perhaps necessary.
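As a rough sanity check of that intuition (my own simulation, assuming the two halves are equally likely and the positions within a half are uniform), counting queries over many random instances gives roughly this picture:

```python
import random

def make_instance(n, rng):
    bits = [0] * n
    start = 0 if rng.random() < 0.5 else n // 2        # each half equally likely
    for p in rng.sample(range(start, start + n // 2), n // 4):
        bits[p] = 1
    return bits

def scan_queries(bits):
    # Naive deterministic scan: bit #1, #2, ... of the left half.
    n = len(bits)
    for i in range(n // 4 + 1):
        if bits[i]:
            return i + 1                                # found a 1: ones are on the left
    return n // 4 + 1                                   # all zeros: ones must be on the right

def sampler_queries(bits, rng):
    # Randomized order: uniformly random distinct positions until a 1 appears.
    for q, i in enumerate(rng.sample(range(len(bits)), len(bits)), start=1):
        if bits[i]:
            return q

rng = random.Random(0)
n, trials = 1000, 2000
scan_total = sampler_total = 0
for _ in range(trials):
    bits = make_instance(n, rng)
    scan_total += scan_queries(bits)
    sampler_total += sampler_queries(bits, rng)
print("naive scan average:    ", scan_total / trials)     # about n/8 (worst case half the time)
print("random sampler average:", sampler_total / trials)  # about 4, independent of n
```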
But this is an argument about the average case performance of a particular deterministic algorithm, especially given some insight into what inputs the environment is likely to provide.
This has not been an argument about worst case performance of all deterministic algorithms vs. (all? any? one?) randomized algorithm.
Which is what other commenters have been incorrectly asserting from the beginning, that you can “add power” to an algorithm by “adding randomness”.
Maybe you can, maybe you can’t. (I happen to think it highly unlikely.) But they sure haven’t shown it with these examples.