Let’s say the solver has two input tapes: a “problem instance” input and a “noise source” input.
When Eli says “worst-case time”, he means the maximum time, over all problem instances and all noise sources, taken to solve the problem.
But when Scott says “worst-case time,” he means taking the maximum (over all problem instances X) of the average time (over all noise sources) to solve X.
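In symbols (my paraphrase, with T(X, R) denoting the time to solve problem instance X given noise-source contents R):

```latex
T_{\mathrm{Eli}}   = \max_{X}\,\max_{R}\; T(X, R)
\qquad\text{vs.}\qquad
T_{\mathrm{Scott}} = \max_{X}\; \mathbb{E}_{R}\!\left[\, T(X, R) \,\right]
```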
Scott’s criterion seems relevant to real-world situations where one’s input is being generated by an adversarial human who knows which open-source software you’re using but doesn’t know your random seed (which was perhaps generated by moving your mouse around). Here I don’t think Eli and Scott have any real disagreement, since Eli agrees that randomness can be useful in defeating an adversarial intelligence.
BTW, I think I have a situation where randomness is useful without postulating an adversary (you could argue that the adversary is still present though).
Say you’re quicksorting a very long list of length N, your worst case is pretty bad, and your best case is not much better than average. You have a Kolmogorov distribution for your input, and you have a long string which you believe to be maximally compressed (i.e., it’s really, really random). If your quicksort code is short enough compared to N, then the worst-case input will have complexity << N, since it can be concisely described as the worst-case input for your sorting algorithm. If you use your random string to permute the list, then the combined procedure will (probably) no longer have an easy-to-specify worst-case input, so your average case may improve.
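Here is a minimal Python sketch of that recipe, purely illustrative: `naive_quicksort` is a first-element-pivot quicksort (an assumption, since no particular variant is specified), and the long random string is modeled as a `bytes` object driving a Fisher–Yates shuffle before sorting.

```python
def naive_quicksort(xs):
    """First-element-pivot quicksort; does ~N^2 pivot comparisons on sorted input
    (and recurses ~N deep there, so keep N modest in this sketch)."""
    if len(xs) <= 1:
        return list(xs)
    pivot, rest = xs[0], xs[1:]
    return (naive_quicksort([x for x in rest if x < pivot])
            + [pivot]
            + naive_quicksort([x for x in rest if x >= pivot]))


def permute_with_random_string(xs, random_bytes):
    """Fisher-Yates shuffle driven by the supplied (hopefully incompressible) string.

    Assumes len(random_bytes) >= 4 * (len(xs) - 1); the modulo draw is slightly
    biased, which is fine for a sketch.
    """
    xs = list(xs)
    pos = 0
    for i in range(len(xs) - 1, 0, -1):
        j = int.from_bytes(random_bytes[pos:pos + 4], "big") % (i + 1)
        pos += 4
        xs[i], xs[j] = xs[j], xs[i]
    return xs


def sort_with_random_permutation(xs, random_bytes):
    """Permute by the random string first, then quicksort the permuted list."""
    return naive_quicksort(permute_with_random_string(xs, random_bytes))
```

The intended point of the wrapper is that a concise description of a bad input for `sort_with_random_permutation` would now also have to pin down the long random string (or most of the information in it).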
It sounds to me like Peter is suggesting that we randomly permute all inputs to the sorting algorithm, in hopes of improving the average-case runtime. The justification appears to be that the worst case has low complexity because it’s easy to describe: you can describe it just by saying “it’s the worst-case input for this sorting algorithm.”
For example, the worst-case input for Quicksort (with a naive first-element pivot) is an already-sorted list. You could, indeed, get better average performance out of quicksort if you could randomly permute its inputs in zero time, because real code is more likely to produce already-sorted lists than lists ordered by, say, { x mod 17 }.
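A quick self-contained check of that claim, again assuming the naive first-element-pivot variant: count pivot comparisons on an already-sorted list versus a random permutation of it (the exact numbers below are approximate).

```python
import random
import sys

sys.setrecursionlimit(10_000)  # the naive quicksort recurses ~N deep on sorted input


def quicksort_count(xs):
    """First-element-pivot quicksort; returns (sorted_list, pivot_comparisons),
    counting each non-pivot element once per partition (the standard accounting)."""
    if len(xs) <= 1:
        return list(xs), 0
    pivot, rest = xs[0], xs[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    ls, lc = quicksort_count(left)
    rs, rc = quicksort_count(right)
    return ls + [pivot] + rs, lc + rc + len(rest)


n = 2000
already_sorted = list(range(n))
shuffled = already_sorted[:]
random.shuffle(shuffled)

_, worst_case = quicksort_count(already_sorted)  # n*(n-1)/2 = 1,999,000 comparisons
_, typical = quicksort_count(shuffled)           # roughly 2*n*ln(n): a few tens of thousands
print(worst_case, typical)
```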
My initial reaction was that the implicit description “that input which is worst case for this algorithm” might specify an input which was not unusually likely to occur in practice. Peter is saying that it is more likely to occur in practice, even if it’s an implicit description—even if we don’t know what the worst-case input is—because it has lower complexity in an absolute sense, and so more processes in the real world will generate that input than one with a higher complexity.
Is that right?
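For reference, if the “Kolmogorov distribution” here is read as Levin’s universal a priori probability m, the coding theorem makes the claim precise: up to a constant factor, an input’s weight is determined by its prefix complexity K.

```latex
\mathbf{m}(x) \;=\; \sum_{p \,:\, U(p)=x} 2^{-|p|} \;=\; 2^{-K(x) + O(1)}
```

So, roughly, an input describable in the few bits of “the worst case for this algorithm” gets exponentially more weight under m than any particular incompressible list of comparable length.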
And no fast deterministic permutation can be used instead, because the deterministic permutation could itself be specified concisely, so “the worst-case input for this permutation followed by this quicksort” is still a short description.
In practice, our random number generators use a seed of 16 to 32 bits, so they introduce almost no complexity at all by this criterion. No such pseudorandom generator can be useful for this purpose, for the same reason no deterministic permutation can be useful.
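The bookkeeping behind that (my illustration, not from the thread): a b-bit seed only ever adds about b bits to the description of the composite procedure’s worst case, so for b ≤ 32 the earlier bound barely moves.

```latex
K(\text{worst-case input}) \;\le\; |p_{\text{sort}}| + |p_{\text{PRNG}}| + b + O(\log N) \;\ll\; N,
\qquad b \le 32
```

The O(log N) term just covers specifying the input length N.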
Still, I think this is a conclusive proof that randomness can be useful.
That last paragraph is pretty unsettling: you get penalized for not throwing a messy, obfuscatory step into your algorithm? A Kolmogorov distribution does indeed seem to be a “present adversary”!