Time hierarchy theorems for distributional estimation problems

Warning: mostly for fun /​ basic science

##Preliminaries

Hierarchy theorems

The time hierarchy theorem is one of the simplest results in complexity theory. It says that if , then there are functions that we can compute in time that we can’t compute in time . For example, there are functions that you can compute in time that you can’t compute in time.

Hierarchy theorems are proved by diagonalization—consider the problem, “does machine halt in time at most ?” This problem can be easily solved in time. But if any machine solves this problem in time then we can get a contradiction by asking about itself.

This proof strategy is very blunt. One way to formalize its bluntness is to introduce the notion of relative complexity. Rather than considering normal computers, we consider a computer that has access to a black box computing a particular function . Hierarchy theorems hold relative to any function .

Relativization is a hallmark of “easy” complexity theoretic results (i.e. those that we can prove). We can prove very few separations that don’t relativize. (Scott Aaronson has introduced a slightly stronger notion of algebrization which more accurately captures what we can actually prove, and we can prove a few more lower bounds on low-depth circuits.)

Distributional estimation problems

A distributional estimation problem is a sequence of distributions over pairs . The goal of an estimator is to approximate given . The score of a estimator on is the expected squared error, i.e. the expectation of , for pairs drawn from . If is a probabilistic estimator, then we also take an expectation over ’s internal randomness.

(This definition is due to Vadim Kosoy.)

Let’s say that is a better estimator than on a distributional estimation problem if there is a constant and an such that for every , ‘s score on is at least higher than ‘s score (i.e., such that the lim inf of ‘s score minus ’s score is strictly positive).

Time hierarchy for distributional estimation problems

Now we can ask:

Is there a distributional estimation problem and an estimator running in time such that is a better estimator on than any estimator running in time ?

The answer is almost certainly “yes,” and there is a very natural hard problem—sample a machine which runs in time and estimate the expected value of .

Time hierarchy does not relativize for distributional estimation problems

We can construct a probabilistic oracle such that exactly the same set of distributional estimation problems can be solved in time as can be solved by any algorithm running in any amount of time.

Namely, consider the construction of reflective oracles from this paper. With this oracle in hand, for any estimator running in any amount of time, there is an estimator running in time which approximates the results of running up to error , and in particular which is not a worse predictor than .

On input , queries the reflective oracle to estimate the expected value of . It starts by comparing this expected value to , then performs a binary search to narrow down the value to an interval of length . This gives us error of , and it works regardless of how expensive is to compute.

This argument is relative to a certain probabilistic oracle. It would be more convincing if the containment failed relative to some deterministic oracle. I’m not sure if it does.

A natural candidate deterministic oracle is one which takes as input a randomized (oracle) Turing machine , a probability , an accuracy , and an auxiliary input . The oracle makes no guarantees at all about its behavior on any particular input . But if accepts with probability strictly more than in time , then the oracle guarantees that it returns on at least of the possible tuples . And conversely, if accepts with probability strictly less than , then the oracle guarantees that it returns on at most of the possible tuples .

If such an oracle exists, then time hierarchy for distributional estimation problems certainly doesn’t hold with respect to this oracle. I can’t really tell whether such an oracle exists, my guess is that it does.

I think that the existence of such a deterministic oracle is itself an interesting question.

Conclusion

Hierarchy theorems are practically the only “easy” separation results in complexity theory. But they are easy to prove for reasons that seem morally unrelated to the real reasons that they are true.

In some sense distributional estimation problems are more natural than conventional decision problems. If we can’t prove time hierarchy for these problems, it is arguably one of the most fundamental gaps in our understanding of complexity theory, even more glaring than (for example) P vs PSPACE.

Because time hierarchy doesn’t seem to relativize for distributional estimation problems, I think there is a good chance that existing techniques can’t prove it. That said, there may also be a simple argument for hierarchy that I overlooked.