ESRogs comments on What Would I Do? Self-prediction in Simple Algorithms

ESRogs 20 Jul 2020 5:54 UTC
5 points
0
One point is that, while here we’re doing ε-exploration, there are ways to do better, such that the rate at which you explore is density zero, such that you explore less and less for large n
Was wondering about this as I was reading. Can I just have ε be $1 / 2^{n}$ ? Does that cause ε to shrink too fast s.t. that I don’t actually explore?
In general, what are the limits on how fast ε can shrink, and how does that depend on program complexity, or the number of output options?
- Charlie Steiner 20 Jul 2020 9:44 UTC
  6 points
  0
  Parent
  From a logic perspective you’d think any epsilon>0 would be enough to rule out the “conditioning on a falsehood” problem. But I second your question, because Scott makes it sound like there’s some sampling process going on that might actually need to do the thing. Which is weird—I thought the sampling part of logical inductors was about sampling polynomial-time proofs, which don’t seem like they should depend much on epsilon.
- Scott Garrabrant 21 Jul 2020 18:24 UTC
  5 points
  0
  Parent
  Having $ε$ be $1 / 2^{n}$ won’t work.
  Surprisingly, having $ε$ go to 0 at any quickly computably rate won’t work. For example, if $ε = 1 / n$ you could imagine having a logical induction built out of a collection of traders where one trader has almost all the money and says that on days of the form $A (m)$ , utility conditioned on going left is 0 (where $A$ is a fast growing function). Then, you have a second trader that forces the probability on day $A (m)$ of the statement that the agent goes left to be slightly higher that $1 / A (m)$ . Finally, you have a third trader that forces the expected utility conditioned on right to be very slightly above 0 on days of the form $A (m)$ .
  The first trader never loses money, since the condition is never met. The second trader only loses a bounded amount of money, since it is forcing the probability of a sentence that will be false to be very small. The third trader similarly only loses a bounded amount of money. The exploration clause will never trigger, and the agent will never go left on any day of the form $1 / A (m)$ .
  The issue here is that we need to not only explore infinitely often, we need to explore infinitely often on all simple subsets of days, if the probability goes to 0 slowly, you can just look at a subset of days that is sufficiently sparse.
  There are ways around this that allow us to make a logical induction agent that explores with destiny 0 (meaning that the limit as $N$ goes to infinity of the proportion of days $n < N$ that the agent explores is 0). This is done by explicitly exploring infinitely often on every quickly computable subset of days, while still having the probability of exploring go to 0.