Yeah, need to replace dying with losing a lot of utility. I’ve updated the post.
But the coin needs to depend on your prediction instead of always being biased a particular way.
I don’t see why, where would the isomorphism break?
In the iterated version we want the coin to sometimes come up heads, sometimes tails. Sorry, I'm confused; I have no idea why you want to transform the problem like that.
Was trying to explain, but it looks like I screwed something up in the reformulation :)
Does it? I think it only depends on a failure to explore/update, which is a property of the (broken) agent, not an effect of the setup.
My recommended semi-causal agent (thanks to https://www.lesswrong.com/posts/9m2fzjNSJmd3yxxKG/acdt-a-hack-y-acausal-decision-theory for the idea) does the following: start out with probability X of intending heads/pay and 1-X of intending tails/not-pay, based on priors over chance (NOT 0 or 1) and the value of each payout cell in the matrix. Randomize and commit before the start of the game (so the predictor can operate), then adjust per Bayes' rule and re-randomize the commitment after each iteration. You'll never get stuck at certainty, so you'll converge on the optimal mixture for the power of the predictor and the outcomes actually available.
This doesn’t work with computationally-infeasible ranges of action/outcome, but it seems to solve iterated (including iterated in thought-experiments) simple definable cases.
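A minimal sketch of that loop, with several assumptions of mine: the payoff numbers, the predictor accuracy, and the commitment options ("pay"/"refuse") are all illustrative, and epsilon-greedy value tracking stands in for the full Bayes-rule update. The essential features it does keep: the agent commits via randomization before the predictor acts, and it never assigns probability 0 or 1 to either option, so it keeps updating.

```python
import random

def simulate(rounds=20000, predictor_accuracy=0.9, eps=0.05, seed=0):
    # Illustrative payoff matrix (my numbers, not from the thread):
    # keys are (committed action, predictor's prediction).
    payoff = {("pay", "pay"): 2.0, ("pay", "refuse"): -1.0,
              ("refuse", "pay"): 3.0, ("refuse", "refuse"): 0.0}
    rng = random.Random(seed)
    counts = {"pay": 1, "refuse": 1}      # pseudo-observations (non-zero prior)
    totals = {"pay": 0.0, "refuse": 0.0}  # running payoff sums
    for _ in range(rounds):
        est = {a: totals[a] / counts[a] for a in counts}
        # Re-randomize the commitment BEFORE the predictor operates.
        # eps-exploration means neither option's probability ever
        # collapses to 0 or 1, so the agent can't get stuck.
        if rng.random() < eps:
            commit = rng.choice(["pay", "refuse"])
        else:
            commit = max(est, key=est.get)
        # The predictor reads the commitment with some accuracy.
        if rng.random() < predictor_accuracy:
            predicted = commit
        else:
            predicted = "refuse" if commit == "pay" else "pay"
        # Observe the payout and update the value estimates.
        counts[commit] += 1
        totals[commit] += payoff[(commit, predicted)]
    return {a: totals[a] / counts[a] for a in counts}
```

With these illustrative numbers and a strong predictor, committing to pay has the higher expected payoff, so the agent's estimates converge to favor it while the exploration floor keeps the occasional refuse-commitment alive.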