This is super interesting. I was wondering if you could give a few more thoughts/intuitions about why you think reversibility is important. I understand that it would make the simulations more physics-like, but why is being physics-like important to alignment research and/or agency research?
I clicked on the paper by the Critter creator, which seems like it might go deeper into that issue, but don’t have the time to read through it right now. Super exciting stuff! Thanks.
I’m (currently) mostly interested in it for the purpose of understanding optimization. If, for example, the world has a finite number of possible states, and the evolution rule is reversible, then no long-term optimization is possible, because all (accessible) states will be visited equally often. That scenario is relatively clear, and I’m trying to understand exactly what happens under different constraints, and which kinds of optimization are possible.
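The finite-and-reversible case can be made concrete with a toy sketch (my own illustration, not from the post): any reversible evolution rule on a finite state space is a bijection, so every trajectory is a closed cycle that visits each accessible state exactly once per period. The specific rule below is an arbitrary choice for demonstration.

```python
N = 8  # a finite state space: {0, ..., 7}

def step(x):
    # A reversible (bijective) rule; its inverse is x -> (x - 3) % N.
    return (x + 3) % N

# Follow the orbit of an initial state until it returns.
x, orbit = 0, []
while True:
    orbit.append(x)
    x = step(x)
    if x == orbit[0]:
        break

# Every accessible state appears exactly once per cycle, so in the
# long run no state is occupied more often than any other.
print(sorted(orbit))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

For this particular rule every state is accessible from every starting point; with other reversible rules the state space may split into several disjoint cycles, but within each cycle the same uniform-visitation argument applies.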
Not sure that I understand your claim here about optimization. An optimizer is presumably given some choice of possible initial states to choose from to achieve its goal (otherwise it cannot interact at all). In which case, the set of accessible states will depend upon the chosen initial state, and so the optimizer can influence long-term behavior and choose whatever best matches its desires.
I share your confusion/intuition about what is meant by optimization here. But I think for the purposes of this post, optimization is defined here, which is linked to at the beginning of this post. In that link, optimization is thought of as a pattern that persists in the face of perturbations and that evolves towards a small set of states. I’m still not totally grokking it though.
Thanks. I think I’ve been tripped up by this terminology more than once now.