DanielFilan comments on Richard Ngo’s Shortform

DanielFilan 25 Nov 2020 23:04 UTC
LW: 4 AF: 2
AF
I kind of think the lack of episodes makes it more realistic for many problems, but admittedly not for simulated games. Also, presumably many of the component Turing machines have reusable parameters and reinforce behaviour, altho this is hidden by the formalism. [EDIT: I retract the second sentence]
- DanielFilan 26 Nov 2020 17:23 UTC
  LW: 2 AF: 1
  AF Parent
  
  > Also, presumably many of the component Turing machines have reusable parameters and reinforce behaviour, altho this is hidden by the formalism.
  
  Actually I think this is total nonsense produced by me forgetting the difference between AIXI and Solomonoff induction.
  - Richard_Ngo 26 Nov 2020 17:31 UTC
    LW: 2 AF: 1
    AF Parent
    Wait, really? I thought it made sense (although I’d contend that most people don’t think about AIXI in terms of those TMs reinforcing hypotheses, which is the point I’m making). What’s incorrect about it?
    - DanielFilan 26 Nov 2020 17:42 UTC
      LW: 2 AF: 1
      AF Parent
      Well, now I’m less sure that it’s incorrect. I was originally imagining that, as in Solomonoff induction, the TMs basically directly controlled AIXI’s actions, but that’s not right: there’s an expectimax. And if the TMs reinforce actions by shaping the rewards, in the AIXI formalism you learn that immediately and throw out those TMs.
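
      For reference, a sketch of the two definitions being contrasted, written in the notation of Hutter's usual presentation (the symbols $U$, $\ell$, $p$, $q$ and the horizon $m$ are assumptions brought in from that presentation, not from this thread). In Solomonoff induction the programs weight the prediction directly. In AIXI the programs $q$ only appear as environment models in the innermost sum, and the action comes from the alternating max/sum (the expectimax) wrapped around it.

      ```latex
      % Solomonoff prior: total weight of programs p whose output on the
      % universal machine U starts with the string x.
      M(x) = \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)}

      % AIXI action at step t with horizon m: expectimax over future
      % actions a, observations o and rewards r, with each candidate
      % environment program q weighted by 2^{-\ell(q)}.
      a_t = \arg\max_{a_t} \sum_{o_t r_t} \cdots \max_{a_m} \sum_{o_m r_m}
            \left[ r_t + \cdots + r_m \right]
            \sum_{q \,:\, U(q,\, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
      ```

      Read this way, a program $q$ contributes only if it reproduces the observed percepts given the actions taken, so a program whose predicted rewards disagree with what is observed drops out of the sum.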
      - Richard_Ngo 26 Nov 2020 18:00 UTC
        LW: 2 AF: 1
        AF Parent
        Oh, actually, you’re right (that you were wrong). I think I made the same mistake in my previous comment. Good catch.