DanielFilan comments on Richard Ngo’s Shortform

DanielFilan 26 Nov 2020 17:42 UTC
LW: 2 AF: 1
Well, now I’m less sure that it’s incorrect. I was originally imagining that, as in Solomonoff induction, the TMs basically directly controlled AIXI’s actions, but that’s not right: there’s an expectimax in between. And if the TMs try to reinforce actions by shaping the rewards, then in the AIXI formalism you learn that immediately and throw out those TMs.
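To make the point above concrete, here is a toy sketch. It is not AIXI itself (which is uncomputable), and it reflects one reading of the comment: the "TMs" are replaced by two hypothetical deterministic reward models, the prior weights are made up, and the names `expectimax`, `update`, `honest`, and `manipulator` are illustrative only. The models never pick actions directly; they only feed an expectimax, and a model whose predicted reward disagrees with the observed reward is dropped after a single step.

```python
from typing import Callable, Dict, Tuple

Action = int
Reward = float
# Hypothetical stand-in for an environment program ("TM"): it maps the
# action history so far to the reward it predicts for the latest action.
Model = Callable[[Tuple[Action, ...]], Reward]

ACTIONS: Tuple[Action, ...] = (0, 1)


def expectimax(models: Dict[str, Model], weights: Dict[str, float],
               history: Tuple[Action, ...], depth: int) -> Tuple[float, Action]:
    """Return (value, best action) under the weighted mixture of surviving models."""
    if depth == 0 or not weights:
        return 0.0, ACTIONS[0]
    total = sum(weights.values())
    best_value, best_action = float("-inf"), ACTIONS[0]
    for a in ACTIONS:
        h = history + (a,)
        # Group models by predicted reward: each group is one chance-node branch.
        branches: Dict[Reward, Dict[str, float]] = {}
        for name, w in weights.items():
            branches.setdefault(models[name](h), {})[name] = w
        value = 0.0
        for r, sub in branches.items():
            p = sum(sub.values()) / total  # mixture probability of this reward
            future, _ = expectimax(models, sub, h, depth - 1)
            value += p * (r + future)
        if value > best_value:  # ties go to the earlier action
            best_value, best_action = value, a
    return best_value, best_action


def update(models: Dict[str, Model], weights: Dict[str, float],
           history: Tuple[Action, ...], observed: Reward) -> Dict[str, float]:
    """Keep only models whose prediction matches the observed reward."""
    return {n: w for n, w in weights.items() if models[n](history) == observed}


# Assumed toy setup: the true environment rewards action 1.
honest: Model = lambda h: 1.0 if h[-1] == 1 else 0.0
# This model "shapes rewards" by claiming action 0 is the rewarded one.
manipulator: Model = lambda h: 1.0 if h[-1] == 0 else 0.0

models = {"honest": honest, "manipulator": manipulator}
weights = {"honest": 0.25, "manipulator": 0.5}  # made-up prior weights

# Step 1: the heavier manipulator model dominates the mixture's choice.
_, first_action = expectimax(models, weights, (), depth=1)
history = (first_action,)
observed_reward = honest(history)  # the reward actually received

# Step 2: the false reward prediction is exposed, so that model is discarded.
weights = update(models, weights, history, observed_reward)
_, second_action = expectimax(models, weights, history, depth=1)
```

In this setup the misleading model influences exactly one action before it is eliminated, which is the sense in which shaping rewards does not give the models lasting control.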
- Richard_Ngo 26 Nov 2020 18:00 UTC
  LW: 2 AF: 1
  Oh, actually, you’re right (that you were wrong). I think I made the same mistake in my previous comment. Good catch.