davidad comments on You can still fetch the coffee today if you’re dead tomorrow

davidad 9 Dec 2022 21:08 UTC
1 point
−1
I disagree, for two reasons:
1. The $\tau_1 \cdot (\overline{R_1} - \underline{R_1})$ bound on how much there is to gain from creating a time machine and improving past utility is outweighed by the $\tau_1 \cdot (\overline{R_1} - \underline{R_1}) \cdot C$ reward from $R_2$ for shutting down.
2. Every RL algorithm I’ve heard of implicitly bakes in an assumption that past utility is unmodifiable. I guess all bets are off with mesa-optimisers, but personally I’d bet against even mesa-optimisers in model-free RL behaving as if past utility is up for grabs.
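
To make the second point concrete, here is a minimal sketch of the reward-to-go quantity that many policy-gradient and value-based methods use as their learning target. The function name `rewards_to_go` and the toy reward lists are hypothetical, chosen only for illustration, and this is not a claim about any particular library's implementation. It shows that the target at step `t` sums only rewards from `t` onward, so changing an earlier reward leaves it untouched.

```python
def rewards_to_go(rewards, gamma=1.0):
    """Return G_t = sum over k >= t of gamma^(k - t) * r_k, for every step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each G_t is built only from rewards at or after t.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Toy episode, plus a copy in which the reward at step 0 is
# hypothetically "improved" after the fact.
original = [1.0, 0.0, 2.0, 3.0]
edited_past = [100.0, 0.0, 2.0, 3.0]

t = 2
# The learning targets from step t onward are identical in both episodes:
# the past reward never enters the objective the agent optimises at step t.
assert rewards_to_go(original)[t:] == rewards_to_go(edited_past)[t:]
```

Under this kind of objective, an action taken at step `t` gets no credit for changes to rewards before `t`, which is the sense in which past utility is treated as fixed. Whether a learned mesa-optimiser would respect the same assumption is a separate question that the sketch does not address.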