michaelcohen comments on Delegative Reinforcement Learning with a Merely Sane Advisor

michaelcohen 22 Apr 2019 0:50 UTC
(as opposed to standard regret bounds in RL which are only applicable in the episodic setting)
??