Reinforcement learning does create agents; those agents just aren't expected utility maximisers.
Claims that expected utility maximisation is the ideal or limit of agency seem wrong.
I think expected utility maximisation is probably anti-natural to generally capable optimisers.