Stuart Anderson answers: How is reinforcement learning possible in non-sentient agents?