The amortized vs. direct distinction in humans seems related to System 1 vs. System 2 in Thinking, Fast and Slow.
“the implementation of powerful mesa-optimizers inside the network quite challenging”
I think it’s quite likely that we’ll see optimizers implemented outside the network, in the style of AutoGPT (people can explicitly build direct optimizers on top of amortized ones).
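To make "direct optimizer on top of an amortized one" concrete, here is a toy sketch. It is not how AutoGPT works internally. `amortized_policy`, `objective`, and `direct_optimizer` are hypothetical names made up for illustration, and the "network" is just a fixed function standing in for a trained model.

```python
import random


def amortized_policy(state: float) -> float:
    """Stand-in for a trained network: one cheap forward pass, no search.

    Hypothetical learned heuristic; a real system would call a model here.
    """
    return 0.5 * state


def objective(state: float, action: float) -> float:
    """Hypothetical score to maximize; it peaks at action == state - 1."""
    return -(action - (state - 1.0)) ** 2


def direct_optimizer(state: float, n_candidates: int = 64,
                     noise: float = 2.0, seed: int = 0) -> float:
    """Outer search loop wrapped around the amortized policy.

    The search lives outside the "network": perturb the amortized guess,
    score each candidate explicitly, and keep the best one.
    """
    rng = random.Random(seed)
    guess = amortized_policy(state)
    # Keep the unperturbed guess so the search can never do worse than it.
    candidates = [guess] + [guess + rng.gauss(0.0, noise)
                            for _ in range(n_candidates)]
    return max(candidates, key=lambda a: objective(state, a))


state = 4.0
print("amortized:", objective(state, amortized_policy(state)))
print("direct:   ", objective(state, direct_optimizer(state)))
```

The point of the sketch is only that the optimization power comes from the explicit loop and scoring function, which anyone can bolt on from outside, rather than from anything the amortized component learned to do internally.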