I like this example of “works in practice but not in theory.” Would you associate “ambitious value learning vs. adequate value learning” with “works in theory vs. doesn’t work in theory but works in practice”?
One way that “almost rational” is much closer to optimal than “almost anti-rational” is ye olde dot product, but a more accurate description of this case would involve dividing up the model space into basins of attraction. Different training procedures will divide up the space in different ways—this is actually sort of the reverse of a Monte Carlo simulation, where one of the properties you might look for is ergodicity (eventually visiting all points in the space).
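To make the dot-product point concrete, here is a toy sketch (all names and numbers here are illustrative, not from the original discussion): if “rational” is a direction in model space, a small perturbation of that direction stays close to it, while a small perturbation of the reversed direction stays close to the opposite.

```python
import numpy as np

# Illustrative only: treat "rational" as a unit direction in a model space.
rng = np.random.default_rng(0)
rational = rng.normal(size=50)
rational /= np.linalg.norm(rational)

noise = 0.05 * rng.normal(size=50)
almost_rational = rational + noise        # small perturbation of the optimum
almost_anti_rational = -rational + noise  # small perturbation of the reversed direction

def cosine(u, v):
    """Cosine similarity: +1 means same direction, -1 means opposite."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(almost_rational, rational))       # near +1
print(cosine(almost_anti_rational, rational))  # near -1
```

By the dot-product measure, “almost anti-rational” is about as far from optimal as you can get, even though in edit distance it is only a sign flip away. The basins-of-attraction picture refines this: what matters is not the raw similarity but which region of the space a training procedure flows toward from that starting point.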
Potentially. I think the main question is whether adequate value learning will work in practice.