What’s your basis for expecting “well-defined tasks” and “realistic tasks” to have very different doubling times going forward? Is the idea that the recent acceleration seems to be specifically due to RL, and RL will be applicable to well-defined tasks but not realistic tasks?
This seems like an extremely important question, so if you have any further thoughts / intuitions / data to share, I’d be very interested.
Yes. RL will at least be more applicable to well-defined tasks. Some intuitions:
- In my everyday work, the gap between ability on well-defined tasks and ability to work in the METR codebase is growing
- A 4-month doubling time is faster than the rate of progress in most other domains, realistic or not (rough arithmetic sketched after this list)
- Recent models really like to reward hack, which suggests RL can induce behaviors that aren't relevant to realistic tasks
- This trend will break at some point, e.g. when labs get better at applying RL to realistic tasks, or when RL hits diminishing returns, but I have no idea when
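For concreteness, here's a minimal sketch of the arithmetic behind the second point, assuming only the 4-month doubling time mentioned above; the 1-hour starting horizon is a made-up placeholder, not a measured value:

```python
# Minimal sketch: what a 4-month doubling time implies over a year.
# The 4-month figure is the one discussed above; the 1-hour starting
# horizon is a hypothetical placeholder, not a measured value.

doubling_time_months = 4
months_elapsed = 12

growth_factor = 2 ** (months_elapsed / doubling_time_months)  # 2^(12/4) = 8x per year
starting_horizon_hours = 1.0  # hypothetical starting task horizon

print(f"{growth_factor:.0f}x per year: "
      f"{starting_horizon_hours:.0f}h horizon -> "
      f"{starting_horizon_hours * growth_factor:.0f}h horizon after a year")
```

The point is just that modest differences in doubling time compound into large gaps within a year, which is why the well-defined vs. realistic split matters so much.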