Thanks for updating! I’m having trouble making the leap from coding-task performance to actual software engineering. The latter involves a lot more than completing individual tasks, and I think any estimate of when models will be able to recursively self-improve that is based on the former will be flawed. Do you think there’s a certain target on SWE-bench and METR’s coding time horizon that, when hit, will mark approximately when AI can completely take over software engineering?