A strong motivating aspect of the study is measuring AI R&D acceleration. I am somewhat wary of using this methodology to find negative evidence for this kind of acceleration happening at labs:
I still believe that using AI agents productively is a skill, despite the graphs in the paper showing no learning effects. One kind of company filled with people who know a lot about how to prompt AIs, and about their limitations, is an AI lab. Even if most developers in the study saw no benefit, lab employees plausibly would.
The mean speedup/slowdown can be a misleading metric: the heavy tail of research impact, plus the feedback loops around AI R&D, mean that just one subgroup with a high positive speedup could have a big impact even if the average developer is slowed down.
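A toy calculation makes this concrete (all numbers invented for illustration): if most developers are slowed down but a small subgroup is much faster on the rare, high-impact work, the mean speedup can be below 1 while total impact still grows.

```python
# Hypothetical numbers, purely illustrative: 95 "typical" developers
# and 5 in a high-speedup subgroup working on heavy-tail tasks.
n_typical, n_tail = 95, 5
typical_speedup = 0.8   # most developers: 20% slowdown
tail_speedup = 3.0      # small subgroup: 3x faster
typical_impact = 1.0    # baseline impact per typical task
tail_impact = 50.0      # heavy tail: rare tasks matter far more

# Average speedup across developers (what the study's metric tracks).
mean_speedup = (n_typical * typical_speedup
                + n_tail * tail_speedup) / (n_typical + n_tail)

# Impact-weighted throughput: speedup multiplies output on each task type.
baseline_impact = n_typical * typical_impact + n_tail * tail_impact
accelerated_impact = (n_typical * typical_speedup * typical_impact
                      + n_tail * tail_speedup * tail_impact)

print(mean_speedup)                          # 0.91 — average dev is slowed
print(accelerated_impact / baseline_impact)  # ~2.39 — total impact rises
```

The mean says "slowdown" while the impact-weighted total says "large acceleration"; which one matters depends on how heavy the tail of research impact really is.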
Reading a recent account from an employee who left OpenAI, the dev experience there also sounds pretty dissimilar. To summarize: OpenAI repos are large (which matches the study setting), but people don't have a great understanding of the full repo (it is a large monorepo with many new people joining), and there do not seem to be uniform code guidelines.
Great study!