My biggest question is “did the participants get to multitask?”
The paper suggests “yes”:
Developers then work on their assigned issues in their preferred order—they are allowed to flexibly complete their work as they normally would, and sometimes work on multiple issues at a time.
[...]
Developers self-report the length of time they spend working on each issue before and after PR review.
...but doesn’t really go into detail about how to navigate the issues I’d expect to run into there.
The way I had previously believed I got the biggest speedups from AI on more developed codebases makes heavy use of the paradigm “send one agent to take a stab at a task that I think an agent can probably handle, then go focus on a different task in a different copy/branch of my repo.” (In some cases, when all the tasks are pretty doable-by-AI, I’m more like a “micromanager,” rotating between 3 agents working on 3 different tasks. In cases where there’s a task that requires basically my full attention, I’ll usually have one major task, but periodically notice small side-issues that seem like something an LLM can handle.)
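For concreteness, here’s a minimal sketch of what that setup can look like mechanically. The `git worktree` commands are real; the `run_agent` CLI is a hypothetical stand-in for whatever coding agent you actually use, and the task names are made up:

```python
import subprocess
from pathlib import Path

# Main checkout, where I do my own focused work.
REPO = Path("~/my-project").expanduser()

def spawn_agent_on_branch(task_name: str, prompt: str) -> subprocess.Popen:
    """Give the agent its own worktree + branch, then start it in the background."""
    worktree = REPO.parent / f"my-project-{task_name}"
    # Real git command: a separate checkout on its own branch, isolated from mine.
    subprocess.run(
        ["git", "-C", str(REPO), "worktree", "add",
         "-b", f"agent/{task_name}", str(worktree)],
        check=True,
    )
    # Hypothetical agent CLI -- swap in your actual tool here.
    return subprocess.Popen(["run_agent", "--cwd", str(worktree), "--task", prompt])

# Hand a side issue to an agent, then go back to the main task in REPO.
agent = spawn_agent_on_branch("fix-flaky-test", "Make test_retry_logic deterministic")
# ... focused work of my own happens here ...
agent.wait()  # later: review the agent's branch before merging
```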
It seems like that sort of workflow is technically allowed by this paper’s process, but not super encouraged. (Theoretically you record your screen the whole time and “sort out afterwards” when you were working on various projects and when you were zoned out.)
I still wouldn’t be too surprised if I were slowed down on net, because I don’t actually stay proactively focused on the above workflow and instead zone out slightly, spend a lot of time trying to get AIs to do work they aren’t actually good at yet, or take longer to review the results.