Some details that might be relevant (in that I can imagine you’d get different results if the answers changed):
What language(s) is your codebase in?
If it’s an optionally-typed language, how much do you use types?
How big is it?
How familiar are you with it?
How big are the changes you typically have to make, across how many files?
How thorough are the existing tests?
How much of what you do is fixing bugs versus implementing features versus refactoring versus other?
Something like… how interesting is the code you write / how much boilerplate is there? Like I expect different amounts of help from the agent if a task looks like
Copy the 150 lines across five files that were used to implement this other feature, and change a few parameters.
Integrate with this new external API.
Improve performance of serializing and deserializing this data structure, when it’s streamed over a slow connection.
Change the underlying storage of this particular type from a search tree to a hashmap (most places we use the type won’t be affected, but a few places where we access internals need to be updated).
(Or if you think some of these aren’t relevant, I’m interested to hear that too.)
Some details that might be relevant (in that I can imagine you’d get different results if the answers changed):
What language(s) is your codebase in?
If it’s an optionally-typed language, how much do you use types?
How big is it?
How familiar are you with it?
How big are the changes you typically have to make, across how many files?
How thorough are the existing tests?
How much of what you do is fixing bugs versus implementing features versus refactoring versus other?
Something like… how interesting is the code you write / how much boilerplate is there? Like I expect different amounts of help from the agent if a task looks like
Copy the 150 lines across five files that were used to implement this other feature, and change a few parameters.
Integrate with this new external API.
Improve performance of serializing and deserializing this data structure, when it’s streamed over a slow connection.
Change the underlying storage of this particular type from a search tree to a hashmap (most places we use the type won’t be affected, but a few places where we access internals need to be updated).
(Or if you think some of these aren’t relevant, I’m interested to hear that too.)