> Here’s the graph of human vs. Claudes with the y-axis being step count (i.e. number of button presses, I think)
For this graph, a Claude step isn’t a single button press; it’s one round of asking Claude what to do next, and Claude can issue a couple of button presses per step if it decides to.
Thanks! Got a sense of how many button presses happen on average per step? That would be helpful for making an apples-to-apples comparison.
I’d guess about 2.5. Plenty of times it just presses one button and then waits to see the result; the longer steps are where it navigates to a spot on the screen (very common) or scrolls up/down through a menu (uncommon).
If either of us got the logs from the dev, a firm average would be easy to calculate.
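For what it's worth, here is a rough sketch of that calculation in Python. I haven't seen the dev's logs, so the format is purely an assumption: each step is taken to be a list of the buttons pressed in that step. The function names and the sample data are made up for illustration; only the 2.5 figure comes from my guess above.

```python
from statistics import mean

def presses_per_step(steps):
    """Average number of button presses per step.

    `steps` is a list of steps, each a list of button names.
    This layout is an assumption, not the real log format.
    """
    return mean(len(buttons) for buttons in steps)

def steps_to_presses(step_count, avg_presses=2.5):
    """Convert a Claude step count to an estimated button-press count.

    The default of 2.5 is a guess, not a measured value.
    """
    return step_count * avg_presses

# Made-up sample: two single-press steps and two longer navigation steps.
sample_steps = [
    ["A"],
    ["UP", "UP", "LEFT", "A"],
    ["B"],
    ["DOWN", "DOWN", "DOWN", "A"],
]

avg = presses_per_step(sample_steps)
estimated = steps_to_presses(1000, avg)
```

With real logs you would swap `sample_steps` for the parsed data, then multiply Claude's step counts on the graph by the measured average to compare against human button presses.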