I really enjoyed this study. I wish it weren’t so darn expensive, because I would love to see a dozen variations of this.
I still think I’m more productive with LLMs since Claude Code + Opus 4.0 (and have reasonably strong data points), but this does push me further in the direction of using LLMs only surgically rather than for everything, and towards recommending relatively restricted LLM use at my company.