As a professional software developer with 20+ years of experience who has repeatedly tried to use AI coding assistants and gotten consistently poor results, I am skeptical of even your statement that, “The average over all of Anthropic for lines of merged code written by AI is much less than 90%, more like 50%.” 50% seems way too high. Or, if it is that high, then most of that code is extraneous changes that aren’t part of the core code that executes. For example, I’ve seen what I believe to be AI-generated code where 3⁄4 of the API endpoints are unused. They exist just because the AI assumes that the REST endpoint for each entity ought to have all the actions, even though that didn’t make sense in this case.
I think there is a natural tendency for AI enthusiasts to overestimate the amount of useful code they are getting from AI. If we were going to make any statements about how much code was generated by AI at any organization, I think we would need much better data than I have ever seen.
50% is pretty plausible to me; my own percentage probably hit that a few months ago.
I do think there’s a true sense in which AI code line counts are propped up by “extraneous” code—comments, extra-defensive error handling, and above all, tests. Especially for small functional changes, I’ve seen many cases where the “extraneous” code has 5x the line count of the “real” code.
But I’d still argue that the “extraneous” code typically provides real, if marginal, value.
In case you haven’t seen it, this post is by a professional developer with experience comparable to yours (30+ years) who gets a lot of mileage out of pair programming with Claude Code in building “a pretty typical B2B SaaS product.” They credit their productivity boost to the intuitions built up over that extensive experience, which enable effective steering. I’d be curious to know your guesses as to why your experience differs.