I’m starting to think Claude is already superhuman at this part.
This is a claim that I find hard to evaluate.
On the one hand, Claude is better than me in a bunch of ways. It knows more without having to look it up. It works faster and without getting tired. It can even work in parallel in ways I can’t. So in all those ways it’s a better coder than me.
But if I look at the individual output, it’s clear that Claude is only at best as good as a P90 human. It’s not really able to come up with clever, Carmack-or-Knuth-level solutions to problems. Heck, sometimes it can’t even come up with as good a solution as I can! What it can do, though, is just keep applying above-median coding expertise persistently to a problem when directed to do so.
The main gap I see between Claude and being superhuman is its lack of taste. The main way my coding sessions go off the rails is that Claude gets hung up on an idea, runs with it, and doesn’t have the judgement to realize it was a bad idea or to make itself step back and look for a better solution (because its notion of what’s “better” is fairly limited).
Now none of this is to dispute that what Claude can do is really, really cool, and should dramatically increase productivity, since it can do what you previously would have needed a human to do, and the human would have been slower even if you paid them more than you pay Claude. So in some limited sense that’s “superhuman”, but I mostly think we should reserve “superhuman” for when Claude can do something humans cannot, modulo time and cost.
I basically agree with this. When I say Claude is superhuman at coding, I mean that when Claude knows what needs to be done, it does it about as well as a human but much faster. When I say Claude isn’t superhuman at software engineering in general, it’s because sometimes it doesn’t take the right approach when an expert software engineer would.
I consider Claude’s “taste” to be pretty good, usually, but not P90 of humans with domain experience. I’d characterize its deficiencies more along the lines of a lack of ability to do long-term “steering” at a human level. This is likely related to a lack of long-term memory and hence the ability to do continual learning.