Opus 4.6 has been notably better at correcting me. Today, there were two instances where it pushed back on my plan, and proposed better alternatives both times. That was very rare with Opus 4.5 (maybe once a week?) and nonexistent before 4.5
I move data around and crunch numbers at a quant hedge fund. There are some aspects that make our work somewhat resistant to LLMs normally: we use a niche language (Julia) and a custom framework. Typically, when writing framework related code, I’ve given Claude Code very specific instructions and it’s followed them to the letter, even when those happened to be wrong.
In 4.6, Claude seems to finally “get” the framework, searching the codebase to understand its internals (as opposed to just understanding similar examples) and has given me corrections or pushback – e.g. it warned me (correctly) about cases where I had an unacceptably high chance of hash collisions, and said something like “no, the bug isn’t X, it’s Y” (again correctly) when I was debugging.
I don’t think so – my CLAUDE.md is fairly short (23 lines of text) and consists mostly of code style comments. I also have one skill for set up for using Julia via a REPL. But I don’t think either of these would result in more disagreement/correction.
I’ve used Claude Code in mostly the same way since 4.0, usually either iteratively making detailed plans and then asking it to check off todos one at a time, or saying “here’s a big, here’s how to reproduce it, figure out what’s going on.”
I also tend to write/speak with a lot of hedging, so that might make Claude more likely to assume my instructions are wrong.
Opus 4.6 has been notably better at correcting me. Today, there were two instances where it pushed back on my plan, and proposed better alternatives both times. That was very rare with Opus 4.5 (maybe once a week?) and nonexistent before 4.5
What sort of project?
I move data around and crunch numbers at a quant hedge fund. There are some aspects that make our work somewhat resistant to LLMs normally: we use a niche language (Julia) and a custom framework. Typically, when writing framework related code, I’ve given Claude Code very specific instructions and it’s followed them to the letter, even when those happened to be wrong.
In 4.6, Claude seems to finally “get” the framework, searching the codebase to understand its internals (as opposed to just understanding similar examples) and has given me corrections or pushback – e.g. it warned me (correctly) about cases where I had an unacceptably high chance of hash collisions, and said something like “no, the bug isn’t X, it’s Y” (again correctly) when I was debugging.
I’m surprised to see this given the difficulty I’ve had with 4.6. Did you do anything to elicit this behavior?
I don’t think so – my CLAUDE.md is fairly short (23 lines of text) and consists mostly of code style comments. I also have one skill for set up for using Julia via a REPL. But I don’t think either of these would result in more disagreement/correction.
I’ve used Claude Code in mostly the same way since 4.0, usually either iteratively making detailed plans and then asking it to check off todos one at a time, or saying “here’s a big, here’s how to reproduce it, figure out what’s going on.”
I also tend to write/speak with a lot of hedging, so that might make Claude more likely to assume my instructions are wrong.