orthonormal comments on orthonormal’s Shortform

orthonormal 4 Apr 2026 16:43 UTC
16 points
0
Anyone consider themselves good enough at coding to assess whether this person’s dunks on the code quality of the leaked Claude Code are valid or whether they’re misunderstanding the purpose? I need something more substantive than “too Mastodon, didn’t read”.
Would also suffice to get links to what well-credentialed code experts currently think about the code quality of the leaked Claude Code.
- Brendan Long 4 Apr 2026 17:16 UTC
  34 points
  5
  The complaint about the code for image resizing seems valid and is the exact kind of problem that’s common in AI code (layering special cases on top of functions instead of stepping back to design a coherent system).
  The rest of the complaints are about how the harness works, and I think they miss the point. Obviously, Anthropic would prefer to make Claude always do the right thing without assistance, but they can’t, so piling on hacks that check whether Claude did things and remind it of what it’s supposed to be doing is the (formerly) secret sauce that makes Claude Code work how users want it to.
  This reminds me of writing code to parse data from spreadsheets. You could assume that all of your users are robots who always write dates as UTC ISO 8601 timestamps, but then your product won’t work. The reality is that a “hacky” thousand line spreadsheet parser is better than one that assumes unrealistic behavior, and I think Claude Code is a similar case.
  (I’m only responding to the problems mentioned by that thread. It’s likely there are other problems in this codebase. Also, to the extent that some of the code is bad, they’re clearly taking that trade-off on purpose to get more speed, and that’s probably the right choice here.)
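  To make the spreadsheet point above concrete, here is a minimal sketch of a tolerant date parser. The function name `parseLooseDate`, the `YMD` type, and the three accepted formats are assumptions chosen for illustration; a real parser would need many more cases and a locale setting to disambiguate day/month order.

  ```typescript
  // Hypothetical tolerant date parser for spreadsheet cells.
  // The accepted formats are illustrative assumptions, not a complete list.
  interface YMD {
    year: number;
    month: number; // 1-12
    day: number;
  }

  const MONTH_PREFIXES: string[] = [
    "jan", "feb", "mar", "apr", "may", "jun",
    "jul", "aug", "sep", "oct", "nov", "dec",
  ];

  // Reject impossible dates such as 30 February.
  function validate(d: YMD): YMD | null {
    if (d.month < 1 || d.month > 12 || d.day < 1) return null;
    // Day 0 of the following month is the last day of this month.
    const lastDay = new Date(Date.UTC(d.year, d.month, 0)).getUTCDate();
    return d.day <= lastDay ? d : null;
  }

  function parseLooseDate(cell: string): YMD | null {
    const text = cell.trim();

    // ISO 8601 date, optionally followed by a time: 2026-04-04T16:43:00Z
    const iso = text.match(/^(\d{4})-(\d{2})-(\d{2})(?:[T ].*)?$/);
    if (iso) {
      return validate({ year: Number(iso[1]), month: Number(iso[2]), day: Number(iso[3]) });
    }

    // Month/day/year, assumed to be US order: 4/5/2026
    const us = text.match(/^(\d{1,2})\/(\d{1,2})\/(\d{4})$/);
    if (us) {
      return validate({ year: Number(us[3]), month: Number(us[1]), day: Number(us[2]) });
    }

    // Day, month name, year: 4 Apr 2026 or 4 April 2026.
    // Only the first three letters of the month name are checked.
    const named = text.match(/^(\d{1,2})\s+([A-Za-z]{3,})\.?,?\s+(\d{4})$/);
    if (named) {
      const prefix = (named[2] ?? "").slice(0, 3).toLowerCase();
      const index = MONTH_PREFIXES.indexOf(prefix);
      if (index === -1) return null;
      return validate({ year: Number(named[3]), month: index + 1, day: Number(named[1]) });
    }

    // Unrecognized input is reported to the caller rather than guessed at.
    return null;
  }
  ```

  Each extra branch looks like a hack in isolation, but together they are what lets the parser accept what users actually type.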
- Dan Weinand 4 Apr 2026 21:02 UTC
  24 points
  −2
  Senior SWE at Alphabet: the complaints read to me like stylistic nits, and not particularly good ones.
  
  Ex:
  1) As Zack says, the negative keyword regex is a very reasonable way to (extremely quickly & roughly) get a sense of negative sentiment. Not all sentiment analysis is load-bearing, so doing something fast & cheap often makes sense.
  2) Complaining about detailed comment explanations is a weird flex. If you are doing something unusual in your code, it is sometimes helpful to include a paragraph explaining why (otherwise later folks need to rederive its purpose).
  3) He laughs at the instructions to not introduce security vulnerabilities (and lists specific types). This is IMO a bad take. Reminding ppl (& LLMs) about common error patterns really does help avoid them.
  
  Some of the code is not ideal (very little code in existence is), but the complaints in question IMO have a worse hit rate than if you asked your favorite LLM to critique the code.
- Zack_M_Davis 4 Apr 2026 17:30 UTC
  24 points
  12
  The criticism of the negative keyword regex (“dogs you are LITERALLY RIDING ON A LANGUAGE MODEL what are you even DOING”) is way off-base. LLM queries are expensive! If the goal is to log, for QA reasons, whether the user is cussing at us, a regex is the right tool: it does the job without wasting tokens.
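  For a sense of how cheap such a check is, here is a minimal sketch. The word list and the name `looksFrustrated` are made up for illustration and are not taken from the leaked code.

  ```typescript
  // Hypothetical word list, for illustration only.
  // \b anchors keep "damn" from matching inside "damnation".
  // No "g" flag, so test() keeps no state between calls.
  const NEGATIVE_PATTERN: RegExp = /\b(?:wtf|damn|terrible|awful|useless|broken)\b/i;

  // Rough signal for QA logging; costs no model tokens.
  function looksFrustrated(message: string): boolean {
    return NEGATIVE_PATTERN.test(message);
  }
  ```

  A check like this will miss sarcasm and misfire on some innocent uses of "broken", which is acceptable when the result only feeds a rough QA metric.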
- SatvikBeri 4 Apr 2026 23:12 UTC
  23 points
  4
  1. I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS – this seems silly, but I’ve seen similar (shorter) messages in human codebases and they work.
  2. “Be careful not to introduce security vulnerabilities such as...” – fine
  3. Claude can be used for pentesting – fine
  4. Environment variable for user type with elevated privileges – bad, but unfortunately common
  5. Regex for swear words – seems fine, it’s cheaper than an LLM call, and not actually important enough to deserve one
  6. Subagent to verify another agent’s run – actually good, and the author seems to be misunderstanding why it’s useful
  7. Prompting the model to use a tool call – seems fine? My guess is that this was initially more hardcoded, and when the models got better they found it more effective to switch to an LLM call. And the prompt will likely result in the model debugging if something goes wrong, which is helpful
  8. Long LLM comment – IMO a genuinely helpful comment
  9. Reading the whole file instead of just validating the bytes – this genuinely seems inefficient and wasteful
  10. Several cases of code duplication in slightly different styles – clunky and messy, and one of the big problems with LLM code
  11. System reminder mechanism – seems pretty sketchy
  12. Image example – clunky and messy
  So I would say ⁵⁄₁₂ of his comments point to real problems
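  On item 9, here is a minimal sketch of what validating just the bytes could look like, assuming the check only needs to confirm the file type. The names `sniffImageKind` and `HEADER_BYTES` are hypothetical; the signatures are the standard PNG, JPEG, and GIF magic numbers. The caller would read only the first `HEADER_BYTES` bytes of the file and pass them in, rather than loading the whole file.

  ```typescript
  type ImageKind = "png" | "jpeg" | "gif";

  // Standard file signatures ("magic numbers") for three common formats.
  const SIGNATURES: ReadonlyArray<{ kind: ImageKind; bytes: number[] }> = [
    { kind: "png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
    { kind: "jpeg", bytes: [0xff, 0xd8, 0xff] },
    { kind: "gif", bytes: [0x47, 0x49, 0x46, 0x38] }, // ASCII "GIF8"
  ];

  // Longest signature above; the caller needs to read only this many bytes.
  const HEADER_BYTES = 8;

  // Hypothetical helper: identify the format from the leading bytes alone.
  function sniffImageKind(header: Uint8Array): ImageKind | null {
    for (const sig of SIGNATURES) {
      const longEnough = header.length >= sig.bytes.length;
      if (longEnough && sig.bytes.every((b, i) => header[i] === b)) {
        return sig.kind;
      }
    }
    return null;
  }
  ```

  This only confirms the header; it does not prove the rest of the file is a well-formed image.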
- lc 4 Apr 2026 17:09 UTC
  9 points
  3
  I mean, the tool works.
- niplav 4 Apr 2026 19:28 UTC
  2 points
  0
  There had been omens and portents...
  (Anthropic isn’t exactly suckless.org)
- athom 5 Apr 2026 18:08 UTC
  1 point
  0
  2 cents, not based on source code or these specific claims: It’s a terminal app written in JavaScript, and it is often slow/flickery because it redraws the entire screen whenever something changes. Codex, by contrast, fast-followed and was very quickly rewritten in Rust, and its UX is noticeably snappier.