O O comments on 2025 in AI predictions

O O 4 Jan 2026 9:13 UTC
2 points
−2
Evaluation: Correct. AI can perhaps pass the reading comprehension task. But not any 4 of the tasks.
“Reliably construct bug-free code of more than 10,000 lines from natural language specification or by interactions with a non-expert user. [Gluing together code from existing libraries doesn’t count.]”

Also, Opus 4.5 can probably pass this one. (10,000 lines of code is not a lot in some languages.)
- jessicata 5 Jan 2026 1:55 UTC
  4 points
  1
  1. Merely producing 10K lines is not impressive in itself. I’m thinking of software that would take 10K lines if written by humans.
  2. “Bug-free” and “reliably” together are a high bar!
- Josh Snider 6 Jan 2026 14:19 UTC
  2 points
  0
  I agree Opus can do this with an expert user, but non-expert users might have to wait one or two more models.