Daniel Kokotajlo comments on ryan_greenblatt’s Shortform

Daniel Kokotajlo 26 Mar 2026 19:40 UTC
9 points
1
I disagree. Humans who do research projects well have usually had experience doing at least a few research projects of similar length before. Indeed, when I was a grad student teaching philosophy courses, it seemed to me that part of the problem was that the college students I was teaching had only ever written short 3-paragraph-style essays before, and had never done serious lengthy writing projects that required serious reading and thinking.

It’s true that humans need only a few examples of the thing to get decently good at it, but I think that’s just generic human data-efficiency at work rather than anything specific to long-horizon tasks.
- ryan_greenblatt 26 Mar 2026 21:24 UTC
  3 points
  0
  Sure, but the training data thing is sort of tied up with fluid intelligence? Like a method you have available as a human is learning on the task itself and AIs also have this option, but due to garbage sample efficiency it wouldn’t work.
  
  I also still think humans can often generalize to doing tasks of larger scope without prior experience by learning/adapting from this exact experience in a way that AIs don’t.
  - Vladimir_Nesov 26 Mar 2026 22:45 UTC
    7 points
    0
    
    “Like a method you have available as a human is learning on the task itself and AIs also have this option, but due to garbage sample efficiency it wouldn’t work.”
    
    Sample efficiency for RLVR is plausibly good enough if relevant tasks and RL environments can be automatically formulated, which is currently out of reach, though it’s unclear how long it will stay that way. And for in-context learning, sample efficiency could get higher if it worked well enough (as learned learning, it’s in principle not constrained by what hardcoded learning can do); it just doesn’t work well with pretraining.
  - Daniel Kokotajlo 26 Mar 2026 21:33 UTC
    6 points
    0
    I think I was responding to an earlier version of your comment. Now that it says “I half agree”, I think we are mostly on the same page. I also agree that long-horizon fluid intelligence / sample-efficient online learning seems to be a real thing that differs between humans and AIs.