Second, 2024 AI is specifically trained on short, clear, measurable tasks. Those tasks also overlap heavily with legible stuff: stuff that's easy for humans to check. In other words, these systems are, in a sense, specifically trained to trick your sense of how impressive they are. They're trained on the legible stuff, with not much constraint on the less-legible stuff (and in particular, on the stuff that only becomes legible through total failure on more difficult / longer time-horizon tasks).
I did give a response in that comment thread. Separately, I think that’s not a great standard, e.g. as described in the post and in this comment https://www.lesswrong.com/posts/i7JSL5awGFcSRhyGF/shortform-2?commentId=zATQE3Lhq66XbzaWm :
In fact, in real life we constantly make judgements about things that we couldn't describe in terms that would count as well-operationalized by betting standards; we rely on these judgements, and we largely endorse relying on them. E.g. inferring intent in criminal cases, deciding whether something is interesting or worth doing, etc. I should be able to just say "but you can tell that these AIs don't understand stuff", and then we can have a conversation about that, without me having to predict a minimal example that is operationalized enough for you to be forced to recognize it as judgeable, and that also won't happen to be surprisingly well-represented in the data, or surprisingly easy to do without creativity, etc.
(Yeah, you responded, but it felt not that operationalized, and it seemed doable to flesh out, as you did.)