Adele Lopez comments on ryan_greenblatt’s Shortform

Adele Lopez 14 Mar 2026 21:04 UTC
2 points
0
Sure, it can be evidence of bad (or good) things, but that’s different from whether it’s safer in and of itself. For me, it’s a positive update that Satisficers might be more natural than Maximizers.

For me, it seems really obviously the case that something that gets tired is less dangerous than something that doesn’t, all else equal.

What is your threat model?
- ryan_greenblatt 15 Mar 2026 4:27 UTC
  4 points
  0
  Parent
  I think current AIs having this property is probably slightly differentially harmful for harder-to-check tasks and generally contributes to underelicitation. I don’t have a very strong view on the sign of general underelicitation in current models, but I tentatively think underelicitation is slightly bad overall.