I think differing views about the extent to which future powerful AIs will deeply integrate their superhuman abilities, versus having those abilities shallowly attached, partly drive some disagreements about misalignment risk and what takeoff will look like.
I think this might be wrong when it comes to our disagreements, because I don’t disagree with this shortform.[1] Maybe a bigger crux is how valuable (1) [shallow attachment] is relative to (2) [deep integration]? Or the extent to which (2) is more helpful for scientific progress than (1)?
[1] As long as “downstream performance” doesn’t include downstream performance on tasks that themselves involve a bunch of integrating/generalising.
I don’t think this explains our disagreements. My low-confidence guess is that we have reasonably similar views on this. But I do think it drives parts of some disagreements between me and people who are much more optimistic than me (e.g. various not-very-concerned AI company employees).
I agree the value of (1) vs (2) might also be a crux in some cases.
Is the crux that the more optimistic folks plausibly agree (2) is cause for concern, but believe that mundane utility can be reaped with (1), and they don’t expect us to slide from (1) into (2) without noticing?