Maybe rather than ‘different paths’ Paul just means that capabilities can come from more-powerful-LMs or more-sophisticated-agent-scaffolding. He says:
> at a fixed level of capability, I think the more we are relying on LM agents (rather than larger LMs) the safer we are.
I buy something like this, at least. But (I weakly intuit) we'll almost exclusively be relying on LM agents rather than mere next-token-predictors by default anyway; there's no need to boost LM agents. And even if relying on LM agents is good, it doesn't follow that marginal improvements in LM agents' sophistication/complexity are safer than marginal improvements in underlying-LM capability. (I don't have a take on this—just flagging it as a crux.)
My guess is that if you hold capability fixed and make a marginal move in the direction of (better LM agents) + (smaller LMs) then you will make the world safer. It straightforwardly decreases the risk of deceptive alignment, makes oversight easier, and decreases the potential advantages of optimizing on outcomes.