I have a different intuition here; I would much prefer the alignment team at e.g. DeepMind to be working at DeepMind rather than at some “alignment-only” outfit. My guess is that an alignment team can have a non-negligible influence on a capabilities org, in the form of:
- The alignment team interacting with other staff, either casually in the office or by e.g. running internal workshops open to all staff (as DeepMind apparently does)
- The org consulting the alignment team (e.g. before releasing models or starting dangerous projects)
- Staff working on raw capabilities having somewhere easy to go if they want to shift to alignment work
I think these benefits likely outweigh the influence in the other direction (such as the value drift from having economic or social incentives tied to capabilities work).
My sense is that this “they’ll encourage higher-ups to think what they’re doing is safe” thing is a meme. Misaligned AI, for people like Yann LeCun, is not even a consideration; they think it’s stupid, uninformed fearmongering. We’re not even near the point Philip Morris is at, where tobacco execs have to plaster their webpage with “beyond tobacco” slogans to feel good about themselves. Demis Hassabis literally does not care, even a little bit, and adding alignment staff will not affect his decision-making whatsoever.
But shouldn’t we just ask Rohin Shah?
Even a little bit? Are you sure? https://www.lesswrong.com/posts/ido3qfidfDJbigTEQ/have-you-tried-hiring-people?commentId=wpcLnotG4cG9uynjC