Question: why is the adjacency between alignment ideas and capabilities only a one-way relationship? More directly, why can’t this mindset be used to pull alignment gains out of capabilities research?
I think it is two-way, which is why many (almost all?) alignment researchers have spent a significant amount of time looking at ML models and capabilities, and have guesses about where those are going.
Not sure exactly what the question is, but research styles from ML capabilities have definitely been useful for alignment, e.g. the idea of publishing benchmarks.