I see, so it’s basically assuming that problems factor.
Yeah, in the context of a larger alignment scheme, it’s assuming that in particular the problem of answering the question “How good is the AI’s proposed action?” will factor down into sub-questions of manageable size.