I think similar threat models and similar lines of reasoning might also be useful with respect to (potentially misaligned) ~human-level/not-strongly-superhuman AIs, especially since more complex tasks seem to require more intermediate outputs (that can be monitored).
We strongly agree; see our recent work. As I state in the post, “I think the style of work I discuss here has good transfer with the AI control approach.” We have a forthcoming post explaining AI control in more detail.