David Scott Krueger comments on Common misconceptions about OpenAI

David Scott Krueger 29 Aug 2022 16:24 UTC
LW: 5 AF: 2
- When you say “good things to keep an AI safe” I think you are referring to a goal like “maximize capability while minimizing catastrophic alignment risk.” But in my opinion “don’t give your models access to the internet or anything equally risky” is a bad way to make that tradeoff. I think we really want dumber models doing more useful things, not smarter models that can do impressive stuff with less resources. You can get a tiny bit of safety by making it harder for your model to have any effect on the world, but at the cost of significant capability, and you would have been better off just using a slightly dumber model with more ability to do stuff. This effect is much bigger if you need to impose extreme limitations in order to get any of this “boxing benefit” (as claimed by the quote you are objecting to).
I don’t think the choice is between “smart and boxed” and “less smart and less boxed”. Intelligence (especially domain knowledge) is not one-dimensional, and boxing is largely a means of controlling what kind of knowledge the AI has. We might prefer AI savants that are super smart about some task-relevant aspects of the world and ignorant about a lot of other strategically relevant aspects of the world.