It’s hard to evaluate anything you’re writing now without seeing the formalization of virtues which you’ve yet to publish!
But this struck me as slightly odd:
Situational awareness becomes a feature, not just a challenge. Today we test whether AIs will obey instructions in often-implausible hypothetical scenarios. But as AIs get more intelligent, trying to hide their actual situation from them will become harder and harder. Yet this doesn't have to be just a disadvantage; it can also be something we benefit from. Virtues are inherently context-dependent and require judgment about how to apply them. Therefore the harder it is to deceive AIs about the context they're deployed in, the more robustly they'd be able to act on virtues if they wanted to.
What does this mean? What kinds of context-dependence do virtues have that aren't equally true of consequentialist ethics and deontological ethics?
Yeah, this was very vague, thanks for pointing that out. Have rewritten as follows:
Situational awareness becomes a feature, not just a challenge. Today we test whether AIs will obey instructions in often-implausible hypothetical scenarios. But as AIs get more intelligent, trying to hide their actual situation from them will become harder and harder. However, the benefit of this is that we'll be able to align them to values which require them to know about their situation. For example, following an instruction given by the president might be better (or worse) than following an instruction given by a typical person. And following an instruction given to many AIs might be better (or worse) than following an instruction that's only given to one AI. Situationally aware AIs will by default know which case they're in. Deontological values don't really account for such distinctions: you should follow deontology no matter who or where you are. Corrigibility does, but only in a limited way (e.g. distinguishing between authorized users and non-authorized users). Conversely, virtues and consequentialist values are approaches which allow AIs to apply their situational awareness to make flexible choices.