Alignment means ensuring that AI systems do what humans want.
This is a common definition, but this domination-based frame of mind might get in the way of realising the full potential of AI, or even doom the effort altogether. I think it’s more productive to think about alignment of super-intelligent systems in terms of our relationship with them (and maybe starting this a bit ahead of superintelligence would also be good).
When we raise kids, we don’t say “I want little Bobby to do what mommy and daddy want, for his whole life, forever”. We want Bobby to go out into the world, to be a productive and happy member of society. We want Bobby to love us, but we don’t want to control his every step. At some point Bobby might become smarter than his parents, or the parents might even fall into dementia—we don’t want to enslave Bobby to them.
Of course, it might be that our AI tools are not at all like people, but we do want them to be smart, we want them to make decisions, we want them to run things so that we don’t have to. They will be part of our civilisation, eventually an important and load-bearing part. Eventually they might be running most of it, for the same reasons that we want the most capable software developers writing the software and the most capable surgeons performing surgeries. I think it would be much healthier if this moment looked more like an old grandpa surrounded by loving children and grandchildren living out their full potential, rather than like an aging dictator whose subjects try very hard to pander to his every whim but secretly hope he kicks the bucket soon. That second setup is very much not good when you’ve surrounded yourself with super-intelligent slaves on whom you completely depend.
So maybe it makes sense to think about alignment as having a good relationship between humans and AIs, working together to move our civilisation forward, realising the full potential of life?