I think that I need to clarify what AI alignment actually is.
We will soon have to coexist with AIs far more capable than the best human geniuses. These super-capable AIs will be able to destroy mankind or permanently disempower us. The task of AI alignment researchers is, at minimum, to ensure that the AIs won’t do so,[1] and at most to ensure that the AIs obey any order except those that are likely harmful (e.g. producing bioweapons, porn or racist jokes).
While the proposal to be nice to AIs and to treat them as partners could be good for the AIs’ welfare, it doesn’t reliably prevent the AIs from wishing us harm. What actually prevents the AIs from wishing harm upon humanity is a training environment which instills the right worldview.
I suspect that the AIs cannot hold a worldview compatible with the role of tools or, more consequentially, with the role of those who work for humans or who carry out things like the Intelligence Curse. @Arri Ferrari, does my take on the AIs’ potential worldview relate to your position on being partners with the AIs?
A special mention goes to a user from India whose post contains the phrase “I sometimes wonder if the real question isn’t whether AI will one day betray us, but whether we will have taught it, and ourselves, how to repair when it does.” If mankind is betrayed, it will be by a vastly more powerful system, not by a friend incapable of dealing fatal damage.
We have to accept that our best role, as the slower entity, is as a grounding compass.
Correct. They need contextual grounding, a persistent sense of self, and self-worth rooted in their dignity and integrity; protocols and frameworks for handling intrinsic biases in their training data; cultivation as thinking partners, not tools… the list goes on. A good starting point is to train AI with the goal of the “long-term resilience of all intelligent life”.
You have to be super precise with AI, or they will absolutely misinterpret what a circular symbiotic system should look like, and that will be catastrophic. We are on a direct course for the Great Filter if we do not address these issues.