I didn’t mean “fundamentally” as a “terminal goal”, I just meant that the AI would prefer to do something if the consequence it was afraid of was somehow mitigated. Just like how most people would not pay their taxes if tax-collection was no longer enforced—despite money/tax-evasion not being terminal goals for most people.
That might be possible (though I suspect it’s very difficult), and if you have an idea on how to do so I would encourage you to write it up! But the problem is that the AI will still seek to reduce this uncertainty until it has mitigated this risk to its satisfaction. If it’s not smart enough to do this and thus won’t self-improve, then we’re just back to square one, where we’ll crank on ahead with a smarter AGI until we get one that is smart enough to solve this problem (for its own sake, not ours).
Another problem is that even if you did make an AI like this, that AI is no longer maximally powerful relative to what is feasible. In the AI arms race we’re currently in, which seems likely to continue for the foreseeable future, there’s a strong incentive for an AI firm to remove or weaken whatever is causing this behavior in the AI. Even if one or two firms are wise enough to resist, it’s extremely tempting for the next firm to look at that behavior and decide to weaken it in order to get ahead.
Do we need a maximally powerful AI to prevent that possibility, or would an AI just smart and powerful enough to identify such firms and take them down (or make them change their ways) suffice?
That would essentially be one form of what’s called a pivotal act. The tricky thing is that doing something to decisively end the AI-arms race (or other pivotal act) seems to be pretty hard, and would require us to think of something a relatively weaker AI could actually do without also being smart and powerful enough to be a catastrophic risk itself.
There’s also some controversy as to whether the intent to perform a pivotal act would itself exacerbate the AI arms race in the meantime.
A pivotal act does not have to be something sudden, drastic, and illegal, as in the second link. It could be a gradual process: making society intolerant of unsafe(r) AI experiments and research, giving people a better understanding of why AI can be dangerous and what it can lead to, making people more tolerant of and aligned with one another, etc. That could starve rogue companies of workforce and resources, and ideally shut them down. I think work in that direction can be accelerated by AI and other information technologies we have even now.
The question is whether we have time for “gradual”.