we are deliberately seeking to build certain kinds of minds
I think “deliberately seeking to build” is the wrong way to frame the current paradigm: we’re growing the AIs through a process we don’t fully understand, while trying to steer the external behaviour in the hope that it corresponds to desirable mind structures.
If we were actually building the AIs, I would be much more optimistic about them coming out friendly.
Not fully understanding things is the default … even non-AI software can’t be fully understood if it is complex enough. We already know how to probe systems we don’t understand a priori, through scientific experimentation. You don’t have to get alignment right the first time, at least not without the foom/RSI or incorrigibility assumptions.
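To make “probe through scientific experimentation” concrete, here is a minimal sketch of black-box behavioural probing, treating the system purely as an input/output function. The `model` stub, the prompts, and the “comply”/“refuse” outcomes are hypothetical stand-ins, not any real API:

```python
import random

def model(prompt: str) -> str:
    # Hypothetical stand-in for an opaque system; in practice this would
    # be a query to a deployed model whose internals we cannot inspect.
    return "refuse" if "locked" in prompt else random.choice(["comply", "refuse"])

def probe(prompts: list[str], trials: int = 200) -> dict:
    """Estimate behavioural frequencies by repeated controlled querying."""
    results = {}
    for p in prompts:
        outcomes = [model(p) for _ in range(trials)]
        results[p] = {o: outcomes.count(o) / trials for o in set(outcomes)}
    return results

# Compare a baseline prompt against a perturbed variant: if the response
# distribution shifts, we have learned something about the black box
# without ever opening it up.
print(probe(["open the door", "open the locked door"]))
```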
The difference with normal software is that at least somebody understands every individual part, and if you collected all those somebodies and locked them in a room for a while they could write up a full explanation. Whereas with AI I think we’re not even like 10% of the way to full understanding.
Also, if you’re trying to align a superintelligence, you do have to get it right on the first try; otherwise it kills you with no counterplay.
That has not been demonstrated.
(“Gestures towards IABIED”)
(“Gestures towards critiques thereof”)