I also share the intuition that defense is harder than offense.
However, hmmm… What if, e.g., AGI#1 kills a person. “Oh no! They died! Irreparable damage! We can’t bring them back!” Then AGI#2 just brings them back.
Indeed, when the state of the art is capable of fixing a particular kind of damage, then that damage no longer needs to be covered by “weak semi-alignment” (and might just become part of the routine “economics” of the AI ecosystem, if some “property rights” or “costs” are involved)...
So, yes, various taboos can gradually get repealed as technology progresses to be able to undo the consequences of the violations of those taboos...
So basically, are we mainly concerned about the short transitional period during which it’s possible to actually lose things/people/etc. permanently? (And during which we also lack the information needed to reconstruct them?)
At least, there seems to be growing understanding that “safety of the transition period” and “safety in the limit” are very different, and that “the transition period” is particularly uncertain and difficult to understand and might be the period of particularly high vulnerability.
This is why many people are arguing for shorter timelines (in addition to the argument that shorter timelines make a smoother take-off more likely)...
And it does seem that we have already entered the early part of “the transition period”, and that various technology-related risks have been gradually increasing lately (in addition to losing things/people/etc. on a daily basis at “the normal rate” anyway).
The bulk of “AI existential safety research” seems to be mostly about “safety in the limit”. And, in particular, what I have been doing in this area has been mostly about “safety in the limit” (when there is far more cognitive power in our world compared to the present state).
But it would be good to see more studies trying to model the dynamics of “the transition period”, if that’s at all possible...