‘ai control’ seems like it just increases p doom, right?
since it obviously won't scale to agi/asi, and will just reduce the warning shots and the financial incentives to invest in safety research, make it more economically viable to deploy misaligned ais, etc
and there's buzztalk about using misaligned ais to do alignment research, but the incentives to actually do that don't seem to be there, and the research itself doesn't seem to be happening. as in, research that actually gets closer to a mathematical proof of a method to align a superintelligence such that it won't kill everyone