Sure, it would be insanely dangerous; it’s basically an AI for hacking. However, if we don’t build it then someone much less pro-social than us certainly will (and probably within the next 10 years), so I figure the only option is for us to get there first. It’s not a choice between someone making it and no-one making it, it’s a choice between us making it and North Korea making it.
In the face of existential risks from AI, whether or not the builder of a dangerous AI is more “prosocial” by some standard of prosociality doesn’t really matter: the point of existential risk is that the good guys can also lose. Under such a calculus, there’s no benefit to trying to beat someone else to building the same thing, since beating them just destroys the world faster and cuts off time that might have been used to do something safer.
Further, races are self-fulfilling prophecies: if we don’t think there is a race, then there won’t be one. So all around we are better off avoiding things that advance capabilities research, especially things that rapidly advance it in directions not clearly aligned with human flourishing.
Have you thought much about the safety/alignment aspects of this approach? This seems very susceptible to Goodharting.