(A very natural-seeming extension of this point is “build a general-purpose optimization system to improve the world” → “whoops it develops independent agency and kills everyone / is stolen from you by a sociopath who installs a totalitarian dictatorship”. It always amuses me when the object-level and the meta-level dynamics mirror each other.)
Seconded. I think there is something small-scale-Pythian-ish going on here.
One way to frame this: a “general-purpose optimization system (that can be used) to improve the world” needs to be strongly retargetable, and the simplest/cheapest/default-est ways to build such a system make it also easily corruptible. It becomes susceptible to something like “adversarial inputs”, both from the inside (“develops independent agency and kills everyone”) and from the outside (corrupted by external actors, or just “mundane” context disasters).