I see lots of LW posts about AI alignment that disagree along one fundamental axis.
About half assume that human design and current paradigms will determine the course of AGI development: whether it goes well is fully and completely up to us.
The other half assume that the kinds of AGI which survive will be the kinds which evolve to survive. Instrumental convergence and Darwinism generally point here.
It could be worth someone doing a meta-post: grouping big popular alignment posts they've seen by which assumption they make, then briefly exploring the conditions that favor one paradigm or the other, i.e., the conditions under which "What AIs will humans make?" is the best approach to prediction and those under which "What AIs will survive the most?" is the best approach to prediction.
Why not both?
Human design will determine the course of AGI development, and if we do the right things, then whether it goes well is fully and completely up to us. Naturally, at the moment we don't know what the right things are or even how to find them.
If we don't do the right things (as seems likely), then the kinds of AGI which survive will be the kinds which evolve to survive. That's still largely up to us at first, but increasingly less so over time.
Figuring out how to make sense of both predictive lenses together—human design and selection pressure—would be wise.
So I generally agree, but would maybe go farther on your human design point. It seems to me that "do[ing] the right things" (which would make AGI trajectories completely up to us) is so unrealistic (e.g., halting all intra- and international AGI competition) that it'd be better for us to focus our attention on futures where human design and selection pressures interact.