With alignment, even biological brains don’t have to be competitive.
I would agree with this, but only if by “alignment” you also include the risk of misuse, which I don’t generally consider the same problem as “make this machine aligned with the interests of the person controlling it”.
Ultimately, alignment is whatever makes turning on an AI a good idea rather than a bad idea. Some pivotal processes need to ensure enough coordination to avoid unilateral initiation of catastrophes. Banning the manufacture of GPUs (or the sale of enriched uranium at a mall) is an example of such a process that doesn’t even need AIs to do the work.
Ultimately, alignment is whatever makes turning on an AI a good idea rather than a bad idea.
This is pithy, but I don’t think it’s a definition of alignment that points at a real property of an agent (as opposed to a property of the entire universe, including the agent).
If we have an AI which controls which train goes on which track, and which can detect where all the trains in its network are but not whether there is anything on the tracks, then whether this AI is “aligned” shouldn’t depend on whether anyone happens to be on the tracks (which, again, the AI can’t even detect).
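To make that point concrete, here is a minimal Python sketch (all names here are hypothetical illustrations, not anything from the comment): the router’s policy is a pure function of its observations, and track occupancy is not among its inputs, so its behaviour is identical across worlds that differ only in that unobserved fact.

```python
# Hypothetical sketch: an agent's policy depends only on what it can observe,
# so any property defined over the agent alone cannot vary with unobserved facts.

from dataclasses import dataclass

@dataclass(frozen=True)
class Observation:
    # The only thing the router can see: which track each train is on.
    train_positions: tuple  # e.g. (("train_A", "track_1"),)

def routing_policy(obs: Observation) -> dict:
    """Toy policy: pick a switch action per train. It has no input for
    track occupancy, so its output cannot depend on it."""
    return {train: f"switch_from_{track}" for train, track in obs.train_positions}

# Two world states that differ only in a fact the agent cannot observe.
obs = Observation(train_positions=(("train_A", "track_1"),))
world_with_person = {"obs": obs, "person_on_track_1": True}
world_without_person = {"obs": obs, "person_on_track_1": False}

# The agent behaves identically in both worlds...
assert routing_policy(world_with_person["obs"]) == routing_policy(world_without_person["obs"])
# ...so a definition of "aligned" that flips between these worlds is describing
# the world (or the outcome), not a property of the agent itself.
```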
The “things should be better instead of worse” problem is real and important, but it is much larger than anything that can reasonably be described as “the alignment problem”.
The non-generic part is “turning an AI on”, and “good idea” is an epistemic consideration on the part of the designers, not a reference to the actual outcome in reality.
I heard that sentence attributed to Yudkowsky on some podcasts, and it makes sense as an umbrella desideratum after the pivot from shared-extrapolated-values AI to pivotal-act AI (as described on Arbital), since with that goal there doesn’t appear to be a more specific short summary anymore. In the context of that sentence as I heard it, there is discussion of humans with augmented intelligence, so pivotal-act (specialized tool) AI is more centrally back on the table (even as it still seems prudent to plan for long AGI timelines in our world as it is, to avoid abandoning that possibility only to arrive at it unprepared).