Covid was a big learning experience for me, but I’d like to think about more than one example. Covid is interesting because, compared to my examples of birth control and animal-free meat, it seems like with covid humanity smashed the technical problem out of the park, but still overall failed by my lights because of the political situation.
How likely does it seem that we could get full marks on solving alignment but still fail due to politics? I tend to think of building a properly aligned AGI as a straightforward win condition, but that’s not a very deeply considered view. I guess we could solve it on a whiteboard somewhere, but for political reasons it doesn’t get implemented in time?
I think this is a real possibility, and if we remove existential risk from the equation, it’s a somewhat probable scenario: we basically have solved alignment, and yet AI governance craps out in various ways.
I think this primarily because I believe value alignment is really easy, much easier than LWers generally think: most of the complexity of value learning can be offloaded to the general learning process, with only very weak priors required.
Putting it another way, I basically disagree with the implicit premise on LW that being capable of learning is easier than being aligned to values; at most, the two are comparably difficult, or alignment is a little more difficult.
More generally, I think it’s far easier to be aligned with, say, not killing humans than to actually have non-trivial capabilities, at least at a given level of compute, and especially at the lower end of compute.
In essence, I believe there are simple tricks to aligning AIs, while I see no reason to expect a simple trick that makes governments competent at regulating AI.