Concretely, my best guess is that you need inner alignment, since failure of inner alignment probably produces random goals, which means that multiple inner-misaligned AIs are unlikely to share goals.
I disagree with this. I don’t expect a failure of inner alignment to produce random goals; rather, I expect it to systematically produce goals that are simpler or faster proxies for what we actually want. That is to say, while such goals would look random to us, I don’t expect them to differ much between training runs, since in my view they’re determined more by the training process’s inductive biases than by any inherent randomness in training.