Chris_Leong comments on Alignment By Default

Chris_Leong 22 Mar 2021 4:45 UTC
LW: 4 AF: 2
Thanks!

Is this why you put the probability at “10-20% chance of alignment by this path, assuming that the unsupervised system does end up with a simple embedding of human values”? Or have you updated your probabilities since writing this post?
- johnswentworth 22 Mar 2021 5:37 UTC
  LW: 6 AF: 4
  Yup, this is basically where that probability came from. It still feels about right.