Thanks!
Is this why you put the probability as “10-20% chance of alignment by this path, assuming that the unsupervised system does end up with a simple embedding of human values”? Or have you updated your probabilities since writing this post?
Yup, this is basically where that probability came from. It still feels about right.