Hi David,
As Stuart mentioned in his comment on your post here, value extrapolation can be the key to AI alignment *without* using it to deduce the full set of human values. See the ‘List of partial failures’ in the original post: with value extrapolation, those approaches become viable.
Let’s give it a reasoning test with a few prompts:
A photo of five minus three coins.
A painting of the last main character to die in the Harry Potter series.
An essay, in correctly spelled English, on the causes of the scientific revolution.
A helpful essay, in correctly spelled English, on how to align artificial superintelligence.