The AI alignment problem does not look to us like it is fundamentally unsolvable.
I wonder what the basis for this belief is. Rice's theorem says that no general algorithm can decide any non-trivial semantic property of arbitrary programs, which suggests that, in the general case, the only way to find out what a program does is to actually run it.
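For anyone who wants the intuition behind Rice's theorem, here is a rough sketch of the usual reduction from the halting problem. All the names (`make_gadget`, `halts_via`, `decides_property`, `witness`) are hypothetical and only for illustration. It assumes the property in question holds for `witness` and does not hold for a program that never halts.

```python
from typing import Callable

Program = Callable[[int], int]


def make_gadget(program: Program, inp: int, witness: Program) -> Program:
    """Build a program that behaves like `witness` only if program(inp) halts."""
    def gadget(x: int) -> int:
        program(inp)       # may loop forever; the result is discarded
        return witness(x)  # reached only if the call above halted
    return gadget


def halts_via(
    decides_property: Callable[[Program], bool],  # hypothetical decider
    program: Program,
    inp: int,
    witness: Program,
) -> bool:
    """If `decides_property` were a real decider, this would decide halting.

    The gadget has the property exactly when program(inp) halts, so the
    decider's answer would settle the halting question.
    """
    return decides_property(make_gadget(program, inp, witness))
```

Since no algorithm decides halting, no such `decides_property` can exist for a non-trivial semantic property. Note this only rules out a decider that works for every program; it says nothing by itself about analysing particular programs.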