Long term, trying to target the particular misaligned action won’t work.
What could actually work then? I guess that it would require OpenAI or other AGI/ASI companies to rethink the training paradigm? And what could be the new paradigm then?
What could actually work then? I guess that it would require OpenAI or other AGI/ASI companies to rethink the training paradigm? And what could be the new paradigm then?