It’s an argument for why aligning a self-modifying superintelligence requires more than aligning the base LLM. I don’t think it’s impossible, just that there’s another step we need to think through carefully.