This is a pretty interesting idea. I can imagine this being part of a safety-in-depth approach: not a single method we would rely on but one of many fail-safes along with sandboxing and actually trying to directly address alignment.
This is a pretty interesting idea. I can imagine this being part of a safety-in-depth approach: not a single method we would rely on but one of many fail-safes along with sandboxing and actually trying to directly address alignment.