What kind of legible architecture would be enough to give me optimism? The most bare-bones would be interpretability into the beliefs and desires of an AI, and the structural knowledge to verify that those beliefs and desires are the true beliefs and desires of the AI.
Yes — exactly. What makes you think people aren’t already working on it?