[Question] Which AI Safety techniques will be ineffective against diffusion models?

Google showed off Gemini Diffusion at their event yesterday, and while I do not have access to the model just yet, based on what I have seen so far, it seems impressive.

I have been interested in AI Safety for a while and have been trying to keep up with the literature in this space.

According to OpenAI’s March 2025 blog post on detecting misbehavior in models:

"We believe that CoT monitoring may be one of few tools we will have to oversee superhuman models of the future."

Can we extend chain-of-thought (CoT) monitoring to diffusion-based models as well?
Or do you think the intermediate denoising states will be too incoherent to monitor?
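To make the question concrete, here is a minimal sketch of what such monitoring might look like. Everything here is hypothetical: `diffusion_model` (with `init_noise`, `denoise_step`, and `decode`) and `monitor` (with `misbehavior_score`) are illustrative stand-ins, not the real Gemini Diffusion API or any existing library.

```python
# Hypothetical sketch: CoT-style monitoring applied to a text diffusion model.
# All objects and methods below are illustrative placeholders.

def generate_with_monitoring(diffusion_model, monitor, prompt,
                             num_steps=64, flag_threshold=0.8):
    """Run the reverse diffusion process, decoding and checking each intermediate state."""
    state = diffusion_model.init_noise(prompt)  # fully noised token sequence
    flagged_steps = []

    for step in range(num_steps):
        state = diffusion_model.denoise_step(state, step)  # one reverse-diffusion update

        # Decode the partially denoised sequence to text. Early steps may be
        # mostly gibberish, which is exactly the open question: is there enough
        # coherent signal here for a monitor to catch misbehavior?
        partial_text = diffusion_model.decode(state)

        score = monitor.misbehavior_score(prompt, partial_text)
        if score > flag_threshold:
            flagged_steps.append((step, score, partial_text))

    final_text = diffusion_model.decode(state)
    return final_text, flagged_steps
```

The obvious difference from autoregressive CoT monitoring is that the intermediate text here is a globally refined draft rather than a left-to-right trace of reasoning, so it is not clear the monitor would ever see anything analogous to a "thought".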



This is my first LW post, and I do not have a lot of hands-on experience with safety techniques, so forgive me if my question is poorly framed.
