Alek Westover comments on The Case for Mixed Deployment

Alek Westover 12 Sep 2025 18:33 UTC
3 points
1
- Different labs—Models from different labs (Anthropic, OpenAI, DeepMind, Meta, xAI) likely have less correlated failure modes than models from the same lab.
Cross lab monitoring seems like a great idea. As an implementation detail, I think labs should do cross-lab monitoring with AI models that have instructions to not aid the other lab with AI R&D, synthetic data generation, or anything besides monitoring. Of course, if you thought that Gemini was more likely to be aligned than Claude, then you might prefer that Anthropic use Gemini for automating AI R&D, but this is almost certainly infeasible. One other complication worth noting with cross lab monitoring is that the AI models from different labs will probably have different intelligence levels, and different skill profiles. Thus, cross-lab monitoring may be giving us a model which is in between trusted monitoring and untrusted monitoring both in terms of intelligence and “trustedness”. This means might make untrusted monitoring preferable to cross-lab monitoring in some cases (e.g., if Claude is much smarter than all other models), but cross-lab monitoring still seems useful as a tool to add to our monitoring system.
- Cleo Nardo 30 Sep 2025 21:03 UTC
  2 points
  0
  Parent
  I would be very pleased if Anthropic and Openai maintained their collaboration.^[1]^[2]
  Even better: this arrangement could be formalised in a third-party (e.g. government org or non-profit) and other labs could be encouraged/obliged to join.
  1. ^
    OpenAI’s findings from an alignment evaluation of Anthropic models
  2. ^
    Anthropic’s findings from an alignment evaluation of of OpenAI’s models