[Question] Would more model evals teams be good?

If multiple model evaluation organizations similar to the ARC evals team emerge and compete for big tech clients, does this drive a “race to the bottom,” where more lenient evals teams capture larger market share? I’m imagining a scenario in which there are government restrictions on deployment/​excess training of unevaluated models. Big tech companies would have to receive a certificate from a government-approved evaluator before training their model past a certain point, training it with certain “dangerous” methods (e.g., RLHF without adequate regularizers), or deploying it off the training distribution.

In an alternative scenario, big tech companies are motivated to seek “penetration testing” from evals teams because this prevents deployment failures that tank stocks. Or, more optimistically, perhaps the big tech companies take the risk of emergent misaligned power-seeking behavior seriously and genuinely want to avoid (apparent) tail risks. In this scenario, perhaps multiple evals teams compete to be the best at finding bugs in models. However, if the most easily detectable bugs are not the ones critical to MAPS (misaligned power-seeking), failures from which could be rare and catastrophic, perhaps this competitive environment also entails a “race to the bottom” in the quality of model evaluations?