Dusto comments on We should start looking for scheming “in the wild”

Dusto 9 Mar 2025 0:45 UTC
One side consideration: in relation to your ongoing work on model self-awareness of eval settings, is there any indication that models deployed as part of a service, business, etc. have their behaviour logged as a matter of course for quality assurance? And do the models have any understanding or expectation that this is occurring?