Related: you can hold different beliefs about AIs of different scales. If you see deception capabilities increasing with scale, that should make you suspicious of larger models even when the largest ones appear worse at deception, since a sufficiently capable model could be hiding its abilities. By contrast, it would be more reasonable to take “lack of deceptive capabilities” at face value in large models if you saw the same lack in models of all smaller sizes. (That inference fails if deceptive capabilities develop discontinuously, so that there’s a jump from nothing to fully capable of fooling you.)