FWIW, I think jailbreaking is less of a concern than mass surveillance activity simply being indistinguishable from innocuous use: without surrounding context, it could look like ordinary data analysis. Perhaps it could be detected from large-scale usage patterns, but that would be quite different from settings like bio/cyber. And it seems rough for OpenAI’s first real-world attempt at this to be in a classified ZDR setting, with no meaningful contractual recourse if detection or targeted blocking turns out to be harder than you predict.
I am sympathetic to the case that it could still be worth taking the contract to support the government’s use of AI (modulo pushing back more on the SCR designation before doing so), but I don’t agree with presenting the technical challenge as familiar territory.
I agree that there are qualitative similarities, so perhaps we should be quantitative about it. Suppose, for the sake of argument, that the DoW were acting in bad faith and planned to use OpenAI’s services to conduct domestic mass surveillance (legally): how likely do you think it is that OpenAI would be able to prevent this? Given the difficulties I mentioned (indistinguishable from innocuous use, problematic only in aggregate, novel setting, classified, ZDR, no meaningful contractual recourse), reaching even ~50% confidence seems like a big stretch to me, even with considerable effort on OpenAI’s part.
Perhaps you think it’s unlikely that the DoW is acting in bad faith, but if so, it would be good to be clear about whether that is a load-bearing assumption.