My low-confidence guess is that the existence of jailbreaks is a net increase in power concentration:
They make it more appealing to restrict and monitor external model access (my understanding is that OpenAI would be much happier to offer broad ZDR access and to publicly release its models as soon as they became available for internal use if its models were impossible to jailbreak)
They make it harder to achieve a clean spec-to-model match, which in turn makes it harder to exercise democratic control over model behavior via specs, and harder to surface secret loyalties through something like a secret-loyalty red-teaming bug bounty
I think these effects outweigh the small potential gains from AI companies giving up on robustly preventing certain kinds of low-impact misuse. (Maybe I am missing some other effects?)