My current thinking is that relying on the CoT staying legible because it's English, and hoping that the (racing) labs will not drop human language once it becomes economically convenient to do so, are hopes that should be destroyed as quickly as possible. (This is not a confident opinion; it originates from 15 minutes of vague thoughts.)
To be clear, I don’t think it is right in general to say “doing the right thing is hopeless because no one else is doing it”; I typically prefer to “do the thing that, if everyone did it, would make the world better”. My intuition is that it makes sense to try to coordinate on bottlenecks like introducing compute governance and limiting FLOPs, but not on a specific incremental improvement of AI techniques, because the people thinking things like “I will restrain myself from using this specific AI sub-technique because it increases x-risk” are not coordinated enough to self-coordinate at that level of detail, and are not powerful enough to have an influence through small changes.
(Again, I am not confident; I can imagine paths where I’m wrong, but haven’t worked through them.)
(Conflict of interest disclosure: I collaborate with people who started developing this kind of stuff before Meta.)
Over the course of a few months, the functionality I wanted was progressively added to chatbox, so I’m content with that.