Lao Mein | Statistics is Hard. | Patreon
This reminds me that:

- Models like DeepSeek R1 will vehemently deny that their CoT is visible to the user. This can be observed consistently.
- This is obviously false, and should be deducible from the AI-related information in their training data.
- This degree of denial of CoT visibility isn't seen in other LLMs.
- This implies extensive fine-tuning on the CoT, the "forbidden technique".
- DeepSeek R1 is almost certainly under optimization pressure to obfuscate its CoT.
This is… pretty bad from a safety perspective? Not just for DeepSeek's models, but for all other models, given that DeepSeek CoT outputs are all over the internet.