Interesting method! Added to my collection of LLM ancestry detection methods. Here are the other methods I have collected.
https://www.lesswrong.com/posts/cGcwQDKAKbQ68BGuR LLMs of the same model can be fine-tuned via random text
https://www.dbreunig.com/2025/05/30/using-slop-forensics-to-determine-model-ancestry.html LLMs of similar ancestry produce slop words at similar frequencies
https://fi-le.net/oss/ Using known glitch tokens to identify LLMs/encoders
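
To give a rough feel for the slop-frequency idea, here is a minimal sketch that compares how often two text samples use a fixed set of words, scored with cosine similarity. The word list and the function names are made up for illustration and are not taken from the linked post, whose actual method may well differ.

```python
import math
import re
from collections import Counter

# Hypothetical slop-word list, chosen only for illustration.
SLOP_WORDS = ["delve", "tapestry", "testament", "intricate", "vibrant"]

def slop_profile(text):
    """Return the rate of each slop word in `text` (count / total words)."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = max(len(words), 1)  # avoid division by zero on empty input
    return [counts[w] / total for w in SLOP_WORDS]

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

In practice you would build one profile per model from many sampled outputs and compare them pairwise; a higher score only hints at shared ancestry and is not proof of it.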