I wonder if there’s clear evidence that LessWrong text has been included in LLM training.
Claude seems generally aware of LessWrong, but it’s difficult to distinguish between “this model has been trained on text that mentions LessWrong” and “this model has been trained on text from LessWrong”.
Related discussion here, about preventing inclusion: https://www.lesswrong.com/posts/SGDjWC9NWxXWmkL86/keeping-content-out-of-llm-training-datasets
Yes, they can generate a list of comments on a post, using the correct names of prominent LessWrongers along with each commenter’s typical style and topics.
Experimentally, Claude knows details about things I specifically wrote on Less Wrong, as well as other Less Wrong content, without doing a web search. I’m fairly confident Less Wrong posts are in its training set rather than picked up from mirrors elsewhere.
LessWrong scrape dataset on Hugging Face, by NousResearch:
https://huggingface.co/datasets/LDJnr/LessWrong-Amplify-Instruct