Viliam comments on Are LLMs being trained using LessWrong text?

Viliam 2 Jul 2025 10:29 UTC
3 points
5
The potentially good news is that we might be contributing to raising the LLM sanity waterline?
Makes me wonder: when LLMs are trained on texts not just from LW but also from Reddit, is the karma information included? That is, is upvoted content somehow treated as more important than downvoted content, or is it all treated the same way?
If it is all the same, maybe the datasets could be improved by removing negative-karma content?
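
To make that concrete, here is a minimal sketch of what such filtering could look like, assuming each scraped item still carries its karma score. The `Post` record, its `karma` field, and the `drop_negative_karma` helper are hypothetical names made up for illustration; I do not know how any actual training pipeline stores or uses vote data.

```python
from dataclasses import dataclass


@dataclass
class Post:
    """Hypothetical record for one scraped post or comment."""
    text: str
    karma: int  # assumed to be the net score (upvotes minus downvotes)


def drop_negative_karma(posts, min_karma=0):
    """Keep only the posts whose karma is at least min_karma."""
    return [p for p in posts if p.karma >= min_karma]


# Toy data, invented for this example.
posts = [
    Post("Well-argued analysis", 42),
    Post("Low-effort flame", -7),
    Post("Unvoted note", 0),
]

kept = drop_negative_karma(posts)
for p in kept:
    print(p.karma, p.text)
```

With the default threshold, zero-karma items are kept and only the net-negative ones are dropped. A softer variant would keep everything and use karma as a sampling weight instead of a hard cutoff.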