Mis-Understandings comments on Ada Palmer: Inventing the Renaissance

Mis-Understandings 26 Jan 2026 17:53 UTC
46 points
8
The big problem with pre-x LLMs, especially ones reaching into the deep past, is sampling bias. That is to say, the set of texts that survive is not the set of texts that existed at the time, which adds an extra bias on top of the already biased data collection for regular LLMs. There is also the temptation to support the model (especially when placing an LLM in a strategic role) with analysis and data from after the period, which were collected with a hindsight bias toward what is interesting.
- Dmitry Vaintrob 28 Jan 2026 23:59 UTC
  9 points
  4
  Arguably the same is true of modern LLMs. Even a base model is not a “generic person” but a “generic text”. The model ranke-4b is also fine-tuned (at least on question formats and to stay in character), so it’s a reconstructed version.
  The base-model is an unpolished diamond: it is full of raw potential, but extracting its knowledge is not always an effortless undertaking since it does not respond to questions in a chat-formatted manner.
- Knight Lee 14 Feb 2026 22:24 UTC
  2 points
  0
  I agree: training pre-x LLMs would have major costs and weaknesses without fully removing bias. If we’re going to use LLMs, it feels much easier to simulate future events than past events.
  Though I guess safety training makes it hard for them to reason from the point of view of Putin.