It’s fascinating how my default posture relative to the veracity or lack thereof of LLM responses has changed over time and how different it is depending on my own expertise on a subject.
In 2023 and 2024 I developed a sort of cognitive muscle memory that led me to second-guess almost any LLM output that I didn’t have enough expertise to verify myself.
At some point in 2025 that shifted, especially as hallucination rates noticeably declined and my use of LLMs evolved from tinkering with their capabilities to productive utility. Specifically, as I use LLMs more and more for tasks in which I have little expertise, I find myself checking their sources much less frequently and trusting their output unless something has a sort of smell that I can only describe as intuition derived from extended use.
Even in somewhat high-stakes situations, I take more of a mixture-of-experts model ensemble approach by asking several different models the same question rather than independently verifying information.

It reminds me of the early days of Wikipedia when it took time for me to gain confidence that what was there was real, eventually shifting from frequently double-checking citations to assuming that most of the time it’s correct.
(edit to correct mischaracterization of MoE)
Despite what the word seems to suggest, MoE doesn’t actually work that way (“experts” are just small parts of one layer in a multi-layer transformer; the term predates deep learning by a couple of decades so you can’t really blame its authors).
A better wording would be LLM ensembling, as in https://en.wikipedia.org/wiki/Ensemble_learning
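The distinction can be sketched in a few lines of toy Python (all names here are made up for illustration, and both functions are deliberate caricatures — real MoE layers use learned top-k routing over MLP experts inside each transformer block):

```python
import math
import random

random.seed(0)

def matvec(m, v):
    # multiply a matrix (list of rows) by a vector
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Mixture of experts: the "experts" are sub-networks inside ONE layer of one
# model. A gating function scores each expert for the current input and the
# layer's output is a weighted mix of them -- still a single forward pass.
def moe_layer(x, experts, gate):
    weights = softmax(matvec(gate, x))        # one gating score per expert
    outs = [matvec(e, x) for e in experts]
    return [sum(w * o[i] for w, o in zip(weights, outs))
            for i in range(len(x))]

# Ensembling: several INDEPENDENT models each produce a complete answer,
# and the answers are combined afterwards (here: plain averaging).
def ensemble(x, models):
    outs = [matvec(m, x) for m in models]
    return [sum(o[i] for o in outs) / len(outs) for i in range(len(x))]

d, n = 4, 3
rand_mat = lambda rows, cols: [[random.gauss(0, 1) for _ in range(cols)]
                               for _ in range(rows)]
x = [random.gauss(0, 1) for _ in range(d)]
experts = [rand_mat(d, d) for _ in range(n)]  # pieces of one layer
gate = rand_mat(n, d)                         # routing weights for that layer
models = [rand_mat(d, d) for _ in range(n)]   # n entirely separate models

moe_out = moe_layer(x, experts, gate)
ens_out = ensemble(x, models)
```

In the MoE sketch, one model blends internal experts per input inside a single layer; in the ensemble, n whole models each finish their own answer and the answers are merged afterwards — which is what asking several LLMs the same question amounts to.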
Thank you! I’ll be more precise when describing that approach in the future.