Correlation may imply some sort of causal link.
For guessing its direction, simple models help you think.
Controlled experiments, if they are well beyond the brink
Of .05 significance will make your unknowns shrink.
Replications show there’s something new under the sun.
Did one cause the other? Did the other cause the one?
Are they both controlled by what has already begun?
Or was it their coincidence that caused it to be done?
Whereof one cannot speak, thereof one must be silent.
- Ludwig Wittgenstein
LLMs hallucinate because they are not trained on silence. Almost all text ever produced is humans saying what they know. The normal human reaction to noticing our own confusion is to shut up and study, or to distance ourselves from the topic entirely. We either learn math or go through life saying “I hate math.” We aren’t forced to answer other people’s math questions, come what may.
We casually accept catastrophic forgetting in ourselves, and we apply the massive parameter count of the human brain to only a tiny fraction of the total knowledge base we expect LLMs to master.
The record of how humans react to their own uncertainty is not in the training data.
Reasoning models attempt to monitor their own epistemic state. But because they have to return answers quickly and cannot modify their own weights, they face serious barriers to achieving human reliability. They don’t know what it feels like to not know something, or how to design a productive program of research and study.
Despite all this, what they can do now is extremely impressive, and I’m glad to have access to them at their current level of capability.
They probably do not make me more productive. They can be misleading, and they also enable me to dig into topics and projects where I’m less familiar and thus more vulnerable to being misled. They enable me to explore interesting side projects and topics, which takes time away from my main work.
They make me less tolerant of inadequacy, both because they point out flaws in my code or reasoning and because they incline me toward perfectionism rather than toward constraining project scope to fit my capabilities and time budget. They will gold-plate a bad idea. But I’ve never had an LLM suggest I take a step back and look for an easier solution than the one I’m considering.
They’ll casually suggest writing complex, custom-built solutions to problems, but they never propose cutting a feature because it would be too complicated to execute, unless I specifically ask. And even then I never feel like I’m getting “independent judgment,” just sycophancy.
They mostly rely on my descriptions of my work and ideas. They can observe my level of progress, but not my rate of progress. They see only the context I give them, and I’m too lazy to keep them fully updated. They are also far more available than my human advisors. As such, the LLM’s role in my life is to encourage and enable scope creep, while my advisors are the brakes.