mnarayan
Request for Information on synthetic data evaluations
Causal effect predictiveness of a surrogate asks whether the effects of an intervention on S, the surrogate, are predictive of the effects of that intervention on the gold-standard outcome. I’m still looking for a causal effect predictiveness angle on making decisions with synthetic data.
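To make the idea concrete, here is a minimal simulation sketch of the trial-level (meta-analytic) version of the question: across settings where the intervention's effect on the surrogate varies, does the effect on S track the effect on the true outcome? All parameter values and the linear data-generating process are illustrative assumptions, not from any cited source.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_trial(n=500, gamma=1.0):
    """One randomized trial: treatment Z shifts the surrogate S by gamma,
    and Y depends on S plus a small direct effect of Z (assumed values)."""
    z = rng.integers(0, 2, n)                     # randomized treatment
    s = gamma * z + rng.normal(0, 1, n)           # effect on surrogate
    y = 0.8 * s + 0.3 * z + rng.normal(0, 1, n)   # gold-standard outcome
    eff_s = s[z == 1].mean() - s[z == 0].mean()   # effect on S
    eff_y = y[z == 1].mean() - y[z == 0].mean()   # effect on Y
    return eff_s, eff_y

# Across trials with different surrogate effects, check whether the
# intervention's effect on S predicts its effect on Y.
effects = np.array([simulate_trial(gamma=g) for g in np.linspace(0.2, 2.0, 20)])
r = np.corrcoef(effects[:, 0], effects[:, 1])[0, 1]
print(f"trial-level correlation of effects: {r:.2f}")
```

Under this data-generating process the surrogate is predictive by construction; the interesting failure mode is when a direct Z→Y path dominates, so the effect on S no longer tracks the effect on Y.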
So far, I’ve found two interesting commentaries, one in the Lancet and the other from the FDA on evaluating “synthetic datasets”.
Lancet article
“Although synthetic data address crucial shortages of real-world training data, their overuse might propagate biases, accelerate model degradation, and compromise generalisability across populations. A concerning consequence of the rapid adoption of synthetic data in medical AI is the emergence of synthetic trust—an unwarranted confidence in models trained on artificially generated datasets that fail to preserve clinical validity or demographic realities.”
Suchi Saria says
I agree simulation can help when we are starting with real data and then simulating alternative scenarios (eg additional missingness, different order protocols) for which there is realistic support (often hard for people w/o stats/causal inf background to deduce) but majority of efforts around simulation data that I see are mostly an engineering exercise in generating data without understanding whether the inferences we are drawing from it are valid.
Analytical Validation of Biomarkers is Not the Full Story
mnarayan’s Shortform
Clozapine has been a highly underutilized drug for decades https://psychiatryonline.org/doi/full/10.1176/appi.ps.201700162
More than 40 years ago, when researchers were investigating clozapine’s mechanism of action, they already knew that some of its benefits might involve muscarinic receptors. In fact, some of these investigations were done at Pfizer! https://paperpile.com/shared/shUc~GTZ0T3qFvQ84LykSmA
So the story that Cobenfy has a brand-new mechanism of action isn’t quite right, but it is the first approved drug to have deliberately targeted muscarinic receptors.
Cross-posted note: https://substack.com/@manjarinarayan/note/c-238677736
Hermeneutic Uncertainty Anyone? https://academic.oup.com/rssdat/pages/call-for-papers-uncertainty-in-the-era-of-ai
The idea that we can just put error bars on the outputs of reasoning models, agentic workflows, etc., based on earlier eras of statistical thinking is breaking down. The data-generating mechanisms underlying AI outputs and the downstream decision-making processes they feed are massively non-i.i.d. and perpetually operating in a loop. I don’t know whether “hermeneutic uncertainty” covers the new ground entirely, nor does the performative prediction literature [1].
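The feedback-loop point can be illustrated with a toy version of performative prediction (in the spirit of the literature cited in [1]): a deployed prediction shifts the distribution it was trained on, so naive retraining converges to a performatively stable point rather than the original target. The linear response model and all parameter values here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy performative-prediction setup: the outcome distribution responds
# to the deployed prediction theta (assumed linear response):
#   Y ~ Normal(mu + eps * theta, 1)
mu, eps = 1.0, 0.5

def retrain(theta, n=100_000):
    """Fit the squared-error-optimal predictor (the sample mean) on data
    drawn from the distribution induced by the currently deployed theta."""
    y = rng.normal(mu + eps * theta, 1.0, n)
    return y.mean()

# Repeated retraining in the loop settles at theta = mu / (1 - eps),
# not at the original mean mu: the i.i.d. picture no longer applies.
theta = 0.0
for _ in range(30):
    theta = retrain(theta)
print(f"converged theta: {theta:.2f}; performatively stable point: {mu / (1 - eps):.2f}")
```

The gap between mu (1.0) and the stable point (2.0) is exactly the kind of loop-induced distortion that classical error bars, computed as if the data were exogenous, would miss.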
Top machine learning and statistical learning researchers who think about rigorously drawing inferences about the world have issued a call for papers on new ways of statistical thinking in the agentic era.
High-dimensional models operating on massive non-i.i.d. datasets strain traditional techniques, generative objectives prioritize plausibility over correctness, and interactive and multi-agent systems compound uncertainty across sequential dialogues and multiple agents. Yet, the challenge is not solely mathematical. As AI integrates into high-stakes professional domains such as healthcare, law, and education, experts face forms of uncertainty that resist reduction to percentages. A legal professional assessing subtle aspects of human actions, or a teacher evaluating the cultural meaning of a literary interpretation, is navigating hermeneutic uncertainty, where the challenge is contested interpretation. When AI systems force these nuanced judgments into probability scores, they risk atrophying the very professional expertise they are meant to augment.
New Taxonomies of Uncertainty: Developing new definitions of uncertainty adapted to the landscape of modern AI. These include definitions that move beyond the standard statistical distinctions to encompass ethical, cultural, and contextual uncertainties central to professional judgment.
Epistemic vs. Hermeneutic Uncertainty: Distinguishing between “what we don’t know” (epistemic) and “what is open to interpretation” (hermeneutic). How can AI systems signal the latter without falsely quantifying it?
Methodological Innovations: Novel methods for quantifying uncertainty in generative models trained on near-universal datasets, including metrics for semantic uncertainty, and frameworks for tracking reliability across interactive and multi-agent systems.
Visualization & Uncertainty Communication: Moving beyond standard confidence intervals through innovations in visual analytics. We seek designs that help users navigate high-dimensional spaces, signal not only statistical uncertainty but also when outputs are technically sound yet open to interpretation, and link uncertainty to downstream decisions.
Professional Practice & Sense-Making: Participatory frameworks and empirical studies evaluating how uncertainty communication impacts the judgment, accuracy, and agency of human experts in collaborative workflows.
[1] Performative predictions. https://paperpile.com/shared/snFoVQ3GpS_6WTl5Pq5VH1w