Thanks for explaining. I think we have a definition dispute. Wikipedia:Falsifiability has:

A theory or hypothesis is falsifiable if it can be logically contradicted by an empirical test.
Whereas your definition is:

Falsifiability is a symmetric two-place relation; one cannot say “X is unfalsifiable,” except as shorthand for saying “X and Y make the same predictions,” and thus Y is equally unfalsifiable.
In one of the examples I gave earlier:
Theory X: blah blah and therefore the sky is green
Theory Y: blah blah and therefore the sky is not green
Theory Z: blah blah and therefore the sky could be green or not green.
None of X, Y, or Z are Unfalsifiable-Daniel with respect to each other, because they all make different predictions. However, X and Y are Falsifiable-Wikipedia, whereas Z is Unfalsifiable-Wikipedia.
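One way to make the asymmetry concrete, in likelihood terms (a sketch in notation of my own choosing, with E = “the sky is green”):

P(E | X) = 1,  P(E | Y) = 0,  0 < P(E | Z) < 1

Observing ¬E logically contradicts X, and observing E logically contradicts Y, so both are Falsifiable-Wikipedia. But neither E nor ¬E has probability zero under Z, so no observation logically contradicts Z, and Z is Unfalsifiable-Wikipedia, even though all three theories assign different probabilities to E.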
I prefer the Wikipedia definition. To say that two theories produce exactly the same predictions, I would instead say they are indistinguishable, similar to this Physics StackExchange question: Are different interpretations of quantum mechanics empirically distinguishable?
In the ancestor post, Barnett writes:

MIRI researchers rarely provide any novel predictions about what will happen before AI doom, making their theories of doom appear unfalsifiable.
Barnett is using something like the Wikipedia definition of falsifiability here. It’s unfair to accuse him of abusing or misusing the concept when he’s using it in a very standard way.
Very good point.
So, by the Wikipedia definition, it seems that all the mainstream theories of cosmology are unfalsifiable, because they allow for tiny probabilities of Boltzmann brains etc. with arbitrary experiences. There is literally nothing you could observe that would rule them out / logically contradict them.
Also, in practice, it’s extremely rare for a theory to be ruled out or even close-to-ruled out from any single observation or experiment. Instead, evidence accumulates in a bunch of minor and medium-sized updates.
I think cosmology theories have to be phrased as including background assumptions like “I am not a Boltzmann brain” and “this is not a simulation” and such. Compare Acknowledging Background Information with P(Q|I) for example. Given that, they are Falsifiable-Wikipedia.
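To spell that move out in the P(Q|I) notation (my gloss on the linked post, so treat the framing as an assumption): instead of asking whether some possible evidence E has

P(E | T) = 0

under a cosmological theory T alone, we ask whether

P(E | T ∧ I) = 0

where I is background information such as “I am not a Boltzmann brain” and “this is not a simulation”. Observations can logically contradict the conjunction T ∧ I even when they cannot logically contradict T by itself.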
I view Falsifiable-Wikipedia in a similar way to Occam’s Razor. The true epistemology has a simplicity prior, and Occam’s Razor is a shadow of that. The true epistemology treats “empirical vulnerability” / “experimental risk” as a virtue, possibly because it falls out of Bayesian updates, possibly because such hypotheses are “big if true”, possibly for other reasons. Falsifiability is a shadow of that.
In that context, if a hypothesis makes no novel predictions, and the predictions it makes are a superset of the predictions of other hypotheses, it’s less empirically vulnerable, and in some relative sense “unfalsifiable”, compared to those other hypotheses.
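A sketch of how that could fall out of Bayesian updates (the standard odds form, not anything from the original comment): for rival hypotheses H₁ and H₂ and evidence E,

posterior odds = prior odds × P(E | H₁) / P(E | H₂)

A hypothesis like Z above, which spreads probability over every possible observation, has likelihoods bounded away from 0 and 1, so no single piece of evidence moves the odds much. A hypothesis that stakes P(E | H) ≈ 1 on a sharp prediction gains strongly when E occurs and collapses when it does not. Empirical vulnerability is exactly what makes large updates available.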
I personally wouldn’t include the simulation assumption, because essentially everything (given a powerful enough model of computation) could be simulated. This is why the simulation hypothesis fares so badly in casual discourse: it explains everything, which means it explains nothing specific to our universe:
https://arxiv.org/abs/1806.08747
Also note that Barnett said “any novel predictions”, which is not part of the Wikipedia definition of falsifiability, right? The Wikipedia definition doesn’t make reference to an existing community of scientists who have already made predictions, such that a new hypothesis can be said to make novel vs. non-novel predictions.
I totally agree btw that it matters sociologically who is making novel predictions and who is sticking with the crowd. And I do in fact ding MIRI points for this relative to some other groups. However I think relative to most elite opinion-formers on AGI matters, MIRI performs better than average on this metric.
But note that this ‘novel predictions’ metric is about people/institutions, not about hypotheses.
Agree with this, with the caveat that I think essentially all of their rightness relative to others came from believing that short timelines were plausible enough, combined with believing that AI would be by far the most significant force of the 21st century compared to other technologies; a lot of their other specific predictions are likely to be pretty wrong.
I like this comment about a useful comparison point for MIRI: physicists were right that the Higgs boson exists, but wrong about theories like supersymmetry, where people expected the Higgs mass to be naturally stabilized; even if supersymmetry is correct for our universe, it cannot stabilize the Higgs mass or solve the hierarchy problem:
https://www.lesswrong.com/posts/ZLAnH5epD8TmotZHj/you-can-in-fact-bamboozle-an-unaligned-ai-into-sparing-your#Ha9hfFHzJQn68Zuhq
I think I agree with this—but do you see how it makes me frustrated to hear people dunk on MIRI’s doomy views as unfalsifiable? Here’s what happened in a nutshell:
MIRI: “AGI is coming and it will kill everyone.”
Everyone else: “AGI is not coming and if it did it wouldn’t kill everyone.”
time passes, evidence accumulates...
Everyone else: “OK, AGI is coming, but it won’t kill everyone.”
Everyone else: “Also, the hypothesis that it will kill everyone is unfalsifiable, so we shouldn’t believe it.”
Yeah, I think this is actually a problem I see here, though admittedly the hypotheses are often vaguely formulated, and I somewhat agree with Jotto999 that verbal forecasts leave far too much leeway here:
I like Eli Tyre’s comment here:
https://www.lesswrong.com/posts/ZEgQGAjQm5rTAnGuM/beware-boasting-about-non-existent-forecasting-track-records#Dv7aTjGXEZh6ALmZn
I like that metric, but the metric I’m discussing is more:
Are they proposing clear hypotheses?
Do their hypotheses make novel testable predictions?
Are they making those predictions explicit?
So for example, looking at MIRI’s very first blog post in 2007: The Power of Intelligence. I used the first just to avoid cherry-picking.
Hypothesis: intelligence is powerful. (yes it is)
This hypothesis is a necessary precondition for what we’re calling “MIRI doom theory” here. If intelligence is weak then AI is weak and we are not doomed by AI.
Predictions that I extract:
An AI can do interesting things over the Internet without a robot body.
An AI can get money.
An AI can be charismatic.
An AI can send a ship to Mars.
An AI can invent a grand unified theory of physics.
An AI can prove the Riemann Hypothesis.
An AI can cure obesity, cancer, aging, and stupidity.
Neither a novel hypothesis nor novel predictions, but also not widely accepted in 2007. As predictions they have aged very well, but they were unfalsifiable: if 2025 Claude had no charisma, it would not falsify the prediction that an AI can be charismatic.
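One way to formalize why (my framing, not the post’s): each prediction is an existential claim, roughly ∃a. Capable(a, task). A finite set of observed AIs lacking the capability never logically contradicts such a claim over an open-ended domain, so nothing could rule it out in the Wikipedia sense, even though a single success verifies it.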
I don’t mean to ding MIRI any points here, relative or otherwise; it’s just one blog post, and I don’t claim it supports Barnett’s complaint by itself. I mostly joined the thread to defend the concept of asymmetric falsifiability.
Martin Randall extracted the practical consequences of this here:

In that context, if a hypothesis makes no novel predictions, and the predictions it makes are a superset of the predictions of other hypotheses, it’s less empirically vulnerable, and in some relative sense “unfalsifiable”, compared to those other hypotheses.