I initially wanted to nominate this because I somewhat regularly say things like “I think the problem with that line of thinking is that you’re not handling your model uncertainty in the right way, and I’m not good at explaining it, but Richard Ngo has a post that I think explains it well.” Instead of leaving it at that, I’ll try to give an outline of why I found it so helpful. I didn’t put much thought into how to organize this review, it’s centered very much around my particular difficulties, and I’m still confused about some of this, but hopefully it gets across some of what I got out of it.
This post helped me make sense of a cluster of frustrations I’ve had around my thinking and others’ thinking, especially in domains where things are complex and uncertain. The allure of cutting the world up into clear, distinct, and exhaustive possibilities is strong, but doing so doesn’t always lead to clearer thinking. To give a few examples where I’ve seen this lead people astray (choosing not particularly charitable or typical examples, for simplicity):
The origins of covid-19 are zoonotic or a lab leak
AI research will or will not be automated by 2027
AI progress after time t will be super-exponential or it won’t
All of these could be clearly one or the other. And when pressed, sometimes people will admit “Okay, yes, I suppose it could be <3rd thing> or <4th thing>”. But I get the sense that they’re not leaving much room for “5th thing which I failed to consider, and which is much more like one of these things than the others, but importantly different from all of them”.
Previously, I’d misread this as a habit of overconfidence. For example, people would be talking about whether A or B is true and my thinking would come out like this:
roughly 15% A is clearly true
roughly 5% B is clearly true
roughly 80% some secret third thing, maybe pretty similar to A or B
These small credences mainly came from A and B looking to me like overly-specific possibilities. I would have a bunch of guesses about how the world works, which claims about it are true, etc. And this cashes out as a few specific models, outcomes, and their credences, along with a large pile of residual uncertainty. So when Alice assigns 75% to A, this seems weirdly overconfident to me.
Usually Alice has some sensible model in which A is either true or not true. For example, many reasonable versions of “AI progress can be adequately represented by <metric>, which will increase as <function> of <inputs>” will yield AI progress that is either definitely super-exponential or definitely not. And Alice might even have credences over variations on <metric>, <function>, and <inputs>, as well as a few different flavors of super-exponential functions. She uses this to assign some probability to super-exponential growth and the rest to exponential or sub-exponential growth. I would see this and think “great, this looks rigorous, and it seems useful to think about this in this way”, so we’d argue about it or I’d make my own model or whatever. And this is often productive. But in the end I would still think “okay, great, to the extent the world is like that model (or family of models), it tells us something useful”, but I’d have limited confidence the model matched reality, so it would still only tug lightly on my overall thinking, I would still think Alice was overconfident, and I would feel mildly disappointed I hadn’t been able to make a better model that I actually believed in.
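To make that mixing step concrete, here is a minimal sketch of the kind of calculation I have in mind; the model descriptions, credences, and the labeling of which models are super-exponential are all invented for illustration, not Alice's or anyone's actual numbers.

```python
# Hypothetical illustration of mixing credences over candidate growth models
# into one number. The model descriptions and credences are all invented.

candidate_models = [
    # (description, credence, predicts super-exponential growth?)
    ("compute metric, fixed doubling time",      0.35, False),
    ("compute metric, shrinking doubling time",  0.20, True),
    ("benchmark metric, logistic in inputs",     0.25, False),
    ("benchmark metric, R&D-feedback driven",    0.20, True),
]

total_credence = sum(credence for _, credence, _ in candidate_models)
p_super_exponential = sum(
    credence for _, credence, is_super in candidate_models if is_super
) / total_credence

print(f"P(super-exponential | some model in this family is right) = {p_super_exponential:.2f}")
```

The number that comes out is conditional on one of the enumerated models being roughly right, which is exactly the part I kept being unsure about.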
Part of why this is so disappointing is that it sure feels like I ought to be able to carve up possibilities in a way that allows me to use the rules of probability without having to assign a big lump of probability mass to “I dunno”. First, because figuring out which things could be true is the first step in Bayesian reasoning, plus there’s a sense in which Bayesian reasoning is obviously correct! Second, because I’ve seen smart people I respect cut possibility space up into neat, tidy pieces, apply probabilistic reasoning, and gain a clearer view of the world. And third, because Alice would ask me “Well, if it’s not A or B, what else could it be?” and I would say something like “I don’t know, possibly something like C” where I didn’t think C was particularly likely but I didn’t have any specific dominant alternatives. This kind of challenge can make it feel to me or look to Alice like I’m just reluctant to accept that either A or B will happen.
This post was helpful to me in part because it helped me notice that this dynamic is sometimes the result of others adhering strongly to proposition-based reasoning in contexts where it isn’t appropriate, or of me thinking mainly in terms of models and then trying to force that thinking into a propositional framework. For example, I might think something like:
AI capabilities could enhance AI R&D, and this might create a feedback effect, causing an exponential trend to become a super-exponential trend. One way to model this is A, another is B, and a third is C. I’m not sure which is closest to reality or whether I’m thinking about this entirely wrong.
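As a toy sketch of what that quoted thought means (my own illustration, not any model from the post): with a constant growth rate you get an exponential, but if the rate itself increases with capability, a crude stand-in for an AI R&D feedback loop, the trajectory becomes super-exponential. The particular feedback form below is an arbitrary illustrative choice.

```python
# Toy sketch: a constant growth rate gives an exponential, while a rate that
# itself increases with capability gives super-exponential growth.

def simulate(feedback_strength, r0=0.05, c0=1.0, steps=200, dt=0.1):
    """Euler-step dC/dt = r(C) * C, with r(C) = r0 * (1 + feedback_strength * C)."""
    c = c0
    trajectory = [c]
    for _ in range(steps):
        rate = r0 * (1.0 + feedback_strength * c)
        c += rate * c * dt
        trajectory.append(c)
    return trajectory

plain_exponential = simulate(feedback_strength=0.0)
with_feedback = simulate(feedback_strength=0.2)   # grows faster than any exponential

print(plain_exponential[-1], with_feedback[-1])
```

The A, B, and C in my head would be different specific choices of that feedback term (among other things), which is part of why I was unsure which, if any, was closest to reality.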
Then I would try to operationalize this within a Bayesian framework with propositions like:
AI progress will follow the trajectory predicted by model A
AI progress will follow the trajectory predicted by model B
Then I’d try to assign probabilities to these propositions, plus some amount to “I’m thinking about this wrong” or “Reality does some other thing”, and figure out how to update them on evidence.
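For concreteness, here is a rough sketch of what that bookkeeping might look like; the priors, likelihoods, and model names are all invented. The named models can actually generate a likelihood for a new observation, but the catch-all bucket can’t, so its likelihood ends up being made up.

```python
# Rough sketch of propositional bookkeeping over "reality follows model X"
# plus a catch-all. The hypothetical priors and likelihoods are all invented.

priors = {"model_A": 0.15, "model_B": 0.05, "model_C": 0.10, "something_else": 0.70}

# Likelihood of a new observation under each hypothesis. The named models can
# produce these numbers; the value for the catch-all is essentially a guess.
likelihoods = {"model_A": 0.40, "model_B": 0.10, "model_C": 0.25, "something_else": 0.30}

unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
evidence = sum(unnormalized.values())
posteriors = {h: mass / evidence for h, mass in unnormalized.items()}

for hypothesis, posterior in posteriors.items():
    print(f"{hypothesis}: {posterior:.2f}")
```

The update itself is mechanical; the awkward part is that the likelihood I pick for “something else” ends up doing much of the work, and I have no principled way to choose it.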
I think it’s fine to attempt this, but in contexts where I almost always put a majority of my credence on “I dunno, some combination of these or some other thing that’s totally different”, I don’t think it’s all that fruitful, and it’s more useful to just do the modeling, report the results, think hard about how to figure out what’s true, and seek evidence where it can be found. As evidence comes in, I can hopefully see which parts of my models are right or wrong, or come up with better models. (This is, by the way, how everything went all the time in my past life as an experimental physicist.)
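By contrast, the “just do the modeling” version looks, very schematically, more like the sketch below; the data points and model forms are invented.

```python
# Schematic sketch of the model-first alternative: run each candidate model
# forward, compare against whatever data arrives, and report the comparison
# itself. The observations and model forms here are invented.

observed = [1.0, 1.3, 1.8, 2.6, 3.9]   # hypothetical measurements of some progress metric

models = {
    "exponential":       lambda t: 1.3 ** t,
    "super_exponential": lambda t: 1.2 ** (t ** 1.3),
}

for name, predict in models.items():
    errors = [abs(predict(t) - y) for t, y in enumerate(observed)]
    print(f"{name}: mean absolute error = {sum(errors) / len(errors):.2f}")
```

The output is a comparison I can argue about and refine as more data comes in, rather than a single credence whose largest component is “I dunno”.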
I don’t have takes on many of the good parts of the post, like what the best approach to formalizing fuzzy truth values is. But I do think that a shift toward more model-based reasoning is good in many domains, especially around topics like AI safety and forecasting, where people often arrive at (I claim) overconfident conclusions.