Isn’t there a third way out? Name the circumstances under which your models break down.
e.g. “I’m 90% confident that if OpenAI built AGI that could coordinate AI research with 1/10th the efficiency of humans, we would then all die. My assessment is contingent on a number of points, like the organization displaying similar behaviour wrt scaling and risks, cheap inference costs allowing research to be scaled in parallel, and my model of how far artificial intelligence can bootstrap. You can ask me questions about how I think it would look if I were wrong about those.”
I think it’s good practice to name ways your models can break down that you think are plausible, and also ways that your conversational partners may think are plausible.
e.g. even if I didn’t think it would be hard for AGI to bootstrap, if I’m talking to someone for whom that’s a crux, it’s worth laying out that I’m treating that as a reliable step. It’s better yet if I clarify whether it’s a crux for my model that bootstrapping is easy. (I can in fact imagine ways that everything takes off even if bootstrapping is hard for the kind of AGI we make, but these will rely more on the human operators continuing to make dangerous choices.)
I may be misunderstanding, but does this miss the category of unknown unknowns that are captured by Knightian uncertainty? For extremely out-of-distribution scenarios, you may not be able to accurately anticipate even the general category of ways things could go wrong.
Idk if it’s actually missing that?
I can talk about what is in-distribution in terms of a bunch of finite components, and thereby name the cases that are out of distribution: those in which my components break.
(This seems like an advantage inside views have: they come with limits attached, because they build a distribution out of pieces that you can tell are broken when they don’t match reality.)
My example doesn’t talk about the probability I assign to “crazy thing I can’t model”, but such a thing would break something like my model of “who is doing what with the AI code by default”.
Maybe it would have been better for me to include a case for “and reality might invalidate my attempt at meta-reasoning too”?
Your comment has made me think I’ve misunderstood something fundamental.
Hm, sorry! I don’t think a good reply on my part should do that :P
I think I’m rejecting a certain mental stance toward unknown-unknowns, and I don’t think I’m clearly pointing at it yet.
tl;dr I think you can improve on “my models might break for an unknown reason” if you can name the main categories of model-breaking unknowns
If you know what it’ll look like if you were wrong about any of these, then that would be uncertainty that’s captured by your model.
Obviously there’s a spectrum here, but it sounds like you’re describing something significantly more specific than the central examples of Knightian uncertainty in my mind. E.g. to me Knightian uncertainty looks less like uncertainty about how far AI can bootstrap, and more like uncertainty about whether our current conception of “bootstrapping” will even turn out to be a meaningful and relevant concept.
My nearby reply is most of my answer here. I know how to tell when reality is off-the-rails wrt my model, because my model is made of falsifiable parts. I can even tell you about what those parts are, and about the rails I’m expecting reality to stay on.
When I try to cash out your example, “maybe the whole way I’m thinking about bootstrapping isn’t meaningful/useful”, it doesn’t seem like it’s outside my meta-model? I don’t think I have to do anything differently to handle it?
Specifically, my “bootstrapping” concept comes with some concrete pictures of how things go. I currently find the concept “meaningful/useful” because I expect these concrete pictures to be instantiated in reality. (Mostly because I expect reality to admit the “bootstrapping” I’m picturing, and I expect advanced AI to be able to find it.) If reality goes off-my-rails about my concept mattering, it will be because things don’t apply in the way I’m thinking, and there were some other pathways I should have been attending to instead.