As I understand it, when you “talk about the mainline”, you’re supposed to have some low-entropy (i.e. confident) view on how the future goes, such that you can answer very different questions X, Y, and Z about that particular future, questions that are all correlated with each other, and give each answer (say) > 50% probability. (Idk, as I write this down, it seems so obviously a bad way to reason that I feel like I must not be understanding it correctly.)
But to the extent this is right, I’m actually quite confused why anyone thinks “talk about the mainline” is an ideal to which to aspire. What makes you expect that?
I’ll try to explain the technique and why it’s useful. I’ll start with a non-probabilistic version of the idea, since it’s a little simpler conceptually, then talk about the corresponding idea in the presence of uncertainty.
Suppose I’m building a mathematical model of some system or class of systems. As part of the modelling process, I write down some conditions which I expect the system to satisfy—think energy conservation, or Newton’s Laws, or market efficiency, depending on what kind of systems we’re talking about. My hope/plan is to derive (i.e. prove) some predictions from these conditions, or maybe prove some of the conditions from others.
Before I go too far down the path of proving things from the conditions, I’d like to do a quick check that my conditions are consistent at all. How can I do that? Well, human brains are quite good at constrained optimization, so one useful technique is to look for one example of a system which satisfies all the conditions. If I can find one example, then I can be confident that the conditions are at least not inconsistent. And in practice, once I have that one example in hand, I can also use it for other purposes: I can usually see what (possibly unexpected) degrees of freedom the conditions leave open, or what (possibly unexpected) degrees of freedom the conditions don’t leave open. By looking at that example, I can get a feel for the “directions” along which the conditions do/don’t “lock in” the properties of the system.
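As a toy sketch of the “find one example” check (the specific conditions below are made up purely for illustration, standing in for things like energy conservation or market efficiency):

```python
import random

# Made-up conditions on a toy (p, q) system; these are illustrative
# placeholders, not conditions from any real model.
conditions = [
    lambda p, q: p > 0,            # "prices are positive"
    lambda p, q: q >= 0,           # "quantities are non-negative"
    lambda p, q: p * q <= 100,     # "a budget constraint"
    lambda p, q: abs(p - q) < 50,  # "supply roughly tracks demand"
]

def find_witness(n_tries=10_000, seed=0):
    """Search for one example satisfying every condition; finding a
    witness shows the conditions are at least not inconsistent."""
    rng = random.Random(seed)
    for _ in range(n_tries):
        p, q = rng.uniform(-100, 100), rng.uniform(-100, 100)
        if all(c(p, q) for c in conditions):
            return p, q
    return None  # no witness found: the conditions may be inconsistent

def free_directions(p, q, eps=1.0):
    """Probe which small perturbations of the witness still satisfy
    all conditions -- a crude look at the leftover degrees of freedom."""
    moves = {"+p": (p + eps, q), "-p": (p - eps, q),
             "+q": (p, q + eps), "-q": (p, q - eps)}
    return {name: all(c(a, b) for c in conditions)
            for name, (a, b) in moves.items()}

witness = find_witness()
```

Once `find_witness` returns an example, something like `free_directions` gives a first look at which “directions” the conditions do and don’t lock in.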
(Note that in practice, we often start with an example to which we want our conditions to apply, and we choose the conditions accordingly. In that case, our one example is built in, although we do need to remember the unfortunately-often-overlooked step of actually checking what degrees of freedom the conditions do/don’t leave open to the example.)
What would a probabilistic version of this look like? Well, we have a world model with some (uncertain) constraints in it—i.e. kinds-of-things-which-tend-to-happen, and kinds-of-things-which-tend-to-not-happen. Then, we look for an example which generally matches the kinds-of-things-which-tend-to-happen. If we can find such an example, then we know that the kinds-of-things-which-tend-to-happen are mutually compatible; a high probability for some of them does not imply a low probability for others. With that example in hand, we can also usually recognize which features of the example are very-nailed-down by the things-which-tend-to-happen, and which features have lots of freedom. We may, for instance, notice that there’s some very-nailed-down property which seems unrealistic in the real world; I expect that to be the most common way for this technique to unearth problems.
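A minimal numerical sketch of the probabilistic version (the “tendencies” and scenario features here are invented for illustration): sample candidate scenarios, score each by how well it matches the things-which-tend-to-happen, and inspect the best one.

```python
import math
import random

# Invented soft constraints: each maps a scenario to the probability
# that one kind-of-thing-which-tends-to-happen holds in it.
tendencies = [
    lambda s: 0.9 if s["a"] > 0 else 0.1,
    lambda s: 0.8 if s["a"] + s["b"] < 5 else 0.2,
    lambda s: 0.7 if s["b"] > s["a"] else 0.3,
]

def score(s):
    """Sum of log-probabilities across tendencies; a high-scoring
    scenario satisfies the tendencies simultaneously."""
    return sum(math.log(t(s)) for t in tendencies)

rng = random.Random(0)
scenarios = [{"a": rng.uniform(-5, 5), "b": rng.uniform(-5, 5)}
             for _ in range(5000)]
best = max(scenarios, key=score)
# If the best scenario hits the high branch of every tendency, the
# tendencies are mutually compatible: a high probability for some of
# them does not force a low probability for others.
```

The best-scoring scenario plays the role of the example; comparing the other high-scoring scenarios to it shows which features are very-nailed-down and which have lots of freedom.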
That’s the role a “mainline” prediction serves. Note that it does not imply the mainline has a high probability overall, nor does it imply a high probability that all of the things-which-tend-to-happen will necessarily occur simultaneously. It’s checking whether the supposed kinds-of-things-which-tend-to-happen are mutually consistent with each other, and it provides some intuition for what degrees of freedom the kinds-of-things-which-tend-to-happen do/don’t leave open.
Man, I would not call the technique you described “mainline prediction”. It also seems kinda inconsistent with Vaniver’s usage; his writing suggests that a person only has one mainline at a time, which seems odd for this technique.
Vaniver, is this what you meant? If so, my new answer is that I and others do in fact talk about “mainline predictions”—for me, there was that whole section talking about natural language debate as an alignment strategy. (It ended up not being about a plausible world, but that’s because (a) Eliezer wanted enough concreteness that I ended up talking about the stupidly inefficient version rather than the one I’d actually expect in the real world and (b) I was focused on demonstrating an existence proof for the technical properties, rather than also trying to include the social ones.)
To be clear, I do not mean to use the label “mainline prediction” for this whole technique. Mainline prediction tracking is one way of implementing this general technique, and I claim that the usefulness of the general technique is the main reason why mainline predictions are useful to track.
(Also, it matches up quite well with Nate’s model based on his comment here, and I expect it also matches how Eliezer wants to use the technique.)
Ah, got it. I agree that:

1. The technique you described is in fact very useful.
2. If your probability distribution over futures happens to be such that it has a “mainline prediction”, you get significant benefits from that (similar to the benefits you get from the technique you described).
Uh, I inherited “mainline” from Eliezer’s usage in the dialogue, and am guessing that his reasoning is following a process sort of like mine and John’s. My natural word for it is a ‘particle’, from particle filtering, as linked in various places, which I think is consistent with John’s description. I’m further guessing that Eliezer’s noticed more constraints / implied inconsistencies, and is somewhat better at figuring out which variables to drop, so that his cloud is narrower than mine / more generates ‘mainline predictions’ than ‘probability distributions’.
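To make the particle-filtering analogy concrete, here is a toy bootstrap filter (the state, dynamics, and observation model are all made up for illustration): maintain a cloud of concrete candidate states, reweight them against each observation, and resample, so the cloud narrows onto states consistent with the evidence.

```python
import math
import random

def step(x, rng):
    """Made-up state transition: drift with Gaussian noise."""
    return x + rng.gauss(0.0, 0.5)

def likelihood(obs, x):
    """Made-up observation model: unnormalized Gaussian around x."""
    return math.exp(-0.5 * (obs - x) ** 2)

def particle_filter(observations, n=1000, seed=0):
    """Bootstrap particle filter: each particle is one concrete
    candidate state, and the cloud of particles is the belief."""
    rng = random.Random(seed)
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n)]
    for obs in observations:
        particles = [step(x, rng) for x in particles]
        weights = [likelihood(obs, x) for x in particles]
        # Resample in proportion to weight: the cloud narrows onto
        # states consistent with the evidence so far.
        particles = rng.choices(particles, weights=weights, k=n)
    return particles

cloud = particle_filter([1.0, 1.2, 0.9, 1.1])
# A narrow cloud behaves like a "mainline"; a wide one is closer to
# a probability distribution over futures.
```

Noticing more constraints corresponds to a sharper likelihood, which narrows the cloud faster.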
Do you feel like you do this ‘sometimes’, or ‘basically always’? Maybe it would be productive for me to reread the dialogue (or at least part of it) and sort sections / comments by how much they feel like they’re coming from this vs. some other source.
As a specific thing that I have in mind, I think there’s a habit of thinking / discourse that philosophy trains, which is having separate senses for “views in consideration” and “what I believe”, and thinking that statements should be considered against all views in consideration, even ones that you don’t believe. This seems pretty good in some respects (if you begin by disbelieving a view incorrectly, your habits nevertheless gather you lots of evidence about it, which can cause you to then correctly believe it), and pretty questionable in other respects (conversations between Alice and Bob now have to include them shadowboxing with everyone else in the broader discourse, as Alice is asking herself “what would Carol say in response to that?” to things that Bob says to her).
When I imagine dialogues generated by people who are both sometimes doing the mainline thing and sometimes doing the ‘represent the whole discourse’ thing, they look pretty different from dialogues generated by people who are both only doing the mainline thing. [And also from dialogues generated by both people only doing the ‘represent the whole discourse’ thing, of course.]
I don’t know what “this” refers to. If the referent is “have a concrete example in mind”, then I do that frequently but not always. I do it a ton when I’m not very knowledgeable and am learning about a thing; I do it less as my mastery of a subject increases. (Examples: when I was initially learning addition, I used the concrete example of holding up three fingers and then counting up two more to compute 3 + 2 = 5, which I do not do any more. When I first learned recursion, I used to explicitly run through an execution trace to ensure my program would work; now I do not.)
If the referent is “make statements that reflect my beliefs”, then it depends on context, but in the context of these dialogues, I’m always doing that. (Whereas when I’m writing for the newsletter, I’m more often trying to represent the whole discourse, though the “opinion” sections are still entirely my beliefs.)