I should have phrased my previous comment as a question—what do you see as valuable about the second half without the first half?
Maybe I can mostly answer that question for myself, though.
“Developing Ideas” is related to the first half, but it’s in an inventive frame—so confabulation is much less of a concern. (But we might similarly doubt it and ask for empirical support.)
“Inner Sim” has separate validation presumably (although I haven’t looked into this).
The motivated cognition section?
“Correcting Yourself” has a pretty obvious story about why it should be useful.
Explaining things to others is very generally observed to be useful, and the connection I make to the first half could be seen as spurious or at least not particularly important.
The question of how to do gears thinking more and better is just pretty important all around. But I think my particular remarks are not any more empirically validated than the memory stuff I mentioned.
Understanding others—same as gears. Important, but not a lot to back up my remarks.
I think what I’m going to do is post a question about what could/should go into such a sequence.
Excellent, thanks for the comment! I really appreciate the correction. That’s quite interesting.
The way I currently see it, the second half of the post is more like an assortment of things, which are all tied together by the fact that they elaborate the basic mental movement in the first half of the post. So a post which was just the second half doesn’t seem especially coherent to me.
(I think it’s fine as a “random ideas from Abram in 2020” post, but my impression is you had aspirations towards it serving as a good self-contained-intro-to-rationality)
Ah. Yeah, I guess I conceived of this as pretty solidly somewhere between those extremes, but the title could be misleading towards the second.
I don’t think of it as a collection of random ideas interesting to me right now. I do think of it as a coherent thing. But I certainly don’t intend it to be an introduction to rationality or even the subtopic of truth-orientedness in rationality.
A similar borderline case is death spirals in ants. (Google it for nice pictures/videos of the phenomenon.) Ants may or may not do internal search, but regardless, it seems like this phenomenon could be reproduced without any internal search. The ants implement a search overall via a pattern of behavior distributed over many ants. This “search” behavior has a weird corner case where they literally go into a death spiral, which is quite non-obvious from the basic behavior pattern.
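(For concreteness, here is a toy pursuit-curve sketch in Python. It is not a model of real ant pheromone-following, just an illustration of how a dead-simple local "follow the one ahead" rule, with no individual rule mentioning spirals at all, produces a collective spiral.)

```python
import math

# Toy "death spiral" sketch (not a model of real ant pheromone behavior):
# N agents start on a circle; each one repeatedly steps toward the next
# agent in a fixed cyclic order. The local rule "follow the one ahead"
# collapses the group into a shrinking spiral, even though no individual
# rule mentions spiraling.

N = 6          # number of agents
STEP = 0.05    # fraction of the distance to the target covered each tick

# start on a unit circle
pos = [(math.cos(2 * math.pi * i / N), math.sin(2 * math.pi * i / N))
       for i in range(N)]

for t in range(200):
    new_pos = []
    for i, (x, y) in enumerate(pos):
        tx, ty = pos[(i + 1) % N]          # the agent being followed
        new_pos.append((x + STEP * (tx - x), y + STEP * (ty - y)))
    pos = new_pos
    if t % 50 == 0:
        radius = sum(math.hypot(x, y) for x, y in pos) / N
        print(f"t={t:3d}  mean distance from center = {radius:.3f}")
```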
Yes, I agree with that. But (as I’ve said in the past) this formalism doesn’t do it for me. I have yet to see something which strikes me as a compelling argument in its favor.
So in the context of planning by probabilistic inference, instrumental occam seems almost like a bug rather than a feature—the unjustified bias toward simpler policies doesn’t seem to serve a clear purpose. It’s just an assumption.
Granted, the fact that I intuitively feel there should be some kind of instrumental occam is a point in favor of such methods in some sense.
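(For concreteness, here is a minimal sketch of the planning-as-inference shape I have in mind; it is my own toy construction rather than any particular published formalism, and the policy names, description lengths, and rewards are made up. The point is just that the posterior over policies is proportional to a prior times exp(reward), so the bias toward simple policies enters purely through the choice of prior.)

```python
import math

# Toy planning-as-inference sketch (illustrative setup, not a specific
# published algorithm). Policies are labeled options with a hand-assigned
# "description length" and an expected reward. The posterior is
#   P(policy | optimality) ∝ prior(policy) * exp(reward / temperature)
# and the bias toward simpler policies comes entirely from the prior,
# i.e. it is an assumption, not something derived from the rewards.

policies = {
    # name: (description_length_in_bits, expected_reward)
    "simple":  (2, 1.0),
    "complex": (10, 1.0),   # same reward, longer description
    "clever":  (12, 1.2),   # slightly better reward, even longer
}

TEMPERATURE = 1.0

def prior(bits):
    return 2.0 ** (-bits)   # Occam-style prior over policies

unnorm = {name: prior(bits) * math.exp(r / TEMPERATURE)
          for name, (bits, r) in policies.items()}
z = sum(unnorm.values())

for name, w in unnorm.items():
    print(f"{name:8s} posterior = {w / z:.3f}")
# "simple" dominates "complex" despite identical reward, purely because of
# the prior; whether it also beats "clever" depends on the temperature.
```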
I like this post because I’m fond of using the “what’s the better alternative?” argument in instrumental matters, so it’s good to have an explicit flag of where it fails in epistemic matters. Technically the argument still holds, but the “better alternative” can be a high entropy theory, which often doesn’t rise to saliency as a theory at all.
It’s also a questionable heuristic in instrumental matters, as it is often possible to meaningfully critique a policy without yet having a better alternative. But one must be careful to distinguish between these “speculative” critiques (which can note important downsides but don’t, by themselves, establish that a policy should be changed, due to the lack of alternatives) vs true evaluations (which claim that changes need to be made, and therefore should be required to evaluate alternatives).
If the starting point is incoherent, then this approach doesn’t seem like it’ll go far—if AIXI isn’t useful to study, then probably AIXItl isn’t either (although take this particular example with a grain of salt, since I know almost nothing about AIXItl).
Hm. I already think the starting point of Bayesian decision theory (which is even “further up” than AIXI in how I am thinking about it) is fairly useful.
In a naive sort of way, people can handle uncertain gambles by choosing a quantity to treat as ‘utility’ (such as money), quantifying probabilities of outcomes, and taking expected values. This doesn’t always serve very well (e.g. one might prefer Kelly betting), but it is historically where things started (probability theory grew out of gambling games), and the idea seems like a useful decision-making mechanism in a lot of situations.
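(A toy comparison with made-up numbers, just to make that contrast concrete:)

```python
# Illustrative comparison (made-up numbers): a bet that pays b-to-1 with
# win probability p. Naive expected-value maximization says stake the whole
# bankroll whenever the EV is positive; the Kelly criterion instead
# maximizes expected log-wealth and stakes only a fraction.

p = 0.6   # probability of winning
b = 1.0   # net odds: win b per unit staked

stake = 1.0  # fraction of bankroll staked by the naive EV maximizer

expected_value = p * (1 + b * stake) + (1 - p) * (1 - stake)
kelly_fraction = p - (1 - p) / b   # f* = p - q/b for this simple bet

print(f"EV of staking everything: {expected_value:.2f}x bankroll per round")
print(f"Kelly fraction:           {kelly_fraction:.2f} of bankroll")
# Staking everything has the higher per-round EV, but a single loss wipes
# out the bankroll; Kelly trades some EV for long-run growth.
```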
Perhaps more convincingly, probability theory seems extremely useful, both as a precise tool for statisticians and as a somewhat looser analogy for thinking about everyday life, cognitive biases, etc.
AIXI adds to all this the idea of quantifying Occam’s razor with algorithmic information theory, which seems to be a very fruitful idea. But I guess this is the sort of thing we’re going to disagree on.
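(For intuition, here is a toy, computable stand-in for that idea; the real Solomonoff/AIXI construction is incomputable and ranges over all programs, and the description lengths below are assigned by hand rather than computed.)

```python
# Toy, computable stand-in for algorithmic-information-theoretic Occam
# (the real Solomonoff prior is incomputable and ranges over all programs).
# Each hypothesis is a predictor for binary sequences plus a hand-assigned
# "description length"; prior weight is 2^(-length), and we update on data.

data = [1, 1, 1, 1, 1, 0, 1, 1]

hypotheses = {
    # name: (description_length_in_bits, probability assigned to a 1)
    "always-1 (short program)":   (3, 0.99),
    "mostly-1 (longer program)":  (8, 0.85),
    "fair coin (short program)":  (3, 0.5),
}

def likelihood(p_one, seq):
    out = 1.0
    for bit in seq:
        out *= p_one if bit == 1 else (1 - p_one)
    return out

unnorm = {name: 2.0 ** (-bits) * likelihood(p1, data)
          for name, (bits, p1) in hypotheses.items()}
z = sum(unnorm.values())

for name, w in sorted(unnorm.items(), key=lambda kv: -kv[1]):
    print(f"{name:28s} posterior = {w / z:.3f}")
# The longer "mostly-1" hypothesis fits this data better, but the
# simplicity prior still leaves "always-1" with the most posterior mass.
```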
As for AIXItl, I think it’s sort of taking the wrong approach to “dragging things down to earth”. Logical induction simultaneously makes things computable and solves a new set of interesting problems having to do with accomplishing that. AIXItl feels more like trying to stuff an uncomputable peg into a computable hole.
It sounds like both of you don’t have much experience with remembering dreams, so an opening which seemed very relatable to me didn’t land. Raemon flags his comment as a ‘pedagogical note’ and David calls recalling dreams a ‘misleading example’.
But is there more to it than a starting example that didn’t connect? David brings up research which ambiguously suggests there’s a lot of confabulation around dreams. (I’m interested in references.) I had an in-person conversation with someone who read my post and thought the confabulation problem was more broadly damning.
My inside view is that if this were a very serious problem, I’d kind of be screwed. I’m having difficulty taking the position very seriously, because this is such a basic mental move. Of course at some level I’m saying “try doing more of this” and the question is “does doing more of this make things worse?”—in the world where that’s the case, we don’t want to practice this mental motion or encourage it.
Objectively, dreams are kind of a worst-case scenario, since it isn’t possible to check the reality. Subjectively, though, dreams seem to me like a really good case: I don’t always know what I made up later vs what really occurred in the dream, but I (subjectively) know when I don’t know. I often catch myself adding details, and can either tease out what was made up vs what was really there, or conclude that I can’t do so.
I initially wrote this post with the idea of starting a sequence on “rationality as a practice”, IE, trying to dig into things like this which are moment-to-moment habits of thought which one can work to improve no matter one’s current level of skill. Now my feeling is that this sort of thing is bottlenecked on empirical evidence. I would like to know whether the stuff I propose actually increases or decreases confabulation.
So, yeah, one thing that’s going on here is that I have recently been explicitly going in the other direction with partial agency, so obviously I somewhat agree. (Both with the object-level anti-realism about the limit of perfect rationality, and with the meta-level claim that agent foundations research may have a mistaken emphasis on this limit.)
But I also strongly disagree in another way. For example, you lump logical induction into the camp of considering the limit of perfect rationality. And I can definitely see the reason. But from my perspective, the significant contribution of logical induction is absolutely about making rationality more bounded.
The whole idea of the logical uncertainty problem is to consider agents with limited computational resources.
Logical induction in particular involves a shift in perspective, where rationality is not an ideal you eventually reach but rather a matter of how you improve along the way. Logical induction is about asymptotically approximating coherence in a particular way, as opposed to other ways.
So to a large extent I think my recent direction can be seen as continuing a theme already present—perhaps you might say I’m trying to properly learn the lesson of logical induction.
But is this theme isolated to logical induction, in contrast to earlier MIRI research? I think not fully: Embedded Agency ties everything together to a very large degree, and embeddedness is largely about this kind of boundedness.
So I think Agent Foundations is basically not about trying to take the limit of perfect rationality. Rather, we inherited this idea of perfect rationality from Bayesian decision theory, and Agent Foundations is about trying to break it down, approaching it with skepticism and trying to fit it more into the physical world.
Reflective Oracles still involve infinite computing power, and logical induction still involves massive computing power, more or less because the approach is to start with idealized rationality and try to drag it down to Earth rather than the other way around. (That model feels a bit fake but somewhat useful.)
(Generally I am disappointed by my reply here. I feel I have not adequately engaged with you, particularly on the function-vs-nature distinction. I may try again later.)
I generally like the re-framing here, and agree with the proposed crux.
I may try to reply more at the object level later.
Yeah, I actually tried them, but didn’t personally like them that well. They could definitely be an option for someone.
(Another possibility is that you think that building AI the way we do now is so incredibly doomed that even though the story outlined above is unlikely, you see no other path by which to reduce x-risk, which I suppose might be implied by your other comment here.)
This seems like the closest fit, but my view has some commonalities with points 1-3 nonetheless.
(I agree with 1, somewhat agree with 2, and don’t agree with 3).
It sounds like our potential cruxes are closer to point 3 and to the question of how doomed current approaches are. Given that, do you still think rationality realism seems super relevant (to your attempted steelman of my view)?
My current best argument for this position is realism about rationality; in this world, it seems like truly understanding rationality would enable a whole host of both capability and safety improvements in AI systems, potentially directly leading to a design for AGI (which would also explain the info hazards policy).
I guess my position is something like this. I think it may be quite possible to make capabilities “blindly”—basically the processing-power heavy type of AI progress (applying enough tricks so you’re not literally recapitulating evolution, but you’re sorta in that direction on a spectrum). Or possibly that approach will hit a wall at some point. But in either case, better understanding would be essentially necessary for aligning systems with high confidence. But that same knowledge could potentially accelerate capabilities progress.
So I believe in some kind of knowledge to be had (ie, point #1).
Yeah, so, taking stock of the discussion again, it seems like:
There’s a thing-I-believe-which-is-kind-of-like-rationality-realism.
Points 1 and 2 together seem more in line with that thing than “rationality realism” as I understood it from the OP.
You already believe #1, and somewhat believe #2.
We are both pessimistic about #3, but I’m so pessimistic about doing things without #3 that I work under the assumption anyway (plus I think my comparative advantage is contributing to those worlds).
We probably do have some disagreement about something like “how real is rationality?”—but I continue to strongly suspect it isn’t that cruxy.
(ETA: In my head I was replacing “evolution” with “reproductive fitness”; I don’t agree with the sentence as phrased, I would agree with it if you talked only about understanding reproductive fitness, rather than also including e.g. the theory of natural selection, genetics, etc. In the rest of your comment you were talking about reproductive fitness, I don’t know why you suddenly switched to evolution; it seems completely different from everything you were talking about before.)
I checked whether I thought the analogy was right with "reproductive fitness" and decided that evolution was a better analogy for this specific point. In claiming that rationality is as real as reproductive fitness, I’m claiming that there’s something like a theory of evolution out there for rationality.
Sorry it resulted in a confusing mixed metaphor overall.
But, separately, I don’t get how you’re seeing reproductive fitness and evolution as having radically different realness, such that you wanted to systematically correct my usage. I agree they’re separate questions, but in fact I see the realness of reproductive fitness as largely a matter of the realness of evolution—without the overarching theory, reproductive fitness functions would be a kind of irrelevant abstraction and therefore less real.
To my knowledge, the theory of evolution (ETA: mathematical understanding of reproductive fitness) has not had nearly the same impact on our ability to make big things as (say) any theory of physics. The Rocket Alignment Problem explicitly makes an analogy to an invention that required a theory of gravitation / momentum etc. Even physics theories that talk about extreme situations can enable applications; e.g. GPS would not work without an understanding of relativity. In contrast, I struggle to name a way that evolution (ETA: insights based on reproductive fitness) affects an everyday person (ignoring irrelevant things like atheism-religion debates). There are lots of applications based on an understanding of DNA, but DNA is a “real” thing. (This would make me sympathetic to a claim that rationality research would give us useful intuitions that lead us to discover “real” things that would then be important, but I don’t think that’s the claim.)
I think this is due more to stuff like the relevant timescale than the degree of real-ness. I agree real-ness is relevant, but it seems to me that the rest of biology is roughly as real as reproductive fitness (ie, it’s all very messy compared to physics) but has far more practical consequences (thinking of medicine). On the other side, astronomy is very real but has few industry applications. There are other aspects to point at, but one relevant factor is that evolution and astronomy study things on long timescales.
Reproductive fitness would become very relevant if we were sending out seed ships to terraform nearby planets over geological time periods, in the hope that our descendants might one day benefit. (Because we would be in for some surprises if we didn’t understand how organisms seeded on those planets would likely evolve.)
So—it seems to me—the question should not be whether an abstract theory of rationality is the sort of thing which on-outside-view has few or many economic consequences, but whether it seems like the sort of thing that applies to building intelligent machines in particular!
My underlying model is that when you talk about something so “real” that you can make extremely precise predictions about it, you can create towers of abstractions upon it, without worrying that they might leak. You can’t do this with “non-real” things.
Reproductive fitness does seem to me like the kind of abstraction you can build on, though. For example, the theory of kin selection is a significant theory built on top of it.
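(Concretely, Hamilton’s rule from kin-selection theory, with purely illustrative numbers:)

```python
# Hamilton's rule, a standard piece of kin-selection theory built on top of
# the fitness concept: an altruistic act is favored when r * B > C, where
# r is relatedness, B the benefit to the recipient, C the cost to the actor.
# The numbers below are illustrative only.

def favored_by_kin_selection(relatedness, benefit, cost):
    return relatedness * benefit > cost

# Helping a full sibling (r = 0.5) is favored if the benefit is more than
# twice the cost.
print(favored_by_kin_selection(0.5, benefit=3.0, cost=1.0))   # True
print(favored_by_kin_selection(0.5, benefit=1.5, cost=1.0))   # False
```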
As for reaching high confidence, yeah, there needs to be a different model of how you reach high confidence.
The security mindset model of reaching high confidence is not that you have a model whose overall predictive accuracy is high enough, but rather that you have an argument for security which depends on few assumptions, each of which is individually very likely. E.G., in computer security you don’t usually need exact models of attackers, and a system which relies on those is less likely to be secure.
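(A crude way to see the arithmetic, with numbers made up purely for illustration and everything unrealistically treated as independent:)

```python
# Made-up numbers, purely to illustrate the shape of the argument:
# a security-style case resting on a few individually very likely
# assumptions vs. a detailed predictive model with many moderately
# reliable components (treating all of them, unrealistically, as independent).

few_strong_assumptions = [0.99, 0.98, 0.99]   # security-style argument
many_model_components  = [0.9] * 20           # detailed predictive model

def prob_all_hold(ps):
    out = 1.0
    for p in ps:
        out *= p
    return out

print(f"P(security argument holds) ~= {prob_all_hold(few_strong_assumptions):.2f}")
print(f"P(detailed model is right) ~= {prob_all_hold(many_model_components):.2f}")
```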
I was thinking of the difference between the theory of electromagnetism vs the idea that there’s a reproductive fitness function, but that it’s very hard to realistically mathematise or actually determine what it is. The difference between the theory of electromagnetism and mathematical theories of population genetics (which are quite mathematisable but again deal with ‘fake’ models and inputs, and which I guess is more like what you mean?) is smaller, and if pressed I’m unsure which theory rationality will end up closer to.
[Spoiler-boxing the following response not because it’s a spoiler, but because I was typing a response as I was reading your message and the below became less relevant. The end of your message includes exactly the examples I was asking for (I think), but I didn’t want to totally delete my thinking-out-loud in case it gave helpful evidence about my state.]
I’m having trouble here because yes, the theory of population genetics factors in heavily to what I said, but to me reproductive fitness functions (largely) inherit their realness from the role they play in population genetics. So the two comparisons you give seem not very different to me. The “hard to determine what it is” from the first seems to lead directly to the “fake inputs” from the second.
So possibly you’re gesturing at a level of realness which is “how real fitness functions would be if there were not a theory of population genetics”? But I’m not sure exactly what to imagine there, so could you give a different example (maybe a few) of something which is that level of real?
Separately, I feel weird having people ask me about why things are ‘cruxy’ when I didn’t initially say that they were and without the context of an underlying disagreement that we’re hashing out. Like, either there’s some misunderstanding going on, or you’re asking me to check all the consequences of a belief that I have compared to a different belief that I could have, which is hard for me to do.
Ah, well. I interpreted this earlier statement from you as a statement of cruxiness:
If I didn’t believe the above, I’d be less interested in things like AIXI and reflective oracles. In general, the above tells you quite a bit about my ‘worldview’ related to AI.
And furthermore the list following this:
Searching for beliefs I hold for which ‘rationality realism’ is crucial by imagining what I’d conclude if I learned that ‘rationality irrealism’ was more right:
So, yeah, I’m asking you about something which you haven’t claimed is a crux of a disagreement which you and I are having, but, I am asking about it because I seem to have a disagreement with you about (a) whether rationality realism is true (pending clarification of what the term means to each of us), and (b) whether rationality realism should make a big difference for several positions you listed.
I confess to being quite troubled by AIXI’s language-dependence and the difficulty in getting around it. I do hope that there are ways of mathematically specifying the amount of computation available to a system more precisely than “polynomial in some input”; that kind of specification seems like it should be an input to a good theory of bounded rationality.
Ah, so this points to a real and large disagreement between us about how subjective a theory of rationality should be (which may be somewhat independent of just how real rationality is, but is related).
I think I was imagining an alternative world where useful theories of rationality could only be about as precise as theories of liberalism, or as current theories about why England had an industrial revolution when it did, rather than some other country instead.
Ok. Taking this as the rationality irrealism position, I would disagree with it, and also agree that it would make a big difference for the things you said rationality-irrealism would make a big difference for.
So I now think we have a big disagreement around point “a” (just how real rationality is), but maybe not so much around “b” (what the consequences are for the various bullet points you listed).
Although in some sense I also endorse the “strawman” that rationality is more like momentum than like fitness (at least some aspects of rationality).
I think that ricraz claims that it’s impossible to create a mathematical theory of rationality or intelligence, and that this is a crux, not so? On the other hand, the “momentum vs. fitness” comparison doesn’t make sense to me.
Well, it’s not entirely clear. First there is the “realism” claim, which might even be taken in contrast to mathematical abstraction; EG, “is IQ real, or is it just a mathematical abstraction”? But then it is clarified with the momentum vs fitness test, which makes it seem like the question is the degree to which accurate mathematical models can be made (where “accurate” means, at least in part, helpfulness in making real predictions).
So the idea seems to be that there’s a spectrum with physics at one extreme end. I’m not quite sure what goes at the other extreme end. Here’s one possibility:
A problem I have is that (almost) everything on the spectrum is real. Tables and chairs are real, despite not coming with precise mathematical models. So (arguably) one could draw two separate axes, “realness” vs “mathematical modelability”. Well, it’s not clear exactly what that second axis should be.
Anyway, to the extent that the question is about how mathematically modelable agency is, I do think it makes more sense to expect “reproductive fitness” levels rather than “momentum” levels.
Hmm, actually, I guess there’s a tricky interpretational issue here, which is what it means to model agency exactly.
On the one hand, I fully believe in Eliezer’s idea of understanding rationality so precisely that you could make it out of pasta and rubber bands (or whatever). IE, at some point we will be able to build agents from the ground up. This could be seen as an entirely precise mathematical model of rationality.
But the important thing is a theoretical understanding of the behavior of rational agents in the abstract, sufficient to predict in broad strokes what an agent would do before building and running it. This is a very different matter.
I can see how Ricraz would read statements of the first type as suggesting very strong claims of the second type. I think models of the second type have to be significantly more approximate, however. EG, you cannot be sure of exactly what a learning system will learn in complex problems.
ETA: I also have a model of you being less convinced by realism about rationality than others in the “MIRI crowd”; in particular, selection vs. control seems decidedly less “realist” than mesa-optimizers (which didn’t have to be “realist”, but was quite “realist” the way it was written, especially in its focus on search).
Just a quick reply to this part for now (but thanks for the extensive comment, I’ll try to get to it at some point).
It makes sense. My recent series on myopia also fits this theme. But I don’t get much* push-back on these things. Some others seem even less realist than I am. I see myself as trying to carefully deconstruct my notions of “agency” into component parts that are less fake. I guess I do feel confused about why other people seem less interested in directly deconstructing agency the way I am. I feel somewhat like others kind of nod along to distinctions like selection vs control but then go back to using a unitary notion of “optimization”. (This applies to people at MIRI and also people outside MIRI.)
*The one person who has given me push-back is Scott.
How critical is it that rationality is as real as electromagnetism, rather than as real as reproductive fitness? I think the latter seems much more plausible, but I also don’t see why the distinction should be so cruxy.
My suspicion is that Rationality Realism would have captured a crux much more closely if the line weren’t “momentum vs reproductive fitness”, but rather, “momentum vs the bystander effect” (ie, physics vs social psychology). Reproductive fitness implies something that’s quite mathematizable, but with relatively “fake” models—e.g., evolutionary models tend to assume perfectly separated generations, perfect mixing for breeding, etc. It would be absurd to model the full details of reality in an evolutionary model, although it’s possible to get closer and closer.
I think that’s more the sort of thing I expect for theories of agency! I am curious why you expect electromagnetism-esque levels of mathematical modeling. Even AIXI has a heavy dependence on the choice of programming language. Any theory of bounded rationality which doesn’t ignore poly-time differences (ie, anything “closer to the ground” than logical induction) has to be hardware-dependent as well.
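(To make the "quite mathematizable, but with relatively fake models" point concrete, here is a minimal Wright-Fisher-style selection sketch, a standard textbook toy model with illustrative numbers; the idealizations flagged in the comments are exactly the kind of thing I mean.)

```python
import random

# Minimal Wright-Fisher-style selection sketch. The math is crisp, but the
# model is "fake" in exactly the sense above: fixed population size,
# perfectly separated generations, perfect mixing, and a single constant
# fitness advantage, none of which hold in real populations.

random.seed(0)

POP_SIZE = 1000
FITNESS_ADVANTAGE = 1.05   # relative fitness of allele A vs. allele B
GENERATIONS = 100

freq_a = 0.10   # initial frequency of allele A

for gen in range(GENERATIONS):
    # selection: reweight A by its relative fitness
    weighted_a = freq_a * FITNESS_ADVANTAGE
    prob_a = weighted_a / (weighted_a + (1 - freq_a))
    # drift: the next (non-overlapping) generation is a binomial sample
    count_a = sum(1 for _ in range(POP_SIZE) if random.random() < prob_a)
    freq_a = count_a / POP_SIZE
    if gen % 25 == 0:
        print(f"generation {gen:3d}: frequency of A = {freq_a:.2f}")
```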
If I didn’t believe the above,
What alternative world are you imagining, though?