Right from the title of his book *23 Things They Don’t Tell You About Capitalism*, it’s clear that Ha-Joon Chang is pitting himself against the establishment.
Yay! Bravery debate!
In retrospect, I think this comment of mine didn’t address Jan’s key point, which is that we often form intuitions/emotions by running a process analogous to aggregating data into a summary statistic and then throwing away the data. Now the evidence we saw is quite incommunicable—we no longer have the evidence ourselves.
Ray Arnold gave me a good example the other day of two people—one an individualist libertarian, the other a communitarian Christian. In the example these two people deeply disagree on how society should be set up, and this is entirely because they’re two identical RL systems built on different training sets (one has repeatedly seen the costs of trying to trust others with your values, and the other has repeatedly seen it work out brilliantly). Their brains have compressed the data into a single emotion, which they feel in (say) groups trying to coordinate. They might be able to introspect enough to communicate the causes of their beliefs, but they might not—they might just be stuck this way (until we reach the glorious transhumanist future, that is). Scott might expect them to say they just have fundamental value differences.
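To make the compression idea concrete, here’s a toy sketch (my own construction; the agents, numbers, and update rule are illustrative assumptions, not anything Jan or Ray specified):

```python
# Two agents run the identical update rule on different histories of
# "trying to trust others with your values", and keep only a running
# average; the raw episodes are thrown away.

class Agent:
    def __init__(self):
        self.trust = 0.5      # the compressed summary statistic
        self.n_episodes = 0   # the episodes themselves are never stored

    def observe(self, outcome):
        """outcome in [0, 1]: how well trusting others went this time."""
        self.n_episodes += 1
        self.trust += (outcome - self.trust) / self.n_episodes

libertarian, communitarian = Agent(), Agent()
for o in [0.1, 0.2, 0.0, 0.3]:   # trust kept being betrayed
    libertarian.observe(o)
for o in [0.9, 0.8, 1.0, 0.7]:   # trust kept paying off
    communitarian.observe(o)

print(libertarian.trust, communitarian.trust)   # ~0.15 vs ~0.85
```

Both agents can now report only a single felt number about (say) groups trying to coordinate; the evidence that produced it is gone, so from the outside the disagreement can look like a brute difference in values.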
I agree that I have not in the OP given a full model of the different parts of the brain, how they do reasoning, and which parts are (or aren’t) in principle communicable or trustworthy. I at least claim that I’ve pointed to a vague mechanism that’s more true than the simple model where everyone just has the outputs of their beliefs. There are important gears that are hard-but-possible to communicate, and they’re generally worth focusing on over and above the credences they output. (Will write more on this in a future post about Aumann’s Agreement Theorem.)
The command to spread observations rather than theories is a valuable one; I often find it results in far less confusion.
That said, I’m not sure it follows from the paper. You say it implies:
Passing along theories can actually make both understanding and performance worse.
This wasn’t what the paper found. Here are the two relevant screenshots (the opening line is key, and the graph also shows no real difference).
Passing along theories didn’t make things worse; the two groups (data vs. data+theory) did equally well.
The first half of this reminded me of the first section from Scott’s Nonfiction Writing Advice (which I think about often when doing my own writing).
The core point:
1. Divide things into small chunks
Nobody likes walls of text. By this point most people know that you should have short, sweet paragraphs with line breaks between them. The shorter, the better. If you’re ever debating whether or not to end the paragraph and add a line break, err on the side of “yes”.
Finishing a paragraph or section gives people a micro-burst of accomplishment and reward. It helps them chunk the basic insight together and remember it for later. You want people to be going – “okay, insight, good, another insight, good, another insight, good” and then eventually you can tie all of the insights together into a high-level insight. Then you can start over, until eventually at the end you tie all of the high-level insights together. It’s nice and structured and easy to work with. If they’re just following a winding stream of thought wherever it’s going, it’ll take a lot more mental work and they’ll get bored and wander off.
Remember that clickbait comes from big media corporations optimizing for easy readability, and that the epitome of clickbait is the listicle. But the insight of the listicle applies even to much more sophisticated intellectual pieces – people are much happier to read a long thing if they can be tricked into thinking it’s a series of small things.
Meta: Eliezer previously made a quick attempt to summarise similar points in this SSC comment 2 years ago.
I’ve curated this post. Here are the reasons that were salient to me:
The post clearly communicated (to me) some subtle intuitions about what useful research looks like when you’re deeply confused about basic principles.
The question of “How would I invent calculus if I didn’t know what calculus was and I was trying to build a plane?”, together with finding some very specific and basic problems that cut to the heart of what we’d be confused about in that situation (like “how to fire a cannonball such that it forever orbits the earth”), points to a particular type of thinking that I can see being applied to understanding intelligence, potentially producing incredibly valuable insights.
I had previously been much more confused about why folks around here have been asking many questions about logical uncertainty and such, and this makes it much clearer to me.
As usual, well-written dialogues like this are very easy and fun to read (which for me trades off massively with the length of the post).
Quick thoughts on further work that could be useful:
The post gives me a category of ‘is missing some fundamental math like calculus’ that applies to rocket alignment and AI alignment. I would be interested in some examples of how to look at a problem and write down a simple story of how it works, including
some positive and negative examples—situations where it works, situations where it doesn’t
what steps in particular seem to fail on the latter
what signs suggest that the math should exist (e.g. some things are just complicated, with few simple models predicting them—I expect a bunch of microbiology looks more like this than, say, physics).
I would also be interested to read historical case studies of this thing happening and what work led to progress here—Newton, Shannon, others.
For a datapoint on general ignorance in this domain, I have never noticed the effects of caffeine on my energy levels, even though I regularly drink energy drinks because my housemates buy lots of them. (I don’t think I’m unusually bad at introspection either.)
I mean, analogies don’t have to be similar in all respects to be useful explanations, just in the few respects that you’re using the analogy for. OP isn’t arguing that AI alignment is important because rocket alignment is important; it’s only using the analogy to describe the type of work that it thinks needs to be done to align AGI—which I’m guessing has been difficult to describe before writing this post. Arguments that AGI needs to be built right the first time have been discussed elsewhere, and you’re right that this post doesn’t make that argument.
(On this side-topic of whether AGI needs to be built precisely right the first time, and counter to your point that we-always-get-stuff-wrong-a-bunch-at-first-and-that’s-fine: I liked Max Tegmark’s story of how we’re building technologies that have increasingly little affordance for error—fire, nukes, AGI. For some of these, a few mistakes did small damage, then big damage, and in principle we may hit tech where initial mistakes are existential in nature. I think there are some sane arguments that make AGI seem like a plausible instance of this.
For discussion of the AI details I’d point elsewhere, to things like Gwern on “Why Tool AIs Want to be Agent AIs”, Paul Christiano discussing arguments for fast-takeoff speeds, the paper Intelligence Explosion Microeconomics, and of course Bostrom’s book.)
(edited a few lines to be clearer/shorter)
Woop woop! Private messaging is *great*. I can let people know about spelling errors, take a potentially confrontational conversation private, or just give people information that I think they might want without clogging up the comments.
I’ve already started using this loads.
Thanks for all of your work, especially adding the AI Alignment Forum view.
Lol, seems fine ;)
I hope we all learned a valuable lesson here today.
Actually, the emphasis is a little off.
The point isn’t that anyone sane would push the button. It’s that we as a civilisation are just going around building buttons (cf. nukes, AGI, etc.), and so it’s good practice to put ourselves in the situation where any unilateralist can destroy something we all truly value. When I said the above, I was justifying why it was useful to have a ritual around Petrov Day, not why you would press the button. I can’t think of any good reason to press the button, and would be angry at anyone who did—they’re just decreasing trust and increasing fear of unilateralists. We should still have a ceremony where we all practice the art of sitting together and not pressing the button.
Firstly, I hadn’t heard the joke before, and it made me chuckle to myself.
Secondly, I loved this comment, for very accurately conveying the perspective I felt like ricraz was trying to defend wrt realism about rationality.
Let me say two (more) things in response:
Firstly, I was taking the example directly from Eliezer:
I said, “So if I make an Artificial Intelligence that, without being deliberately preprogrammed with any sort of script, starts talking about an emotional life that sounds like ours, that means your religion is wrong.”
He said, “Well, um, I guess we may have to agree to disagree on this.”
I said: “No, we can’t, actually. There’s a theorem of rationality called Aumann’s Agreement Theorem which shows that no two rationalists can agree to disagree. If two people disagree with each other, at least one of them must be doing something wrong.”
(Sidenote: I have not yet become sufficiently un-confused about AAT to have a definite opinion about whether EY was using it correctly there. I do expect after further reflection to object to most rationalist uses of the AAT but not this particular one.)
Secondly, and this is where I think the crux of the matter lies: I believe your (quite understandable!) objection applies to most attempts to use Bayesian reasoning in the real world.
Suppose one person is trying to ignore a small piece of evidence against a cherished position, and a second person says to them: “I know you’ve ignored this piece of evidence, but you can’t do that, because it is Bayesian evidence—it is the case that you’re more likely to see this occur in worlds where your belief is false than in worlds where it’s true, so the correct epistemic move here is to slightly update against your current belief.”
If I may clumsily attempt to wrangle your example to my own ends, might they not then say:
“I mean… just what, exactly, did you think I meant, when I said this wasn’t any evidence at all? Did you take me to be claiming that (a) I am an ideal Bayesian reasoner, and (b) I have observed evidence that occurs in more worlds where my belief is false than in worlds where it is true, but (c) my posterior probability, after learning this, should still equal my prior probability? Is that what you thought I was saying? Really? But why? Why in the world did you interpret my words in such a bizarrely technical way? What would you say is your estimate that I actually meant to make that specific, precisely technical statement?”
I am not a rational agent. I am a human, and my mind does not satisfy the axioms of probability theory; therefore it is nonsensical to attempt to have me conform my speech patterns and actions to these logical formalisms.
Bayes’ theorem applies if your beliefs update according to very strict axioms, but it’s not at all obvious to me that the weird fleshy thing in my head currently conforms to those axioms. Should I nonetheless try to? And if so, why shouldn’t I also try to conform to AAT?
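For concreteness, here’s what that conformity would ask for in the earlier example, with numbers I’ve made up purely for illustration: say my prior in the cherished belief is 0.9, and the ignored observation is twice as likely if the belief is false (0.2 vs. 0.1). Then:

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)} = \frac{0.1 \times 0.9}{0.1 \times 0.9 + 0.2 \times 0.1} \approx 0.82$$

So “slightly update against your current belief” cashes out as moving from 0.9 to roughly 0.82; whether the fleshy thing can or should be held to that is exactly the question.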
Aumann’s Agreement Theorem is true if we are rational (Bayesian) agents. There are a large number of other theorems that apply to rational agents too, and it seems that sometimes people want to use these abstract formalisms to guide behaviour and sometimes not; having a principled stance about when and when not to use them seems useful and important.
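(For reference, my understanding of the theorem’s hypotheses, which are strong: the two agents share a common prior, each conditions on their own private information, and their posteriors for the event in question are common knowledge. Formally:

$$q_1 = P(A \mid \mathcal{I}_1), \quad q_2 = P(A \mid \mathcal{I}_2), \quad \text{common prior } P \text{ and common knowledge of } q_1, q_2 \;\Longrightarrow\; q_1 = q_2$$

None of those hypotheses obviously hold between two humans in a comment thread, which is part of why the question of when to lean on the formalism bites.)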
I think I want to split up ricraz’s examples in the post into two subclasses, defined by two questions.
The first asks: given that there are many different AGI architectures one could scale up into, are some better than others? (My intuition is both that some are better than others, and also that many sit on the Pareto frontier.) And are there any simple ways to determine why one is better than another? This question leads to the following examples from the OP:
There is a simple yet powerful theoretical framework which describes human intelligence and/or intelligence in general; there is an “ideal” decision theory; the idea that AGI will very likely be an “agent”; the idea that Turing machines and Kolmogorov complexity are foundational for epistemology; the idea that morality is quite like mathematics, in that there are certain types of moral reasoning that are just correct.
The second asks: suppose that some architectures are better than others, and suppose there are some simple explanations of why. How practical is it to talk about me in this way today? Here are some concrete examples of things I might do:
Given certain evidence for a proposition, there’s an “objective” level of subjective credence which you should assign to it, even under computational constraints; the idea that Aumann’s agreement theorem is relevant to humans; the idea that defining coherent extrapolated volition in terms of an idealised process of reflection roughly makes sense, and that it converges in a way which doesn’t depend very much on morally arbitrary factors; the idea that having contradictory preferences or beliefs is really bad, even when there’s no clear way that they’ll lead to bad consequences (and you’re very good at avoiding dutch books and money pumps and so on).
If I am to point to two examples that feel very concrete to me, I might ask:
Is the reasoning that Harry is doing in Chapter 86: Multiple Hypothesis Testing useful or totally insane?
When one person says “I guess we’ll have to agree to disagree” and the second person says “Actually, according to Aumann’s Agreement Theorem, we can’t”, is the second person making a type error?
The first person is likely mistaken if they’re saying “In principle no exchange of evidence could cause us to agree”, but perhaps the second person is also mistaken, in implying that it makes any sense to model their disagreement in terms of idealised, scaled-up, rational agents rather than the weird bag of meat and neuroscience that we actually are—for which Aumann’s Agreement Theorem certainly has not been proven.
To be clear: the two classes of examples come from roughly the same generator, and advances in our understanding of one can lead to advances in the other. I just often draw from fairly different reference classes of evidence for updating on them (examples: For the former, Jaynes, Shannon, Feynman. For the latter, Kahneman & Tversky and Tooby & Cosmides).
Seems like some other people want to bring back the Sabbath.
Yup, see Ozy’s post Against Steelmanning and Eliezer’s fb post agreeing that starts “Be it clear: Steelmanning is not a tool of understanding and communication.”
Also came here to recommend the practice of firewalling the optimal from the rational ;)
(And thanks for the post—it turns out I was using a microphone sub-optimally earlier this week!)
The normal explanation for these things is simply that people are busy, and good communication takes a lot of time. Of course, with this being the internet, it’s also the case that responses get easily misinterpreted, which means you have to put in more time, further disincentivising response.
Added: Note that LW isn’t a project that any of the CFAR team work on, so they wouldn’t naturally be checking LW or trying to use the platform to talk about their research, in case you were expecting them to be actively interested in discussion here. They’ve got jobs, and public discussion mostly isn’t part of them right now, I think.