I am Issa Rice. https://issarice.com/
Ok I see, thanks for explaining. I think what’s confusing to me is that Eliezer did stop talking about the deep math of intelligence sometime after 2011 and then started talking about big blobs of matrices as you say starting around 2016, but as far as I know he has never gone back to his older AI takeoff writings and been like “actually I don’t believe this stuff anymore; I think hard takeoff is actually more likely to be due to EMH failure and natural lag between projects”. (He has done similar things for his older writings that he no longer thinks are true, so I would have expected him to do the same for takeoff stuff if his beliefs had indeed changed.) So I’ve been under the impression that Eliezer actually believes his old writings are still correct, and that somehow his recent remarks and old writings are all consistent. He also hasn’t (as far as I know) written up a more complete sketch of how he thinks takeoff is likely to go given what we now know about ML. So when I see him saying things like what’s quoted in Rob’s OP, I feel like he is referring to the pre-2012 “deep math” takeoff argument. (I also don’t remember if Bostrom gave any sketch of how he expects hard takeoff to go in Superintelligence; I couldn’t find one after spending a bit of time.)
If you have any links/quotes related to the above, I would love to know!
(By the way, I was a lurker on LessWrong starting back in 2010-2011, but was only vaguely familiar with AI risk stuff back then. It was only around the publication of Superintelligence that I started following along more closely, and only much later in 2017 that I started putting in significant amounts of my time into AI safety and making it my overwhelming priority. I did write several timelines though, and recently did a pretty thorough reading of AI takeoff arguments for a modeling project, so that is mostly where my knowledge of the older arguments comes from.)
Thanks! My understanding of the Bostrom+Yudkowsky takeoff argument goes like this: at some point, some AI team will discover the final piece of deep math needed to create an AGI; they will then combine this final piece with all of the other existing insights and build an AGI, which will quickly gain in capability and take over the world. (You can search “a brain in a box in a basement” on this page or see here for some more quotes.)
In contrast, the scenario you imagine seems to be more like (I’m not very confident I am getting all of this right): there isn’t some piece of deep math needed in the final step. Instead, we already have the tools (mathematical, computational, data, etc.) needed to build an AGI, but nobody has decided to just go for it. When one project finally decides to go for an AGI, this EMH failure allows them to maintain enough of a lead to do crazy stuff (conquistadors, persuasion tools, etc.), and this leads to DSA. Or maybe the EMH failure isn’t even required, just enough of a clock time lead to be able to do the crazy stuff.
If the above is right, then it does seem quite different from Paul+Katja, but also different from Bostrom+Yudkowsky, since the reason why the outcome is unipolar is different. Whereas Bostrom+Yudkowsky say the reason one project is ahead is because there is some hard step at the end, you instead say it’s because of some combination of EMH failure and natural lag between projects.
Which of the “Reasons to expect fast takeoff” from Paul’s post do you find convincing, and what is your argument against what Paul says there? Or do you have some other reasons for expecting a hard takeoff?
I’ve seen this post of yours, but as far as I know, you haven’t said much about hard vs soft takeoff in general.
(I have only given this a little thought, so wouldn’t be surprised if it is totally wrong. I’m curious to hear what people think.)
I’ve known about deductive vs inductive reasoning for a long time, but only recently heard about abductive reasoning. It now occurs to me that what we call “Solomonoff induction” might better be called “Solomonoff abduction”. From SEP:
It suggests that the best way to distinguish between induction and abduction is this: both are ampliative, meaning that the conclusion goes beyond what is (logically) contained in the premises (which is why they are non-necessary inferences), but in abduction there is an implicit or explicit appeal to explanatory considerations, whereas in induction there is not; in induction, there is only an appeal to observed frequencies or statistics.
In Solomonoff induction, we explicitly refer to the “world programs” that provide explanations for the sequence of bits that we observe, so according to the above criterion it fits under abduction rather than induction.
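To make the "explanatory" flavor concrete, here is a toy sketch. It is not real Solomonoff induction (which ranges over all programs for a universal machine and is uncomputable). As an illustrative assumption, a "world program" here is just a short bit pattern that is repeated forever, and its prior weight is 2^-length. The function names and the `max_len` cutoff are hypothetical choices made for this example.

```python
from itertools import product


def generate(pattern, n):
    """Output of the toy 'program': repeat `pattern` forever, cut to n bits."""
    return (pattern * (n // len(pattern) + 1))[:n]


def posterior(observed, max_len=4):
    """Posterior over toy programs that reproduce `observed` exactly.

    Prior weight of a pattern is 2^-len(pattern) (a simplicity prior);
    programs that fail to reproduce the observations get weight zero.
    """
    weights = {}
    for length in range(1, max_len + 1):
        for bits in product("01", repeat=length):
            pattern = "".join(bits)
            if generate(pattern, len(observed)) == observed:
                weights[pattern] = 2.0 ** -length
    total = sum(weights.values())
    return {p: w / total for p, w in weights.items()}


def predict_next_one(observed, max_len=4):
    """Posterior probability that the next bit is '1'."""
    n = len(observed)
    post = posterior(observed, max_len)
    return sum(w for p, w in post.items() if generate(p, n + 1)[-1] == "1")


if __name__ == "__main__":
    print(posterior("0101"))
    print(predict_next_one("011"))
```

The prediction is never computed from raw frequencies of 0s and 1s in the observed string; it only goes through the surviving candidate explanations (the patterns) and their weights, which is the feature the SEP criterion picks out as abductive.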
What alternatives to “split-and-linearly-aggregate” do you have in mind? Or are you just identifying this step as problematic without having any concrete alternative in mind?
There is a map on the community page. (You might need to change something in your user settings to be able to see it.)
I’m curious why you decided to make an entirely new platform (Thought Saver) rather than using Andy’s Orbit platform.
Messaging sounds good to start with (I find calls exhausting so only want to do it when I feel it adds a lot of value).
Ah ok cool. I’ve been doing something similar for the past few years and this post is somewhat similar to the approach I’ve been using for reviewing math, so I was curious how it was working out for you.
Have you actually tried this approach, and if so for how long and how has it worked?
So there’s a need for an intermediate stage between creating an extract and creating a flashcard. This need is what progressive highlighting seeks to address.
I haven’t actually done incremental reading in SuperMemo so I’m not sure about this, but I believe extract processing is meant to be recursive: first you extract a larger portion of the text that seems relevant, then when you encounter it again the extract is itself treated like an original article, so you might extract just a single sentence, then when you encounter that sentence again you might make a cloze deletion or Q&A card.
This sounds a lot like (a subset of) incremental reading. Instead of highlighting, one creates “extracts” and reviews those extracts over time to see if any of them can be turned into flashcards. As you suggest, there is no pressure to immediately turn things into flashcards on a first pass of the reading material. These two articles about incremental reading emphasize this point. A quote from the first of these:
Initially, you make extracts because “Well it seems important”. Yet to what degree (the number of clozes/Q&As) and in what formats (cloze/Q&A/both) are mostly fuzzy at this point. You can’t decide wisely on what to do with an extract because you lack the clarity and relevant information to determine it. In other words, you don’t know the extract (or in general, the whole article) well enough to know what to do with it.
In this case, if you immediately process an extract, you’ll tend to make mistakes. For example, for an extract, you should have dismissed it but you made two clozed items instead; you may have dismissed it when it’s actually very important to you, unbeknown to you at that moment. With lowered quality of metamemory judgments, skewed by all the cognitive biases, the resulting clozed/Q&A item(s) is just far from optimal.
Does life extension (without other technological progress to make the world in general safer) lead to more cautious lifestyles? The longer the expected years left, the more value there is in just staying alive compared to taking risks. Since death would mean missing out on all the positive experiences for the rest of one’s life, I think an expected value calculation would show that even a small risk is not worth taking. Does this mean all risks that don’t get magically fixed due to life extension (for example, activities like riding a motorcycle or driving on the highway seem risky even if we have life extension technology) are not worth taking? (There is the obvious exception where if one knows when one is going to die, then one can take more risks just like in a pre-life-extension world as one reaches the end of one’s life.)
I haven’t thought about this much, and wouldn’t be surprised if I am making a silly error (in which case, I would appreciate having it pointed out to me!).
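Here is a rough sketch of the expected value calculation mentioned above. Everything in it is a simplifying assumption for illustration: each remaining year is worth a constant amount, there is no discounting, the agent is risk-neutral, and the activity's benefit is measured in the same units as a year of life. The function names and the example numbers are made up.

```python
def net_expected_value(benefit, p_death, remaining_years, value_per_year=1.0):
    """Expected value of taking a risky activity minus the value of skipping it.

    Taking it: with probability (1 - p_death) you get the benefit plus your
    remaining future; with probability p_death you get nothing.
    Skipping it: you keep your remaining future for certain.
    """
    future = remaining_years * value_per_year
    take = (1 - p_death) * (benefit + future)
    skip = future
    return take - skip


def break_even_risk(benefit, remaining_years, value_per_year=1.0):
    """Death probability at which taking and skipping have equal expected value."""
    future = remaining_years * value_per_year
    return benefit / (benefit + future)


if __name__ == "__main__":
    # Same activity and same risk, evaluated with 40 vs 1000 expected years left.
    for years in (40, 1000):
        print(years, net_expected_value(0.01, 1e-4, years), break_even_risk(0.01, years))
```

Under these assumptions, the same activity with the same risk can be worth taking with 40 expected years left and not worth taking with 1000. The acceptable risk shrinks roughly in proportion to 1/remaining_years, so it gets very small but does not reach zero.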
I like this tag! I think the current version of the page is missing the insight that influence gained via asymmetric weapons/institutions is restricted/inflexible, i.e. an asymmetric weapon not only helps out only the “good guys” but also constrains the “good guys” into only being able to do “good things”. See this comment by Carl Shulman. (I might eventually come back to edit this in, but I don’t have the time right now.)
The EA Forum wiki has stubs for a bunch of people, including a somewhat detailed article on Carl Shulman. I wonder if you feel similarly unexcited about the articles there (if so, it seems good to discuss this with people working on the EA wiki as well), or if you have different policies for the two wikis.
I also just encountered Flashcards for your soul.
Ah ok, that makes sense. Thanks for clarifying!
It seems to already be on LW.
Edit: oops, looks like the essay was posted on LW in response to this comment.
I’m unable to apply this tag to posts (this tag doesn’t show up when I search to add a tag).
For people who find this post in the future, Abram discussed several of the points in the bullet-point list above in Probability vs Likelihood.