I like PredictionBook.com for this sort of thing. You make predictions with confidences and then set a date on which they’ll be ready to be judged. By default, predictions you make are public, but you can also easily make private predictions.
Really enjoyed the post, thanks!
I started the Earley book and it’s definitely a struggle. I can usually handle “soft skills” books like this one without getting frustrated by their vague, hand-wavy models (I really enjoyed Gendlin’s Focusing, for example), but this one’s been especially hard. That said, having your model in mind while I read has kept me going, as I’m using it as a sort of Rosetta stone for some of Earley’s claims.
You should promote this to a full answer rather than a comment! It more than qualifies.
Regarding 1, I suspect a lot of recent progress in neuroscience has come from applying computational and physics-style approaches to existing problems. See, for example, the success Ed Boyden’s lab has had applying physics thinking to building better neuroscience tools: optogenetics, expansion microscopy, and most recently implosion fabrication.
I think nanotechnology is a prime example of 2. AIUI, a lot of the component technologies for at least trying to build nano-assemblers exist, but we lack the technology/institutions/incentives/knowledge to engineer them into coherent products and tools.
FYI: I’ve updated the post to focus solely on the “what’s the bottleneck to do X” question and not on safety, as I think the former question is less discussed on LW and what I wanted answers to focus on.
FYI: I’ve updated the post to not talk about alignment at all, since I think focusing only on bottlenecks to progress in terms of capabilities makes the post clearer. Thanks to ChristianKI for pointing this out.
John_Maxwell_IV, would love feedback on how you feel about the edited version.
Can you be more specific? If you help me understand how (or whether) I’m misrepresenting his view, I’d be happy to change it. My sense is that Paul’s view is more like, “through working towards prosaic alignment, we’ll get a better understanding of whether there are insurmountable obstacles to the alignment of scaled-up (and likely better) models.” I can rephrase it to something like that, or to something more nuanced. I’m just wary of adding too much alignment-specific discussion, as I don’t want the debate to be too focused on the object-level alignment question.
It’s also worth noting that there are other researchers who hold similar views, so I’m not just talking about Paul’s.
Thanks a lot! This definitely clears things up and also highlights the difference between recursive reward modeling and typical amplification/the expert imitation approach you mentioned.
I agree with many of the existing answers, in particular Kaj’s, but wanted to point out another factor which, in my own experience, contributes to not publishing ideas despite having many half-baked ones.
I think that even for people who have a lot of ideas (where having an idea means it appears, or is produced, within your conscious awareness), actually formalizing and publishing those ideas requires clearing multiple hurdles.
In this blog post about researcher productivity, the author summarizes a paper by William Shockley, co-inventor of the transistor, that posits, and tries to explain, why researcher productivity levels fit a log-normal distribution. I quote:
Shockley suggests that producing a paper is tantamount to clearing every one of a sequence of hurdles. He specifically lists:
1. ability to think of a good problem
2. ability to work on it
3. ability to recognize a worthwhile result
4. ability to make a decision as to when to stop and write up the results
5. ability to write adequately
6. ability to profit constructively from criticism
7. determination to submit the paper to a journal
8. persistence in making changes (if necessary as a result of journal action).
Shockley then posits: what if the odds of a person clearing hurdle #i from the list of 8 above are pᵢ? Then that individual’s rate of publishing papers should be proportional to p₁p₂p₃⋯p₈. This gives the multiplication of random variables needed to explain the log-normal distribution of productivity. (Shockley goes on to note that if one person is 50% above average in each of the 8 areas, then they will be 2460% more productive than average at the total process, since 1.5⁸ ≈ 25.6.)
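To see why multiplying independent random factors gives a log-normal, here’s a quick simulation sketch (my own toy illustration; the uniform range for the per-hurdle odds is an arbitrary assumption, not from Shockley’s paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n_researchers, n_hurdles = 100_000, 8

# Each researcher's odds of clearing each of the 8 hurdles,
# drawn independently (the uniform range is an arbitrary choice).
p = rng.uniform(0.1, 0.9, size=(n_researchers, n_hurdles))

# Publication rate is proportional to the product p1*p2*...*p8.
rate = p.prod(axis=1)

def skew(x):
    z = (x - x.mean()) / x.std()
    return (z ** 3).mean()

# log(rate) is a sum of 8 i.i.d. terms, so by the CLT it's near-normal;
# rate itself is therefore heavily right-skewed, i.e. near log-normal.
print(f"skewness of rate:      {skew(rate):+.2f}")          # strongly positive
print(f"skewness of log(rate): {skew(np.log(rate)):+.2f}")  # near zero
```

The near-zero skew of log(rate) next to the large skew of rate itself is the log-normal signature Shockley is pointing at.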
In my own experience, the pipeline from idea to published piece of writing is similar. In order to go from idea to post, I have to:
have a good idea;
write it down;
block out time to expand upon it;
(in some cases) find data that supports it;
survey literature to see if someone’s had it or disproven it before;
(in some cases) write a program or do some math to flesh it out; and
write something coherent explaining it.
Reinforcement/rewards help individuals summon the extrinsic or intrinsic motivation to persist through these phases. That said, I also think it makes sense for individuals to figure out in which of these phases they typically fail.
In my own life, I’ve recently been experimenting with lowering my own expectations for my data-gathering, literature survey, and editing phases in order to get more of my ideas down in writing. My recent “Babble, Learning, and the Typical Mind Fallacy” is an example of my attempts at this. Given its low popularity on LessWrong, I may have bulldozed my way through hurdles I should’ve still jumped, but it’s better than nothing.
I’d be very curious to hear more about your general dislike of predictive processing if you’d be willing to share. In particular, I’m curious whether it’s a dislike of predictive processing as an algorithmic model for things like perception, or of predictive processing/the free energy principle as a theory of everything for “what humans are doing”.
Was anyone else unconvinced/confused (I was charitably confused, uncharitably unconvinced) by the analogy between recursive task/agent decomposition and first-order logic in section 3 under the heading “Analogy to Complexity Theory”? I suspect I’m missing something but I don’t see how recursive decomposition is analogous to **alternating** quantifiers?
It’s obvious that, at the first level, finding an x that satisfies ϕ(x) is similar to finding the right action, but I don’t see how A₂’s solving one of A₁’s decomposed tasks is similar to universal quantification, i.e. to the ∀y in finding x and y that satisfy ∃x∀y ϕ(x,y).
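For reference, by **alternating** quantifiers I mean the standard polynomial-hierarchy pattern (my paraphrase, not necessarily the paper’s exact statement):

∃x₁ ∀x₂ ∃x₃ ⋯ Qₖxₖ ϕ(x₁, x₂, …, xₖ),

where Qₖ is ∃ for odd k and ∀ for even k, and each additional alternation is supposed to correspond to one more level of agent recursion.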
To take a very basic example: if I ask an agent to solve a simple problem like “what is 1+2+3+4?” and the first agent decomposes it into “what is 1+2?”, “what is 3+4?”, and “what is the result of ‘1+2’ plus the result of ‘3+4’?” (this assumes we have some mechanism for pointing and specifying dependencies, like the one Ought’s working on), what would this look like in the alternating-quantifier formulation?
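For concreteness, here’s a minimal Python sketch of the kind of decomposition I have in mind (purely my own toy illustration; the `combine` step is a stand-in for the pointing/dependency mechanism, not Ought’s actual design):

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Decomposition:
    subquestions: List[str]               # e.g. ["1+2", "3+4"]
    combine: Callable[[List[int]], int]   # stand-in for the dependency step

def decompose(question: str) -> Optional[Decomposition]:
    """Toy decomposer: split a sum with more than two terms in half."""
    terms = question.split("+")
    if len(terms) <= 2:
        return None  # simple enough for an agent to answer directly
    mid = len(terms) // 2
    return Decomposition(
        subquestions=["+".join(terms[:mid]), "+".join(terms[mid:])],
        combine=lambda answers: sum(answers),  # "result of '1+2' plus result of '3+4'"
    )

def solve(question: str) -> int:
    """Recursively decompose and solve, like A1 handing subtasks to copies of A2."""
    d = decompose(question)
    if d is None:
        return sum(int(t) for t in question.split("+"))  # base case
    return d.combine([solve(q) for q in d.subquestions])

print(solve("1+2+3+4"))  # -> 10
```

My question is essentially: which quantifier, if any, do the recursive `solve` calls here correspond to?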
Is there an actual description of turbocharging training beyond “deliberate practice but where you think hard about not Goodhart-ing and practicing the wrong thing”?
Also interested (I’m a programmer with very limited knowledge of RL).
I agree that this is a larger and more complex puzzle, which is in fact why I avoided addressing it in my post. That said, I want to address some of your points, with which I mostly agree, individually.
Regarding 1), this is true but, all else equal, you’d think that employers would still have incentives to find cheaper or quicker work-arounds for extracting the same signal (see my discussion of Google hiring directly out of high school in the post). I suspect they don’t because of the other reasons you mention in your comment.
I agree that 3) and 4) play large roles that are hard to model in economics studies. While I haven’t read it, I wonder if Tyler Cowen’s “The Complacent Class” discusses this.
This is a good point, but I’m not sure how to productively integrate it into my discussion of conformity yet. The higher-level point it suggests is that conformity exists along many, only partially interacting, axes (in this example, the type of conformity that being an English major shows is different, but not *entirely* different, from the type that being an Economics major shows).
Unfortunately, going from a one-dimensional model (less to more conforming) to a multi-dimensional one, where you can conform along many axes, makes this even harder to discuss and investigate. To make progress, I suspect I’d need to narrow my investigation to a single industry within which conformity mostly varies along one dimension.
Do you think education makes college graduates into standardized employees, though? In my experience, watching other standardized employees had more influence on my becoming a standardized employee than going through college did.
I’d also argue that what you described is at least partly signaling. The group of “10 unique autodidacts” will, on average, be less conformist than the group of 10 recent college graduates even before either group undergoes the experiences that define them in this hypothetical. The fact that they decided to become autodidacts is a signal of that pre-existing tendency, admittedly reinforced by whatever experiences they had while not following the standard track.
To now argue against myself: I agree with the underlying point that I failed to discuss why hiring often does (and should) optimize for avoiding mistakes rather than for finding optimal individuals.