Sure, I’ll try to find one later today.
Another standard example here is ‘nudge’. As we all know, a nudge is when, for instance, an organ-donor form requires you to check a box to opt out rather than to opt in. Lots of little nudges build up to an environment where the path of least resistance takes you a certain way (hopefully a pro-social way).
Yet I repeatedly hear the guys who wrote that book mention how the opt-in / opt-out thing isn’t an example of a nudge.
Now, I have no idea what they intended, but I sure know I want a name for the way you shape the environment to make one action easier than the alternatives, and so I’m using ‘nudge’ for that.
Hm, good point. Will come back later and see if I can rewrite into a better question.
I think the answer to the first question is that, as with every other (important) industry, the people in that industry will have the time and skill to notice the problems and start working on them. The FOOM argument says that a small group will form a singleton quickly, and so we need to do something special to ensure it goes well, and the non-FOOM argument is that AI is an industry like most others, and like most others it will not take over the world in a matter of months.
+1 to being interested in reading this :)
Yeah, I was not saying the posts invented the terms, I was saying they were responsible for my usage of them. I remember at the time reading the post Goodhart Taxonomy and not thinking it was very useful, but then repeatedly referring back to it a great deal in my conversations. I also ended up writing a post based on the four subtypes.
Added: Local Validity and Free Energy are two other examples that obviously weren’t coined here, but which the discussion here caused me to use quite a lot.
Actually, in my head I was more counting the tail conversations (e.g. where I use a term 20-30 times), but you’re right that the regular conversations will account for most of the area under the curve. Slack, Goodharting, and Common Knowledge are all ones I use quite frequently.
Interesting analogy here.
Have any posts from LW 2.0 generated new conceptual handles for the community like “the sanity waterline”?
As a datapoint, here’s a few I’ve used a bunch of times in real life due to discussing them on LW (2.0). I’ve used most of these more than 20 times, and a few of them more like 2000 times.
Embedded Agency, Demon Threads, Slack, Combat vs Nurture Culture, Rationality Realism, Local Validity, Common Knowledge, Free Energy, Out to Get You, Fire Alarm, Robustness to Scale, Unrolling Social Metacognition, The Steering Problem, Goodhart’s Law.
If I put on my startup hat, I hear this proposal as “Have you considered scaling your product by 10x?” A startup is essentially a product that can (and does) scale by multiple orders of magnitude to be useful to massive numbers of people, producing significant value for its consumers, and if you share attributes of a startup, it’s a good question to ask yourself.
That said, many startups scale before their product is ready. I have had people boast to me about how much funding they’ve gotten for their startup, without giving me a story for how they can actually turn that funding into people using their product. Remember that time Medium fired a third of its staff? There are many stories of startups getting massive amounts of funding and then crashing. So you don’t want to scale prematurely.
To pick something very concrete, one question you could ask is: “If I told you that LW had gotten 10x the comments this quarter, would you update that we’d made 10x, or even 3x, progress on the art of human rationality and/or AI alignment (relative to the progress we made on LW the quarter before)?” I think that isn’t implausible, but it’s not obvious, and there are other things to focus on. To give a very concrete example that’s closer to work we’ve done lately: if you heard that “LessWrong had gotten 10x answers to questions of >50 karma this quarter”, I’d be marginally more confident that core intellectual progress had been made, but even that metric is obviously very Goodhart-able.
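To make that second metric concrete, here is a minimal sketch of what it actually counts; the data and field names are hypothetical stand-ins for whatever the real LW database stores, and the comments note why the number is so easy to game:

```python
# Hypothetical stand-in for a quarter's worth of posts pulled from
# the LW database: (title, karma, is_answer_to_question) triples.
posts = [
    ("Answer: What did the Industrial Revolution change?", 72, True),
    ("Answer: Is FOOM plausible?", 14, True),
    ("Essay on slack", 90, False),
]

# The proposed metric: answers to questions with >50 karma this quarter.
# It is easy to compute, and just as easy to Goodhart: vote trading or
# splitting one good answer into several moves the number without moving
# the intellectual progress it is meant to track.
high_karma_answers = sum(
    1 for _, karma, is_answer in posts if is_answer and karma > 50
)
print(high_karma_answers)  # -> 1
```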
A second and related reason to be skeptical of focusing on moving comments from 19 to 179 at the current stage (especially if I put on my ‘community manager hat’) is a worry about wasting people’s time. In general, LessWrong is a website where we don’t want many core members of the community to be using it 10 hours per day. Becoming addictive and causing all researchers to be on it all day could easily be a net-negative contribution to the world. While none of your recommendations were about addictiveness, there are related ways of increasing the number of comments, such as showing a user’s karma score on every page, as LW 1.0 did.
Anyway, those are some arguments against. Overall I feel like we’re in the ‘figure out the initial idea and product’ stage rather than the ‘execute’ stage, and that is where my thoughts are presently spent. I’m interested in more things like creating basic intro texts in AI alignment, creating new ways of identifying which ideas the site needs, and generally focusing on the end of the pipeline of intellectual progress right now, before focusing on getting more people to spend their time on the site. I do think I’d quickly change my mind if net engagement with the site were decreasing, but my current sense is that it is slowly increasing.
Just a short note to say that CEA’s “EA Grants” programme is funded in part by OpenPhil.
The criticism is expecting counter-criticism.
I might slightly alter it to one of:
The critique-author commits to writing a response post 2-4 weeks later responding to the comments, or alternatively a response post 1-2 months later responding to all posts on LW with >20 karma that critique the initial post.
The summary is great, thanks a lot!
Related: Gwern wrote a post arguing that people have an incentive to build a goal-directed AI over a non-goal directed AI. See the references here.
Probabilistic forecasting (for evaluative thinking) and Fermi estimates (for generative thinking).
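For concreteness, here is a minimal sketch of the kind of Fermi estimate meant here, using the classic piano-tuners question; every number is a made-up order-of-magnitude guess, since the point is the shape of the exercise rather than the inputs:

```python
# Fermi estimate: how many piano tuners are there in Chicago?
# Every factor below is a rough, made-up assumption; multiplying a few
# order-of-magnitude guesses usually lands within a factor of ~10.
population = 3_000_000          # people in Chicago (rough)
people_per_household = 2        # average household size
piano_fraction = 1 / 20         # fraction of households owning a piano
tunings_per_piano_per_year = 1  # how often a piano gets tuned
tunings_per_tuner_per_year = 2 * 5 * 50  # 2/day, 5 days/week, 50 weeks/year

pianos = population / people_per_household * piano_fraction
tuners = pianos * tunings_per_piano_per_year / tunings_per_tuner_per_year
print(f"~{tuners:.0f} piano tuners")  # -> ~150 piano tuners
```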
I’m definitely not quite sure what the epistemic state of the paper is, or even its goal. Bostrom, Dafoe and Flynn keep mentioning that the paper is not a complete list of desiderata, but I don’t know what portion of the key desiderata they think they’ve hit, or why they think it’s worthwhile at this stage to pre-emptively list the desiderata that currently seem important.
(Added: My top hypothesis is that Bostrom was starting a policy group with Dafoe as its head, and thought to himself “What are the actual policy implications of the work in my book?” and then wrote them down, without expecting it to be complete, just an obvious starting point.)
As to my thoughts on whether the recommendations in the paper seem good… to be honest, it all felt so reasonable and simple (added: this is a good thing). There were no big leaps of inference. It didn’t feel surprising to me. But here are a few updates/reflections.
I have previously run the thought experiment “What would I do if I were at the start, or just before the start, of the industrial revolution?” Thoughts pertaining to massive turbulence, redistribution and concentration, and adaptability seemed like natural focal concerns to me, but I had not made them as precise or as clear as the paper has. Then again, I’d been thinking more about what I as an individual should do, not how a government or larger organisation should approach the problem. I definitely hadn’t thought about population dynamics in that context (which were also a big deal after the industrial revolution—places like England scaled by an order of magnitude, requiring major infrastructural changes in politics, education, industry, and elsewhere).
I think that the technical details of AI are most important in the sections on Efficiency and Population. The sections on Allocation and Process I would expect to apply to any technological revolution (industrial, agricultural, etc).
I’m not sure that this is consistent with his actions, but I think it’s likely that Ben-from-yesterday would’ve said the words “In order to make sensible progress on AI policy you require a detailed understanding of the new technology”. I realise now that, while such understanding is indeed required to get the overall picture right, there is progress to be made that merely takes heed of this being a technological revolution of historic proportions, and for which it does not matter too much which particular technological revolution we’re going through.
I’ve seen another discussion here, along with the Vulnerable World Hypothesis paper (LW discussion here), of the need for the ability to execute a massive increase in coordination. I’m definitely going to think more about ‘conditional stabilization’: how exactly it follows from the conceptual space of thinking about singletons and coordination, and what it might look like in practice (global surveillance seems terrible on the face of it, and I wonder if moving straight to that is premature; there are probably much more granular ways of thinking about surveillance).
In general this paper is full of very cautious and careful conceptual work, based on simple arguments and technical understandings of AI and coordination. Ordinarily I don’t trust many people to do this without vetting the ideas in depth myself or without seeing a past history of their success. Bostrom certainly ticks the latter box and weakly ticks the former box for me (I’ve yet to personally read enough of his writings to say anything stronger there), and given that he’s a primary author on this paper, I feel epistemically safe taking on these framings without 30-100 hours of further examination.
I hope to be able to spend similar effort summarising the many other strategic papers Bostrom and others at the FHI have produced.
For future posts of a similar nature, please PM me if you have any easy changes that would’ve made this post more useful to you / made it easier to get the info you needed (I will delete public comments on that topic). It’d also be great to (publicly) hear that someone else actually read the paper and checked whether my notes missed something important or are inaccurate.
Yeah, this was crossposted from Katja’s travel blog.
I think that closed-by-default is a very bad strategy from the perspective of outreach, and from the perspective of building a field of AI alignment. But I realise that MIRI is explicitly and wholly focusing on making research progress, for at least the coming few years, and from that perspective I think the post and its decisions overall make a lot of sense.
Our impression is indeed that well-targeted outreach efforts can be highly valuable. However, attempts at outreach/influence/field-building seem to us to currently constitute a large majority of worldwide research activity that’s motivated by AGI safety concerns, such that MIRI’s time is better spent on taking a straight shot at the core research problems. Further, we think our own comparative advantage lies here, and not in outreach work.
And here are the footnotes:
 In other words, many people are explicitly focusing only on outreach, and many others are selecting technical problems to work on with a stated goal of strengthening the field and drawing others into it.
 This isn’t meant to suggest that nobody else is taking a straight shot at the core problems. For example, OpenAI’s Paul Christiano is a top-tier researcher who is doing exactly that. But we nonetheless want more of this on the present margin.
I was recently thinking about focus. Some examples:
The internet provides access to an education that the aristocracy of old couldn’t have imagined.
It also provides the perfect attack vector for marketers to exploit cognitive vulnerabilities and dominate your attention.
A world-class education is free for the undistractable.
Sam Altman’s recent blogpost on How to Be Successful has the following two commands:
3. Learn to think independently
6. Focus
(He often talks about how the main task a startup founder has is to pick the 2 or 3 things to focus on that day out of the 100+ things vying for your attention.)
And I found this old quote by the mathematician Grothendieck on Michael Nielsen’s blog.
In those critical years I learned how to be alone. [But even] this formulation doesn’t really capture my meaning. I didn’t, in any literal sense, learn to be alone, for the simple reason that this knowledge had never been unlearned during my childhood. It is a basic capacity in all of us from the day of our birth. However these three years of work in isolation [1945-1948], when I was thrown onto my own resources, following guidelines which I myself had spontaneously invented, instilled in me a strong degree of confidence, unassuming yet enduring, in my ability to do mathematics, which owes nothing to any consensus or to the fashions which pass as law. By this I mean to say: to reach out in my own way to the things I wished to learn, rather than relying on the notions of the consensus, overt or tacit, coming from a more or less extended clan of which I found myself a member, or which for any other reason laid claim to be taken as an authority. This silent consensus had informed me, both at the lycée and at the university, that one shouldn’t bother worrying about what was really meant when using a term like “volume”, which was “obviously self-evident”, “generally known”, “unproblematic”, etc… it is in this gesture of “going beyond”, to be something in oneself rather than the pawn of a consensus, the refusal to stay within a rigid circle that others have drawn around one—it is in this solitary act that one finds true creativity. All other things follow as a matter of course.
Since then I’ve had the chance, in the world of mathematics that bid me welcome, to meet quite a number of people, both among my “elders” and among young people in my general age group, who were more brilliant, much more ‘gifted’ than I was. I admired the facility with which they picked up, as if at play, new ideas, juggling them as if familiar with them from the cradle—while for myself I felt clumsy, even oafish, wandering painfully up an arduous track, like a dumb ox faced with an amorphous mountain of things I had to learn (so I was assured), things I felt incapable of understanding the essentials of or following through to the end. Indeed, there was little about me that identified the kind of bright student who wins at prestigious competitions or assimilates, almost by sleight of hand, the most forbidding subjects.
In fact, most of these comrades who I gauged to be more brilliant than I have gone on to become distinguished mathematicians. Still, from the perspective of thirty or thirty-five years, I can state that their imprint upon the mathematics of our time has not been very profound. They’ve all done things, often beautiful things, in a context that was already set out before them, which they had no inclination to disturb. Without being aware of it, they’ve remained prisoners of those invisible and despotic circles which delimit the universe of a certain milieu in a given era. To have broken these bounds they would have had to rediscover in themselves that capability which was their birthright, as it was mine: the capacity to be alone.
Overall, this made me update towards thinking that MIRI’s decision to be closed-by-default is quite sensible. From this point of view, the following section seems trivially correct:
Focus seems unusually useful for this kind of work
There may be some additional speed-up effects from helping free up researchers’ attention, though we don’t consider this a major consideration on its own.
Historically, early-stage scientific work has often been done by people who were solitary or geographically isolated, perhaps because this makes it easier to slowly develop a new way to factor the phenomenon, instead of repeatedly translating ideas into the current language others are using. It’s difficult to describe how much mental space and effort turns out to be taken up with thoughts of how your research will look to other people staring at you, until you try going into a closed room for an extended period of time with a promise to yourself that all the conversation within it really won’t be shared at all anytime soon.
Once we realized this was going on, we realized that in retrospect, we may have been ignoring common practice, in a way. Many startup founders have reported finding stealth mode, and funding that isn’t from VC outsiders, tremendously useful for focus. For this reason, we’ve also recently been encouraging researchers at MIRI to worry less about appealing to a wide audience when doing public-facing work. We want researchers to focus mainly on whatever research directions they find most compelling, make exposition and distillation a secondary priority, and not worry about optimizing ideas for persuasiveness or for being easier to defend.