[Altruist Support] LW Go Foom

In which I worry that the Less Wrong project might go horribly right. This post belongs to my Altruist Support sequence.

Every project needs a risk assessment.

There’s a feeling, bubbling just under the surface here at Less Wrong, that we’re only playing at rationality. It’s rationality kindergarten. The problem has been expressed in various ways.

People are starting to look at fixing it. I’m not worried that their attempts—and mine—will fail. At least we’d have fun and learn something.

I’m worried that they will succeed.

What would such a Super Less Wrong community do? Its members would self-improve to the point where they had a good chance of succeeding at most things they put their minds to. They would recruit new rationalists and then optimize that recruitment process, until the community got big. They would develop methods for rapidly generating, classifying and evaluating ideas, so that the only ideas that got tried would be the best anyone had come up with so far. The group would structure itself so that people’s basic social drives—such as their desire for status—worked in the interests of the group rather than against it.

It would be pretty formidable.

What would the products of such a community be? There would probably be a self-help book that works. There would be a practical guide to setting up effective communities. There would be an intuitive, practical guide to human behavior. There would be books, seminars and classes on how to really achieve your goals—and only the materials that actually got results would be kept. There would be a bunch of stuff on the Dark Arts too, no doubt. Possibly some AI research.

That’s a whole lot of material that we wouldn’t want falling into the wrong hands.

Dangers include:

  • Half-rationalists: people who pick up enough of the memes to be really dangerous, but not enough to realise that what they’re doing might be foolish. For example, building an AI without the friendliness features.

  • Rationalists with bad goals: Someone could rationally set about trying to destroy humanity, just for the lulz.

  • Dangerous information discovered: e.g. the rationalist community develops a Theory of Everything that reveals a recipe for a physics disaster, such as a cheap way to turn the Earth into a black hole. A non-rationalist decides to exploit this.

If this is a problem we should take seriously, what are some possible strategies for dealing with it?

  1. Just go ahead and ignore the issue.

  2. The Bayesian Conspiracy: only those who can be trusted are allowed access to the secret knowledge.

  3. The Good Word: mix rationalist ideas in with do-good and stay-safe ideas, to the extent that they can’t easily be separated. The hope is that anyone who understands rationality will also understand that it must be used for good.

  4. Rationality cap: we develop enough rationality to achieve our goals (e.g. friendly AI) but deliberately stop short of developing the ideas too far.

  5. Play at rationality: create a community which appears rational enough to distract people who are that way inclined, but which does not dramatically increase their personal effectiveness.

  6. Risk management: accept that each new idea has a potential payoff (in terms of helping us avoid existential threats) and a potential cost (in terms of helping “bad rationalists”). Implement the ideas which come out positive (see the sketch after this list).
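To make option 6 concrete, here is a minimal sketch of what that screening rule could look like, assuming we could put even rough numbers on payoff and cost. The Idea class, the probabilities and the unit scale below are hypothetical illustrations, not real estimates.

```python
# Sketch of the "risk management" option: treat each candidate idea as an
# expected-value calculation and only pursue the ones that come out positive.
# All names and numbers are hypothetical placeholders.

from dataclasses import dataclass

@dataclass
class Idea:
    name: str
    p_helps: float    # probability the idea meaningfully reduces existential risk
    benefit: float    # how much it helps, in arbitrary risk-reduction units
    p_misused: float  # probability it mostly ends up empowering "bad rationalists"
    harm: float       # how much damage misuse would do, in the same units

    def expected_value(self) -> float:
        return self.p_helps * self.benefit - self.p_misused * self.harm

ideas = [
    Idea("self-help book that works", 0.6, 10, 0.3, 5),
    Idea("guide to building effective communities", 0.5, 8, 0.4, 12),
    Idea("practical Dark Arts manual", 0.2, 3, 0.7, 20),
]

for idea in ideas:
    verdict = "implement" if idea.expected_value() > 0 else "shelve"
    print(f"{idea.name}: EV = {idea.expected_value():+.1f} -> {verdict}")
```

The decision rule itself is just expected value; the hard part, which the sketch quietly assumes away, is putting credible numbers on those probabilities and impacts in the first place.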

In the post title, I have suggested an analogy with AI takeoff. That’s not entirely fair; there is probably an upper bound to how effective a community of humans can be, at least until brain implants come along. We’re probably talking two orders of magnitude rather than ten. But given that humanity already has technology with slight existential-threat implications (nuclear weapons, rudimentary AI research), I would be worried about a movement that aims to make all of humanity more effective at everything it does.