For reasons that maybe no normal person really understands, in an outcome that the theory’s inventors found very surprising, some people seem to be insane in a way that seizes very hard on a twisted version of this theory.
Oh… huh. @Eliezer Yudkowsky, I think I figured it out.
In a certain class of altered state,[1] a person’s awareness includes a wider part of their predictive world-model than usual. Rather than perceiving primarily the part of the self-model which models them looking out into a world-model, the normal gating mechanisms come apart and they perceive much more of their world-model directly (including being able to introspect on their brain’s copy of other people more vividly).
This world-model includes other agents. Those models of other agents in their world-model now exist in a much less sandboxed environment. It viscerally feels like there is extremely strong entanglement between their actions and those of the agents that might be modelling them, because their model of the other agents is able to read their self-model and vice versa, and in that state they’re kinda running it right on the bare-metal models themselves. Additionally, people’s models of other people generally use themselves as a template. If they’re thinking a lot about threats, blackmail, and similar, it’s easy for that to leak into expecting that others are modelling this more than they actually are.
So their systems strongly predict that there is way more subjunctive dependence than is real, due to how the brain handles that kind of emergency.[2]
Add in the thing where decision theory has counterintuitive suggestions and tries to operate kinda below the normal layer of decision process, plus people not being intuitively familiar with it, and yeah, I can see why some people can get to weird places. Not reasonably predictable in advance (it’s a weird pitfall), but in retrospect it fits.
Maybe it’s a good idea to write an explainer for this, to try to mitigate this particular way people seem to be able to implode. I might talk to some people.
[1] The schizophrenia/psychosis/psychedelics-like cluster, often brought on by extreme psychological states like those caused by cults and extreme perceived threat, especially with reckless mind exploration thrown into the mix.
[2] [Epistemic status: very speculative] It seems plausible this is in part a feature evolution built for handling situations where you seem to be in extreme danger: accepting a large chance of doing quite badly, damaging your epistemics, or acting in wildly bad ways, in exchange for some chance of finding a path through whatever put you in that state, by running a bunch of unsafe cognitive operations which might hit upon a way out of likely death. It sure seems like the common advice is things like “eat food”, “drink water”, “sleep at all”, “be around people who feel safe”, which feel like the kinds of things that would turn down those alarm bells. Though this could also just be an entirely natural consequence of stress on a cognitive system.
Having done a bunch of this: yes, great idea. You can have pretty spectacular impact, because the motivation boost and arc of “someone believes in me” is much more powerful than the motivation you get from funding stress.
My read is that good-taste grants of this type are dramatically, dramatically more impactful than those made by larger grantmakers. E.g. I proactively found and funded an upskilling grant for a math PhD who found glitch tokens, which was for a while the third most upvoted research on the Alignment Forum. This cost $12k for, I think, one year of upskilling; frugal geniuses are not that rare if you hang out in the right places.
However! I don’t think your proposed selection mechanism is much good. It replaces applications with promotion, will cause lots of researchers who don’t get funded to spend cycles or be tugged around by campaigns, and your final winners will be hit by Goodhart’s curse. Also, it depends on the average AF participant being good not just at research, but at judging who will do good research.
I do think it’d be net positive, but I think you can do a lot better.
If you’re doing a mechanism rather than concentrated agency, @the gears to ascension’s proposal seems much more promising to me, as it relies on high-trust researchers rather than lots of distributed, less-informed votes.
The other angles I see are:
Make another funder like AISTOF. This is imo the best funder in the space, with a far better grantee experience and taste up with the best. It works by a donor selecting one high-agency person they trust (JueYan, a VC) and giving them a remit to find grantees fitting a profile, then mostly not intervening, just getting regular reports on how funds are spent to help them judge how much to add. I imagine there’s someone in your network who you’d trust to track down and assess people much better than a popularity contest would (though they might still contact top researchers for takes on technical details).
Make a somewhat more organized fellowship, like the one @Mateusz Bagiński has a sketch for around understanding, explaining, and solving the hard problems in alignment, with many of the people directly invited and some extra infrastructure provided.
Select people directly, based on your own reading and observations.
I have a list of people I’m excited about! And proactively gardened projects with founders lined up too.[1] Happy to talk if you’re interested in double-clicking on any of these, booking link DMed.
I recommend giving less per person, spread over more people, though case-by-case is OK. Probably something like $75k a year gets the vast majority of the benefit, but you can add a step where you ask their current salary and use it as an anchor. Alternatively, I think there’s strong benefit to giving many people a minimal safety net. Being able to call on even $20-25k/year for 3 years would be a vast weight off many people’s shoulders; if you’re somewhat careful and live outside a hub it’s entirely possible to do great work on a shoestring, and this actually provides some useful filters.
I have spent down the vast majority of my funds over the last 5 years, so I can’t actually support anyone beyond the smallest grants without risking running out of money before the world ends and needing to do something other than full-time trying to save the world.