Discussion about AI Safety funding (FB transcript)

Kat Woods recently wrote a Facebook post about Nonlinear’s new funding program.

This led to a discussion (in the comments section) about funding norms, the current funding bar, concerns about lowering the bar, and concerns about the current (relatively centralized) funding situation.

I’m posting a few of the comments below. I’m hoping this might promote more discussion about the funding landscape. Such discussion could be especially valuable right now, given that:

  • Many people are starting to get interested in AI safety (including people who are not from the EA/​rationalist communities)

  • AGI timeline estimates have generally shortened

  • Investment in overall AI development is increasing quickly

  • There may be opportunities to spend large amounts of money in the upcoming year (e.g., scalable career transition grant programs, regranting programs, 2024 US elections, AI governance/​policy infrastructure, public campaigns for AI safety).

  • Many ideas with high potential upside also have noteworthy downside risks (phrased less vaguely, I think that among governance/​policy/​comms projects that have high potential upside, >50% also have non-trivial downside risks).

  • We might see pretty big changes in the funding landscape over the next 6-24 months

    • New funders appear to be getting interested in AI safety

    • Governments are getting interested in AI safety

    • Major tech companies may decide to invest more resources into AI safety

Selected comments from FB thread

Note: I’ve made some editorial decisions to keep this post relatively short. Bolding is added by me. See the full thread here. Also, as usual, statements from individuals don’t necessarily reflect the views of their employers.

Kat Woods (Nonlinear)

I often talk to dejected people who say they tried to get EA funding and were rejected.

And what I want to do is to give them a rousing speech about how being rejected by one funder doesn’t mean that their idea is bad or that their personal qualities are bad.

The evaluation process is noisy. Even the best funders make mistakes. They might just have a different world model or value system than you. They might have been hangry while reading your application.

That to succeed, you’ll have to ask a ton of people, and get a ton of rejections, but that’s OK, because you only need a handful of yeses.

(Kat then describes the new funding program from Nonlinear. TLDR: People submit an application that can then be reviewed by a network of 50+ funders.)

Claire Zabel (Program Officer at Open Philanthropy)

Claire’s comment:

(Claire quoting Kat:) The evaluation process is noisy. Even the best funders make mistakes. They might just have a different world model or value system than you. They might have been hangry while reading your application.

(Claire’s response): That’s true. It’s also possible the project they are applying for is harmful, but if they apply to enough funders, eventually someone will fund the harmful project (unilateralist’s curse). In my experience as a grantmaker, a substantial fraction (though certainly very far from all) of rejected applications in the longtermist space seem harmful in expectation, not just “not cost-effective enough.”
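(Illustration, not part of the thread: the unilateralist's curse point can be made concrete with a small calculation. The sketch below assumes each funder independently has the same chance of approving a given harmful project; the 5% figure and the function name are hypothetical, chosen only to show how the probability grows with the number of funders.)

```python
def prob_at_least_one_funds(p_single: float, n_funders: int) -> float:
    """Chance that at least one of n funders approves a project.

    Assumes (hypothetically) that funders decide independently and
    each approves with the same probability p_single.
    """
    return 1.0 - (1.0 - p_single) ** n_funders


if __name__ == "__main__":
    # 0.05 is an illustrative guess, not a figure from the thread.
    for n in (1, 5, 20, 50):
        print(n, round(prob_at_least_one_funds(0.05, n), 3))
```

Under these assumptions, a project that any single funder would approve only 5% of the time has a better than 90% chance of being funded by at least one of 50 funders. If funders share their concerns with each other, their decisions are no longer independent and this calculation overstates the risk.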

Selected portions of Kat’s response to Claire:

1. We’re probably going to be setting up channels where funders can discuss applicants. This way if there are concerns about net negativity, other funders considering it can see that. This might even lead to less unilateralist curse because if lots of funders think that the idea is net negative, others will be able to see that, instead of the status quo, where it’s hard to know what other funders think of an application.

2. All these donors were giving anyways, with all the possibilities of the unilateralist’s curse. This just gives them more / better options to choose from. From this alone, it actually might lead to fewer net-negative projects being funded because smaller funders have access to better options.

4. Big EA funders are also composed of fallible humans who might also miss large downside risk projects. This system could help unearth downside risks that they hadn’t thought of. There’s also the possibility of false alarms, where they thought something was net negative when it wasn’t.

Given how hard it is to evaluate ideas/talent in AI safety, I think we get better outcomes if we treat it less like bridge-building and more like assessing startups. Except harder! At least YCombinator finds out years later if any of their bets worked. With AI safety, we can’t even agree if Eliezer or Paul are net positive!

Larissa’s response to Claire:

Over time I’ve become somewhat skeptical of people who talk about the harm from other people’s projects in this way. It seems like it is used as an argument to centralize decision making to a small group and one that at this point I’m not sure has a strong enough track record. In the EA movement building space, the people I heard this from the most are the people who’ve now themselves caused the most harm to the EA movement. I think it’s plausible that a similar thing is true in the AI spaces.

Kerry’s response to Claire:

Historically, this kind of argument has been weaponized to centralize funding in the hands of Open Phil and Open Phil-aligned groups. I think it’s important that funding on AI-related topics not be centralized in this way as Open Phil is a major supporter of AGI danger labs via support for OpenAI and more recently the strong connections to Anthropic.

Let’s inform donors about the risk that grants could cause harm and trust them to make sensible decisions on that basis.

Caleb Parikh (Executive Director of EA Funds)

I’d be interested in hearing examples of good AI safety projects that failed to get funding in the last year. I think the current funders are able to fund things down to the point where a good amount of things being passed on are net negative by their lights or have pretty low upside.

Kat’s response to Caleb:

The most common examples are people who get funding but aren’t fully funded. This happens all the time. This alone means that the Nonlinear Network can add value.

I think for the ones where the funders feel like there isn’t a ton of upside, that pretty straightforwardly should still have other people considering funding them. The big funders will often be wrong. Not because they aren’t amazing at their jobs (which I think they are), but because of the nature of the field. Successful investors miss opportunities all the time, and we should expect the nonprofit world to be worse at this because of even worse feedback loops, different goals, and a very inefficient market.

And even for the people who do think that an idea is net negative—how confident are they in that? You’d have to be quite confident that something is bad to think that other people shouldn’t even be able to think for themselves about the idea. That level of confidence in a field like AI safety seems unwarranted.

Especially given that if you asked 100 informed, smart, and value aligned EAs, you’d rarely get over 50% of people thinking it’s net negative. It’s really hard to get EAs to agree on anything! And for most of the ideas that some people think are net negative, you’d have a huge percentage of EAs thinking it’s net positive.

Because evaluating impact is *hard*. We should expect to be wrong most of the time. And if that’s the case, it’s better to harness the collective wisdom of a larger number of EAs, to have a ton of uncorrelated minds trying different strategies and debating and trying to seek truth. To not be overly confident in our own ability to evaluate talent/​ideas, and to encourage a thriving marketplace of EA ideas.

Thomas Larsen’s response to Caleb

(note: Thomas is a grant evaluator at EA Funds):

Ways to turn $$$ into impact that aren’t happening:

1. Funding CAIS more (to pay their people more, to hire more people, etc)

2. Funding another evals org

3. Creating a new regranting program

4. Increasing independent alignment researcher salary to like 150k/​year (depending on location) to enable better time money tradeoffs.

5. Just decreasing the funding bar for passive applications—a year ago the funding bar was lower, and there are grants that would have been funded (and are confidently net positive EV) yet are below the current bar

Seems to me that if EA has 10B in the bank, and timelines are 10 years, it’s not unreasonable to spend 1B /​ year right now, and my guess is we currently spend ~100-200M /​ year.
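(Illustration, not part of the thread: the arithmetic behind Thomas's estimate can be sketched as below. The totals and spending rates are his rough guesses from the comment above; the function name is hypothetical, and the sketch assumes flat annual spending with no investment returns or new donations.)

```python
def years_of_runway(total: float, annual_spend: float) -> float:
    """Years until the money runs out at a flat annual spend.

    Simplifying assumptions: no investment returns, no new
    donations, and spending stays constant from year to year.
    """
    return total / annual_spend


TOTAL = 10e9  # "10B in the bank" (Thomas's figure)

if __name__ == "__main__":
    # 100M-200M/year is Thomas's guess at current spending;
    # 1B/year is the rate he suggests.
    for spend in (100e6, 200e6, 1e9):
        years = years_of_runway(TOTAL, spend)
        print(f"${spend / 1e6:,.0f}M/year -> {years:.0f} years")
```

At the guessed current rate of 100-200M per year, 10B would last 50 to 100 years, several times longer than the 10-year timeline in the comment; at 1B per year it would last the full 10 years.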

Akash (that’s me)

Lots of the comments so far seem to be about the funding bar; I think there’s also a lot to be said about barriers to applying, missed opportunities, and the role of active grantmaking.

For instance, I got the sense that many of the valuable FTX regrants were grants that major funders would have funded. So sometimes people would say things like “this isn’t counterfactual because LTFF would’ve just funded it.”

But in many cases, the grants *were* counterfactual, because the grantee would’ve never thought to apply. The regranting program did lower the bar for funding, but it also created a culture of active grantmaking, proactively finding opportunities, and having people feel like it was their responsibility to find ways to turn money into impact.

My impression is that LTFF/OP spends a rather small fraction of time on active grantmaking. I don’t have enough context to be confident that this is a mistake, but I wouldn’t be surprised if most of the value being “left on the table” was actually due to lack of active grantmaking (as opposed to, e.g., the funding bar being too high).

Things LTFF/​OP could do about this:

1. Have more programs/applications that appeal to particular audiences (e.g., “mech interp fund”, “career transition fund”, “AIS Hub Travel Grant”)

2. Regranting program

3. More public statements around what kind of things they are interested in funding, especially stuff that lots of people might not know about. I think there’s a lot of Curse of Knowledge going on, where many grantees don’t know that they’re allowed to apply for X.

4. Hiring someone friendly/​social/​positive-vibesy to lead active grantmaking. Their role would be to go around talking to people and helping them brainstorm ways they could turn money into impact.

5. Have shorter forms where people can express interest. People find applying to LTFF/​OP burdensome, no matter how much people try to say it’s supposed to be unburdensome. Luke’s recent “interest form for AI governance” seems like a good template IMO.

Note: Since Kat’s post is public, I didn’t ask for permission to post people’s comments on LessWrong. I think this is the right policy, but feel free to DM me if you disagree.

Crossposted from EA Forum (102 points, 10 comments)