UPDATE 2023/09/13:
Including only money that has already landed in our bank account and extremely credible donor promises of funding, LTFF has raised ~$1.1M and EAIF has raised ~$500K. After Open Phil matching, this means LTFF now has ~$3.3M in additional funding and EAIF has ~$1.5M in additional funding.
From my (Linch’s) perspective, this means both LTFF and EAIF are no longer very funding constrained for the time period we wanted to raise money for (the next ~6 months). However, both funds are still funding constrained and can productively make good grants with additional funding.
See this comment for more details.
Summary
EA Funds aims to empower thoughtful individuals and small groups to carry out altruistically impactful projects—in particular, enabling and accelerating small/medium-sized projects (with grants <$300K). We are looking to increase our level of independence from other actors within the EA and longtermist funding landscape and are seeking to raise ~$2.7M for the Long-Term Future Fund and ~$1.7M for the EA Infrastructure Fund (~$4.4M total) over the next six months.
Why donate to EA Funds? EA Funds is the largest funder of small projects in the longtermist and EA infrastructure spaces, and has had a solid operational track record of giving out hundreds of high-quality grants a year to individuals and small projects. We believe that we’re well-placed to fill the role of a significant independent grantmaker, because of a combination of our track record, our historical role in this position, and the quality of our fund managers.
Why now? We think now is an unusually good time to donate to us, as a) we have an unexpectedly large funding shortage, b) there are great projects on the margin that we can’t currently fund, and c) more stabilized funding now can give us time to try to find large individual and institutional donors to cover future funding needs.
Importantly, Open Philanthropy is no longer providing a guaranteed amount of funding to us and instead will move over to a (temporary) model of matching our funds 2:1 ($2 from them for every $1 from you, up to $3.5M from them per fund).
Where to donate: If you’re interested, you can donate to either Long-Term Future Fund (LTFF) or EA Infrastructure Fund (EAIF) here.[1]
Some relevant quotes from fund managers:
Oliver Habryka
I think the next $1.3M in donations to the LTFF (430k pre-matching) are among the best historical grant opportunities in the time that I have been active as a grantmaker. If you are undecided between donating to us right now vs. December, my sense is now is substantially better, since I expect more and larger funders to step in by then, while we have a substantial number of time-sensitive opportunities right now that will likely go unfunded.
I myself have a bunch of reservations about the LTFF and am unsure about its future trajectory, and so haven’t been fundraising publicly, and I am honestly unsure about the value of more than ~$2M, but my sense is that we have a bunch of grants in the pipeline right now that are blocked on lack of funding that I can evaluate pretty directly, and that those seem like quite solid funding opportunities to me (some of this is caused by a large number of participants of the SERI MATS program applying for funding to continue the research they started during the program, and those applications are both highly time-sensitive and of higher-than-usual quality).
Lawrence Chan
“My main takeaway from [evaluating a batch of AI safety applications on LTFF] is [LTFF] could sure use an extra $2-3m in funding, I want to fund like, 1/3-1/2 of the projects I looked at.” (At the current level of funding, we’re on track to fund a much lower proportion).
Related links
EA Funds organizational update: Open Philanthropy matching and distancing
Asya Bergal’s Reflections on my time on the Long-Term Future Fund
Linch Zhang’s Select examples of adverse selection in longtermist grantmaking
Our Vision
We think there is a significant shortage of independent funders in the current longtermist and EA infrastructure landscape, resulting in fewer outstanding projects receiving funding than is good for the world. Currently, the primary source of funding for these projects is Open Philanthropy, and whilst we share a lot of common ground, we think we add value in the following ways:
Increasing the total grantmaking capacity within key cause areas.
Causing great projects to counterfactually happen in the world, or saving time and effort for people doing great projects who would otherwise spend significant time fundraising or waiting for grants to come in.
Supporting a set of worldviews that we find plausible and that are not currently well represented among grantmakers (though we have substantial overlap with Open Philanthropy’s worldview and there is a range of views on how much we should be directly optimizing for diversification away from their perspectives).
Emphasizing contact with reality: most of our grantmakers spend most of their time trying to directly solve problems of importance within their cause area, rather than engaging in “meta” activities like grantmaking. We think this is important as grantmaking often has very poor feedback loops (particularly longtermist grantmaking).
Providing early-stage funding to allow applicants to test their fit for work and “get ready” to seek funding from other funders that specialize in larger grant sizes.
Improving the epistemic environment within EA by making it easier for smaller projects to disagree with Open Philanthropy without worrying that this will significantly reduce their chance of being funded in the future.
Helping to identify harmful projects whilst being aware of factors such as the unilateralist curse and information cascades.
Increasing the resilience, robustness and diversity of funders within EA and longtermism.
Alongside the above, EA Funds has ambitions to pursue new ways of generating value by:
Creating an expert-led active grant-making program to create counterfactual impactful projects (starting with longtermist information security).
Modeling and shaping community norms of transparency, integrity, and criticism to improve the epistemic environment within EA and associated communities.
Our Ask
We are looking to raise ~$4.4M from the general public to support our work over the next 6 months:
~$2.7M for the Long-Term Future Fund.
This is ~2M above our expected 720k donations in the next 6 months.
~$1.7m for the EA Infrastructure Fund.
This is ~1.3M above our expected 400k donations in the next 6 months.
This will be matched by Open Phil at a 2:1 rate ($2 from Open Phil per $1 donated to a fund) with a ceiling of a $3.5m contribution from Open Phil (per fund). You can read more about the matching here.
The EAIF and LTFF have received very generous donations from many individuals in the EA community. However, donations to the EAIF and LTFF have recently been quite low, especially relative to the quality and quantity of applications we’ve had in the last year. While much of this is likely due to the FTX crash and subsequently increased funding gaps of other longtermist organizations, our guess is that this is partially due to tech stocks and crypto doing poorly in the last year (though we hope that recent market trends will bring back some donors).
Calculation for LTFF funding gap
The LTFF has an estimated ideal dispersal rate of $1M/month, based on our post-November 2022 funding bar that Asya estimated[2] from looking at the funding gaps and marginal resources within the longtermist ecosystem overall. This is $6M over the next 6 months.
I also think LTFF donors should pay $200k over the next 6 months ($400k annualized) as their “fair share” of EA Funds operational costs. So in total, LTFF would like to spend $6.2M over the next 6 months.
Caleb estimated ~$700k in expected donations from individuals by default in the next 6 months, based solely on extrapolation from past trends. With Open Phil donation matching, this comes out to a total of $2.1M in expected incoming funds, or a shortfall of $4.1M.
To cover the remaining $4.1M, we would like individual donors to contribute an additional $2M, where Open Phil will provide $2.1M of matching for the first $1.05M.
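The LTFF arithmetic above can be checked with a short sketch. This is only an illustration of the numbers in this post: the function names are hypothetical, and it assumes the match is a flat 2:1 on all public donations up to the $3.5M per-fund ceiling, with no other income.

```python
# Hypothetical helper names; figures are taken from the text above.
MATCH_RATE = 2           # $2 from Open Phil per $1 donated
MATCH_CAP = 3_500_000    # ceiling on Open Phil's contribution per fund


def open_phil_match(donations):
    """Matching received for a given amount of public donations (capped)."""
    return min(MATCH_RATE * donations, MATCH_CAP)


def funding_position(target_spend, baseline_donations, extra_donations=0):
    """Total incoming funds and remaining shortfall for one fund."""
    donations = baseline_donations + extra_donations
    match = open_phil_match(donations)
    total = donations + match
    return {
        "donations": donations,
        "match": match,
        "total": total,
        "shortfall": target_spend - total,
    }


# LTFF: $6M of grants plus $200k of operational costs over 6 months.
ltff_target = 6_000_000 + 200_000

# Default case: ~$700k of expected donations, no additional fundraising.
baseline = funding_position(ltff_target, 700_000)

# With the additional $2M we are asking for.
with_ask = funding_position(ltff_target, 700_000, 2_000_000)

print(baseline)
print(with_ask)
```

Under these assumptions the default case leaves a $4.1M shortfall, and the additional $2M closes it exactly because the match reaches its $3.5M ceiling partway through (after the first $1.05M of additional donations).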
To get a sense of what projects your marginal dollars can buy, you might find it helpful to look at the $5M tier of the LTFF Funding Thresholds Post.
Calculation for EAIF funding gap
The EAIF has an estimated ideal dispersal rate of $800k/month, based on the proportion of our historic spend rate that we believe is above Open Phil’s bar for EA community building projects (though note that this was based on fairly brief input from Open Phil and I didn’t check with them about whether they agree with this claim). This is $4.8M over the next 6 months.
I also think EAIF donors should pay $200k over the next 6 months ($400k annualized) as their “fair share” of EA Funds operational costs. So in total, EAIF would like to spend $5M over the next 6 months.
Caleb estimated $400k in expected donations from individuals by default in the next 6 months, based solely on extrapolation from past trends. With Open Phil donation matching, this comes out to a total of $1.2M in expected incoming funds, or a shortfall of $3.8M.
To cover the remaining $3.8M, we would like individual donors to contribute an additional $1.3M, where Open Phil will provide $2.5M in donation matching.
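One way to see where the "additional donations" figures come from is to solve for the donation amount that closes a given shortfall once matching is included. The sketch below is a hypothetical helper, not our actual budgeting process; it assumes a flat 2:1 match capped at $3.5M per fund, and that the expected baseline donations have already used part of that cap.

```python
# Hypothetical helper; figures are taken from the text above.
MATCH_RATE = 2
MATCH_CAP = 3_500_000


def required_extra_donations(shortfall, baseline_donations):
    """Return (extra donations needed, matching those donations attract)."""
    # Matching room left after the baseline donations have been matched.
    room = max(MATCH_CAP - MATCH_RATE * baseline_donations, 0)
    # If every extra dollar is matched, each one closes (1 + rate) dollars.
    extra = shortfall / (1 + MATCH_RATE)
    if MATCH_RATE * extra <= room:
        return extra, MATCH_RATE * extra
    # Otherwise the cap binds: the match supplies `room`, donors the rest.
    return shortfall - room, room


# EAIF: $3.8M shortfall, $400k of expected baseline donations.
eaif_extra, eaif_match = required_extra_donations(3_800_000, 400_000)

# LTFF: $4.1M shortfall, $700k of expected baseline donations.
ltff_extra, ltff_match = required_extra_donations(4_100_000, 700_000)

print(round(eaif_extra), round(eaif_match))
print(ltff_extra, ltff_match)
```

For EAIF the cap does not bind, so the result is roughly $1.27M from donors and $2.53M in matching, which the text rounds to $1.3M and $2.5M. For LTFF the cap binds, giving $2M from donors and $2.1M in matching.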
Potential change for operational expenses payment
Going forwards, we would also like to move towards a model where donors directly pay for our operational expenses (currently we fundraise for operational expenses separately, so 100% of donations from public donors goes to our grantees). We believe that the newer model is more transparent, as it lets all donors more clearly see the true costs and cost-benefit ratio for their donations. However, making the change is still pending internal discussions, community feedback, and logistical details. We will make a separate announcement if and when we switch to a model where a percentage of public donations go to cover our operational expenses. See Appendix A for a calculation of operational expenses.
Why give to EA Funds?
We think EA Funds is well-positioned to be a significant independent grantmaker for the following reasons.
We have knowledgeable part-time fund managers who do direct work in their day jobs: we have built several grantmaking teams with a broad range of expertise. These managers usually dedicate the majority of their time to hands-on efforts addressing critical issues. We believe this direct experience enhances their judgment as grantmakers, enabling them to pinpoint important and critical projects with high accuracy.
Specialization in early-stage grants: we made over 300 grants of under $300k in 2022. To our knowledge, that’s more grants of this size than any other EA-associated funder.
We are the largest open application funding source (that we are aware of) within our cause areas. Our application form is always open, anyone can apply, and grantees can apply for a wide variety of projects relevant to our funds’ purposes (as opposed to e.g. needing to cater to narrow requests for proposals). We believe this is critical to us having access to grant opportunities that other funders do not have access to, allowing us to rely on formal channels rather than informal networks.
Our operational track record. In 2022, EA Funds paid out ~$35M across its four Funds, with $12M to the Long-Term Future Fund, $13M to the EA Infrastructure Fund, $6.4M to the Animal Welfare Fund, and $4.8M to the Global Health and Development Fund. This requires (among others) clearing nontrivial logistical hurdles in following nonprofit law across multiple countries, consistent operational capacity, and a careful eye towards downside risk mitigation.
We believe our grants are highly cost-effective. Our current best guess is that we have successfully identified and given out grants of similar ex-ante quality to (e.g.) Open Phil’s AI safety and community building grants, some of which Open Phil would counterfactually not have funded.[3] This gives donors an opportunity to provide considerable value.
We are investigating new value streams. We would like to pursue ‘DARPA-style’ active grantmaking in priority areas (starting with information security). We are also actively considering setting up an AI Safety-specific fund, encouraging donors interested in AI safety (but not EA or longtermism) to donate to projects that mitigate large-scale globally catastrophic AI risks.
According to GWWC, the LTFF is the main longtermist donation option available for individual donors to support. We believe that we are a relatively transparent funder, and we are currently thinking about how we can increase our transparency further whilst moving more quickly and maintaining our current standard of decision-making.
We are primarily looking for funding to support the Long-Term Future Fund and the EA Infrastructure Fund’s grantmaking.
The Long-Term Future Fund is primarily focused on reducing catastrophic risks from advanced artificial intelligence and biotechnology, as well as building and equipping a community of people focused on safeguarding humanity’s future potential. The EA Infrastructure Fund is focused on increasing the impact of projects that use the principles of effective altruism, in particular amplifying the efforts of people who aim to do an ambitious amount of good from an impartial welfarist and scope-sensitive perspective. We have included some examples of grants each fund has made in the highlighted grants section.
Our Fund Managers
We lean heavily on the experience and judgement of our fund managers. We have around five fund managers on each fund at any given time.[4] Our current fund managers include:
Linchuan Zhang (LTFF): Linchuan (Linch) Zhang is a Senior Researcher at Rethink Priorities working on existential security research. Before joining RP, he worked on time-sensitive forecasting projects around COVID-19. Previously, he programmed for Impossible Foods and Google and has led several EA local groups.
Oliver Habryka (LTFF): Oliver runs Lightcone Infrastructure, whose main product is LessWrong. LessWrong has significantly influenced conversations around rationality and AGI risk, and its community is often credited with having realized the importance of topics such as AGI (and AGI risk), COVID-19, existential risk and crypto much earlier than other comparable communities.
Peter Wildeford (EAIF): co-executive director and co-founder of Rethink Priorities, a think tank dedicated to figuring out the best ways to make the world a better place.
Guest Fund Managers
Daniel Eth (LTFF): Daniel’s research has spanned several areas relevant to longtermism, and he’s currently focused primarily on AI governance. He was previously a Senior Research Scholar at the Future of Humanity Institute. He is currently self-employed.
Lauro Langosco (LTFF): Lauro is a PhD student with David Krueger at the University of Cambridge. His work focused broadly on AI Safety, in particular on demonstrations of alignment failures, forecasting AI capabilities, and scalable AI oversight.
Lawrence Chan (LTFF): Lawrence is a researcher at ARC Evals, working on safety standards for AI companies. Before joining ARC Evals, he worked at Redwood Research and as a PhD Student at the Center for Human Compatible AI at UC Berkeley.
Thomas Larsen (LTFF): Thomas was an alignment research contractor at MIRI, and he is currently running the Center for AI Policy, where he works on AI governance research and advocacy.
Clara Collier (LTFF): Clara is the managing editor of Asterisk, a quarterly journal focused on communicating insights on important issues. Before, she worked as an independent researcher on existential risks. She has a Masters in Modern Languages from Oxford.
Michael Aird (EAIF): Michael Aird is a Senior Research Manager in Rethink Priorities’ AI Governance and Strategy team. He also serves as an advisor to organizations such as Training for Good and is an affiliate of the Centre for the Governance of AI. His prior work includes positions at the Center on Long-Term Risk and the Future of Humanity Institute.
Huw Thomas (EAIF): Huw is currently working part-time on various projects (including a contractor role at 80,000 hours). Prior to this, he worked as a media associate at Longview Philanthropy, a groups associate at the Centre for Effective Altruism and was a recipient of the CEA Community Building Grant for his work at Effective Altruism Oxford.
You can find a full list of our fund managers here[5]
If you have more questions, feel free to leave a comment here. Caleb Parikh and the fund managers are also happy to talk to donors potentially willing to give >$30k. Linch Zhang, in particular, has volunteered himself to talk about the LTFF.
Highlighted Grants
EA Funds has identified a variety of high-impact projects, at least some of which we think are unlikely to have been funded elsewhere. (However, for any specific grant listed below, we think there’s a fairly high probability they’d otherwise be funded in some form or another; figuring out counterfactuals is often hard).
From the Long-Term Future Fund:
David Krueger - $200,000
Computing resources and researcher stipends at a new deep learning + AI alignment research group at the University of Cambridge.
Alignment Research Center - $72,000
A research & networking retreat for winners of the Eliciting Latent Knowledge contest with the aim of fostering promising research collaborations between junior researchers.
SERI MATS program - $316,000
8-week scholars program to pair promising alignment researchers with renowned mentors. This program has now grown into a more established program producing multiple people working full-time on alignment in established research organizations (with a smaller number of people pursuing independent research or starting new organizations).
Manifold Markets - $200,000
Stipend and expenses for 4 months for 3 FTE to build a forecasting platform made available to the public based on user-created play-money prediction markets
Daniel Filan - $23,544
We recommended a grant of $23,544 to pay Daniel Filan for his time making 12 additional episodes of the AI X-risk Research Podcast (AXRP), as well as the costs of hosting, editing, and transcription.
From the EA Infrastructure Fund:
Shauna Kravec & Nova DasSarma - $50,000
Compute infrastructure and dedicated support for AI safety researchers to run technical AI experiments. This later became Hofvarpnir Studios which used to provide compute for Jacob Steinhardt’s lab at UC Berkeley and the Center for Human-Compatible Artificial Intelligence (CHAI).
Finlay Moorhouse and Luca Righetti - $38,200
Ongoing support for “Hear This Idea”, a podcast showcasing new thinking in effective altruism.
Laura Gonzalez Salmerón, Sandra Malagón - $43,308
12-month stipend to coordinate and grow the EA Spanish speakers community and its projects.
Czech Association for Effective Altruism - $8,300
Expenses and stipend to create a short Czech book (~130 pgs) and brochure (~20 pgs) with a good introduction to EA in digital and print formats.
See a complete list of our public grants at this link. You can also read the most recent payout report by LTFF here.
Planned actions over the next six months
To achieve our goals of empowering thoughtful people to pursue impactful projects, we’ll attempt to do the following:
Asya Bergal will step down as chair of LTFF (Max Daniel has already stepped down as chair of the EAIF). Max and Asya both work for Open Phil, and we want to increase our separation from Open Phil.[6]
Open Phil also wanted to reduce entanglements between the two organizations, in part to mitigate downside reputational risks.
We are looking to find new fund chairs for both LTFF and EAIF.
We plan to onboard more fund managers to grow each fund substantially (aiming to double the staffing of each fund).
In recent months, LTFF has onboarded Lauro Langosco and Lawrence Chan who will primarily focus on technical alignment grantmaking, as well as Clara Collier for her expertise in communications and general longtermism. The EAIF is in the process of onboarding new fund managers.
Open Phil has agreed to give us a 2:1 match for up to $7M total (up to $3.5M to each of EAIF and LTFF) for a 6-month period. While our ultimate goal is to develop our own robust funding base, in 2022, Open Philanthropy provided 40% of the funding for the Long-Term Future Fund and 84% for the EA Infrastructure Fund.[7] We see donation matching as a realistic intermediary step while enabling us to pursue more intellectual independence.
This model replaces fixed grants from Open Philanthropy. This reduces the likelihood of your donations being fungible: previously an extra $1 to EA Funds in fundraising could result in a $1 reduction in Open Philanthropy’s grants to us, diverting those funds to their other projects. This newer approach allows funders to donate to EA Funds and support the specific value proposition that we, as opposed to Open Philanthropy, present.[8]
We are considering hiring or contracting out more non-grantmaking duties (eg website, project management, fundraising, communications) at EA Funds. Right now Caleb is the only full-time employee of EA Funds and plausibly having 0.5-1.5 more FTEs at EA Funds will help both existing projects go more smoothly, as well as unlock new ambitious opportunities.
We are working with external investigators to do retroactive evaluations of past EAIF and LTFF grants, with the hopes that we can then have a clearer picture of a) how well the impact of our past grants compares to e.g., Open Phil’s, b) which of our broader categories of historical grants have been the most impactful, and c) other qualitative insights to help us improve further.
We aim to improve the operations of our passive grantmaking (funding of open grant applications) program with a focus on improving the grantee experience by providing more support to grantees and getting back to grantees much more quickly.[9]
We are trying to reconceptualize and reframe the value proposition and strategic direction of EAIF in the coming months. While much of this will be contingent on the vision of the incoming fund chair, we’d like EAIF to have a more coherent and targeted vision, strategy, and value proposition to donors going forwards.
We plan to create a new AI Safety specific program, for donors outside of EA/Longtermism who want to decrease catastrophic risks from AI. We hope that such a program can inspire new donors to give to AI safety projects.
EA Funds is pursuing active grant-making programs, where we’ll actively seek out promising projects to fund. We’ll initially focus on Information Security field building. The current plan is for this program to initially be funded by Open Philanthropy, though if you are interested in contributing to this program in particular, please let us know.
Potential negatives to be aware of
Here are some reasons you might not want to donate to EA Funds:
Potential downside risks of LTFF or EAIF
Inability to fully screen for or prevent unilateral downside risks: EA Funds has much less control over and offers less guidance to our grantees than, e.g., the executive directors of a moderately-sized EA organization. So compared to larger organizations, we may be less able to prevent unilateral downside risks like the sharing of information hazards, or actions that pose reputational risks to effective altruism at large, or to specific EA subfields.
Centralization of funds: In contrast, we are also implicitly asking for the centralization of funds from private donors to a single grantmaking entity. To the extent that you believe your counterfactual for donating to EA Funds is better and/or more centralization is bad, you may wish to donate directly rather than pool your funds with other LTFF or EAIF donors.
Waste/Inefficient usage of human capital: Giving money to EA Funds rather than larger organizations implicitly subsidizes a culture and community of grantseekers who are supported by small grants. To the extent that you believe this is a less efficient usage of human capital than plausible counterfactuals for talented people (e.g. getting a job in tech, policy, or academia), you might want to shift away from EA grantmakers that give relatively small individual grants.
Note that we consider these issues to be structural and do not realistically expect resolutions to these downside risks going forwards.
Areas of improvement for the LTFF and EAIF
Historically, we’ve had the following (hopefully fixable) problems:
Slower than ideal response times: in the past year, our median response time has been around 4 weeks with high variance; we’d like to get this down to closer to 2 weeks with 95% of applications responded to within 4 weeks.
Limited feedback/advice given to grantees: we generally don’t give feedback to rejected applicants. We currently give some feedback to promising grantees but much less than we’d give if we had more grantmaking capacity.
Insufficient active grantmaking: We spend some time trying to improve our grantees’ projects, but we have invested fairly little in active grantmaking (actively identifying promising projects and creating/supporting them).
Missing areas of subject matter expertise: The scopes of both funds are quite expansive. This means sometimes all of the existing grantmakers lack sufficient direct technical subject matter expertise to evaluate grants in certain areas, and thus have to rely on external experts. For example, the LTFF does not currently have a technical expert in biosecurity.
For more, you can read Asya’s reflections on her time as chair of LTFF.
EAIF vs LTFF
Some donors are interested in giving to both the EAIF and LTFF and would like advice on which fund is a better fit for them.
We think that the EAIF is a better fit for donors who:
Are interested in supporting a portfolio of meta projects covering a range of plausible worldviews (both longtermist and non-longtermist).
Are interested in building EA and adjacent communities.
Believe that EA (and EA community building) has historically been very good for the world.
Believe in multiplier effect arguments (donating $100 to an EA group could plausibly create far more than $100 in donation to high-impact charities by encouraging more people to donate).
Expect the EAIF and LTFF to have similar diminishing marginal returns curves and want to donate to the fund with lower funding. (EAIF and LTFF each receive about 1000 grant applications per year, but EAIF has less funding currently committed)
We think that the LTFF is a better fit for donors who are:
More compelled by longtermist cause areas than other EA cause areas.
Particularly interested in AI safety.
More interested in direct work than “meta” work that has a longer chain of impact/reasoning.
Are more excited about the $5M tier of marginal LTFF grants than what they consider to be the marginal EAIF grant.
Closing thoughts
This post was written by Caleb Parikh and Linch Zhang. Feel free to ask questions or give us feedback in the comments below.
If you are interested in donating to either LTFF or EAIF, you can do so here.
Appendix A: Operational expenses calculations and transparency.
In the last year, EA Funds has dispersed $35M and spent ~700k in operational expenses. The vast majority of the operational expenses were spent on LTFF and EAIF, as the global health and development fund and animal welfare fund are operationally much simpler.
Historically, ~60-80% of the operational expenses are paid to EV Ops, for grant disbursement, tech, legal, other ops, etc.
The remaining 20-40% is used for:
Caleb’s salary, who leads EA Funds (~$100k/year plus benefits).
Payments for grantmakers at $60/hour, though many volunteer for free.
Contractors who work on different projects, earning between $35-$100/hour.
I (Linch) ballparked the expected annual expenditures going forwards (assuming no cutbacks) to be ~800k annually. I estimated the increase due to a) inflation and b) us wanting to take on more projects, with some savings from us slowing down the rate of dispersals a little. But this estimate is not exact.
Since LTFF and EAIF incur the highest expenses, I suggest donors to each should contribute around $400k yearly, or $200k every six months.
As for where we might cut or increase spending:
Reducing EV Ops costs would be challenging and may require moving EA Funds out of EV and building our own grant ops team.
Reducing Caleb’s working hours would be challenging.
I think my own hours at EAF are somewhat contingent on operational funding. In the last month, I’ve been spending more than half of my working hours on EA Funds (EA Funds is buying out my time at RP), mostly helping Caleb with communications and strategic direction. I would like to continue doing this until I believe EA Funds is in a good state (or we decide to discontinue or sunset projects I’m involved in). Obviously whether there is enough budget to pay for my time is a crux for whether I should continue here.
Assuming we can pay for my time, other plausible uses of marginal operational funding include: a) whether we pay external investigators for extensive or just shallow retroactive evaluations, b) whether we attempt to launch new programs, c) whether the new infosec, AI safety project, etc websites have professional designers, etc. My personal view is that marginal spending on EA Funds expenses is quite impactful relative to other possible donations, but I understand if donors do not feel the same way and would prefer a higher percentage of donations go directly to our grantees (currently it’s 100% but proposed changes may move this to ~94-97%).
I would wire you guys 300-400K today if I wasn’t still worried about the theory that ‘AI Safety is actually a front for funding advancement of AI capabilities’. It is a quixotic task to figure out how true that theory is or what actually happened in the past, never mind why. But the theory seems at least kind of true to me and so I will not be donating.
It’s unlikely to be worth your time to try to convince me to donate. But maybe other potential donors would appreciate a reassurance it’s not actively net-negative to donate. For example, several people mentioned in the post have ties to dangerous organizations such as Anthropic.
Meta-honesty: There is not enough values alignment to trust me with sensitive information and I definitely do not endorse ‘always keep secrets you agreed to keep’. I support leaking the pentagon papers, etc.
My own professional opinion, not speaking for any other grantmakers or giving an institutional view for LTFF etc:
Yeah I sure can’t convince you that donating to us is definitely net positive, because such a claim wouldn’t be true.
So basically I don’t think it’s possible to do robustly positive actions in longtermism with high (>70%? >60%?) probability of being net positive for the long-term future[1], and this number is even lower for people who don’t place the majority of their credence on near- to medium-term extinction risk timelines.
I don’t think this is just an abstract theoretical risk, as you mention there’s a real risk that our projects are net negative; and advancing more AI capabilities than AI safety is the most obvious way that this is true.
I think the other LTFF grantmakers and I are pretty conscious about downside risks in capabilities enhancements, though I expect there’s a range of opinions on the fund on how much to weigh that against other desiderata, as well as which specific projects have the highest capabilities externalities.
I would guess that we’re better about this than most (all?) other significant longtermist funders, including both organizations and individuals (though keep in mind that the average for individuals is driven by the long left tail). But since we’re optimizing for other things as well (most importantly positive impact), I think we’d do worse than you would on this axis if you a) have reasonably good judgment b) are laser-focused on preventing capabilities externalities, and c) have access to good donation options directly, especially by your own worldview. And of course reality doesn’t grade on a curve, so doing better than other funders isn’t a guarantee we’re doing well enough.
I don’t do many evaluations of alignment grants myself because others on the fund seem more technically qualified, so my time is usually triaged to looking at other projects (eg forecasting, biosecurity). But I do try to flag downside risks I see in LTFF grants overall, including in alignment grants. (So far, I think the rest of the fund is sensible about capabilities risks and capabilities risks usually aren’t the type of thing that non-public information is super useful for, so possibly none of my flags were on capabilities, more like interpersonal harm or professional integrity). When I did, I’ve found the rest of the fund to be sensible about them. You might find this recent post to be useful.
(On the flip side, there were a small number of grants that I liked that we were blocked from making for legal or PR reasons; for the most promising ones, one of us tried to connect the applicant to other funders)
If I were to hypothesize why LessWrongers should be worried about our capabilities externalities:
I think the average view in the fund (both unweighted and weighted by votes on alignment grants) is more optimistic on prosaic AI alignment strategies than what I perceive the median LessWrong view to be.
I expect that, under most worldviews, prosaic AI alignment has more capabilities externalities than other research agendas.
To be clear, I don’t think views in the fund are out of line with those of working AI safety researchers; I think the louder (and probably median?) voices on LessWrong are more negative on prosaic approaches.
Some of our grantees go on to work at AI labs like Anthropic or DeepMind, which many people here would consider to be bad for the world.
My own weakly-to-moderately held view is that doing AI Safety work at big labs is a good-to-great idea, but I don’t think the case is very robust, and reasonable people can and should disagree.
As you allude, an important crux is whether/how much the work at the labs ends up being safety-washing.
I’m personally fairly against working at big labs in non-safety roles; the capabilities externalities just seem rather high, and the career capital argument seems both a) not that strong compared to getting a random ML job at Google doing ads or working on collision detection at Tesla or something and b) to rely implicitly on a certain willingness to defect for personal gain.
The moral mazes and institutional/cultural incentives to warp your beliefs seem pretty scary to me, but I don’t have a good solution.
We are not institutionally opposed to receiving money from employees at big labs
Though as an empirical matter I don’t think we’ve received much.
The ecosystem/memes/vibes near us have in fact resulted in a bunch of negative externalities before; there’s no guarantee we wouldn’t cause the same.
We haven’t tracked past negative externalities/negative impact grants very well, so I couldn’t eg point to our 10 worst grants ex post with an estimate of how bad they were (but we’re working on this).
We didn’t see the FTX crash coming.
I also think potential donors to us can just look at our past grants database, our payout report, or our marginal grants post to make an informed decision for themselves about whether donations to us are (sufficiently) net positive in expectation.
On a personal level:
I don’t really know, man? I think the longtermist/rationalist EA memes/ecosystem were very likely causally responsible for some of the worst capabilities externalities in the last decade; I don’t have a sense of how bad it is overall because counterfactuals are really hard but I don’t think it’s plausible that the negative impact was small. I’m pretty confused about whether people with thought processes like mine have been historically net positive or net negative; I can see a strong case either way. The whole thing had a pretty direct effect on me being depressed for most of this year (with the obvious caveat that etiology is hard for mental illness stuff, and being sad for cosmic reasons is one of the most self-flattering stories I could have for melancholy). Interestingly, I think the emotional effect is much larger than I would’ve ex ante predicted; if you had asked me in 2017 whether I thought longtermist work might be net negative, I don’t think my numbers would’ve been that different. I guess the specific details and concreteness did matter.
I have a lot of sympathy for people who decided to be a bit more checked out of morality, or decided to give up on this whole AI thing and focus on just reducing suffering in the next few decades (I think farmed animal welfare is the most popular candidate). But ultimately I think they’re wrong. The future is still going to be big, and likely really wild, and likely at least somewhat contingent. Knowing (or at least having a high probability) that people near us did a bunch of harmful stuff in the past is certainly an argument for being much more careful going forwards (as well as a number of more concrete and specific updates), but not really a good case to just roll over. (In the abstract, I do think it’s more plausible that for some people acting now is wrong compared to retreating to the woods for a year and thinking really hard; as an empirical matter when I did weaker versions of that, the effect was basically between useless and negative).
I think it’s a bit more feasible if you’re willing to make >3 OOMs sacrifice in expected positive impact. But still pretty rough. Some green energy stuff might be safe? Maybe try to convince doomsday preppers to be nicer people? I confess to not thinking much about it; I think some of the Oxford people might have a better idea.
I truly, truly appreciate reading this.
If you’re thinking of the work I’m thinking of, I think about zero of it came from people aiming at safety work and producing externalities, and instead about all of it was people in the community directly working on capabilities or capabilities-adjacent projects, with some justification or the other.
(personal opinions)
Yeah, most of the things I’m thinking of didn’t look like technical safety stuff, more like Demis and Shane being concerned about safety → decided to found DeepMind, Eliezer introducing Demis and Shane to Peter Thiel (their first funder), etc.
In terms of technical safety stuff, sign confusion around RLHF is probably the strongest candidate. I’m also a bit worried about capabilities externalities of Constitutional AI, for similar reasons. There’s also the general vibes issue of safety work (including quite technical work) and communications either making AI capabilities seem more cool* or seem less evil (depending on your framing).
EDIT to add: I feel like in Silicon Valley (and maybe elsewhere but I’m most familiar with Silicon Valley) there’s a certain vibe of coolness being more important than goodness, which feels childish to me but afaict seems like a real thing. This Altman tweet seems emblematic of that mindset.
Yeah, I definitely think this is true to some extent. “First get impact, then worry about the sign later” and all.
This seems like an important point, and it’s one I’ve not heard before. (At least, not outside of cluelessness or specific concerns around AI safety speeding up capabilities; I’m pretty sure that most EAs I know have ~100% confidence that what they’re doing is net positive for the long-term future.)
I’m super interested in how you might have arrived at this belief: would you be able to elaborate a little? For instance, is there a theoretical argument going on here, like a weak form of cluelessness? Or is it more empirical, for example, did you get here through evaluating a bunch of grants and noticing that even the best seem to carry 30-ish percent downside risk? Something else?
Really? Without giving away names, can you tell me roughly what cluster they are in? Geographical area, age range, roughly what vocation (technical AI safety/AI policy/biosecurity/community building/earning-to-give)?
Definitely closer to the former than the latter! Here are some steps in my thought process:
The standard longtermist cluelessness arguments (“you can’t be sure if eg improving labor laws in India is good because it has uncertain effects on the population and happiness of people in Alpha Centauri in the year 4000”) don’t apply in full force if you buy high near-term (10-100 years) probability of AI doom, and that AI doom is astronomically bad and avoidable.
or (less commonly on LW but more common in some other EA circles) other sources of hinge of history like totalitarian lock-in, s-risks, biological tech doom, etc
If you assign low credence in any hinge of history hypothesis, I think you are still screwed by the standard cluelessness arguments, unfortunately.
But even with a belief in x-risk hinge of history, cluelessness still applies significantly. Knowing whether an action reduces x-risk is much easier in relative terms than knowing whether an action will improve the far future in the absence of x-risk, but it’s still hard in absolute terms.
If we drill down on a specific action and a specific theory of change (“I want to convince a specific Senator to sign a specific bill to regulate the size of LLM models trained in 2024”, “I want to do this type of technical research to understand this particular bug in this class of transformer models, because better understanding of this bug can differentially advance alignment over capabilities at Anthropic if Anthropic will scale up this type of model”), any particular action’s impact is just built on a tower of conjunctions and it’s really hard to get any grounding to seriously argue that it’s probably positive.
So how do you get any robustness? You imagine the set of all your actions as slightly positive bets/positively biased coin flips (eg a grantmaker might investigate 100+ grants in a year, something like deconfusion research might yield a number of different positive results, field-building for safety might cause a number of different positive outcomes, you can earn-to-give for multiple longtermist orgs, etc). If heads are “+1” and tails are “-1”, and you have a lot of flips, then the central limit theorem gets you a nice normal distribution with a positive mean and thin tails.
Unfortunately the real world is a lot less nice than this because:
the impacts of your different actions are heavy-tailed, likely in both directions.
A concrete example is that maybe a really unexpectedly bad grant can wipe out all of the positive impact your good grants have gotten, and then some.
the impact and theories of change of all your actions likely share a worldview and have internal correlations
eg, “longtermist EA fieldbuilding” has multiple theories of impact, but you can be wrong about a few important things and e.g. (almost) all of them might end up differentially advancing capabilities over alignment, in very correlated ways.
You might not have all that many flips that matter
The real world is finite, your life is finite, etc, so even if in the limit your approach is net positive, there’s no guarantee that in practice your actions are net positive before either you die or the singularity happens.
That doesn’t mean it’s wrong to dedicate your life to a single really important bet! (as long as you are obeying reasonable deontological and virtue ethics constraints, you’re trying your best to be reasonable, etc).
For people in those shoes, a possibly helpful mental motion is to try to think less of individual impact and more communally. Maybe it’s like voting: individual votes are ~useless but collectively people-who-think-like-you can hopefully vote for a good leader. If enough people-like-you follow an algorithm of “do unlikely-to-work research projects that are slightly positive in expectation”, collectively we can do something important.
probably a few other things I’m missing.
So the central modeling issues become a) how many flips you get, b) how likely all the flips are dominated by a single coin, c) how much internal correlation there is between each coin flip.
And my gut is like, it seems like you get a fair number of flips, it’s reasonably likely but not certain that one (or a few) flips dominate, and the internal correlation is high but not 1 (and not very close to 1).
There are a few more thoughts I have but that’s the general gist. Unfortunately it’s not very mathematical/quantitative or much of a model; my guess is that both more conceptual thinking and more precise models can yield some more clarity, but ultimately we (or at least I) will still end up fairly confused even after that.
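As a rough illustration of the three modeling questions above (how many flips you get, whether one flip dominates, and how correlated the flips are), here is a toy Monte Carlo sketch in Python. Everything in it is an assumption made up for illustration: the function name `prob_net_positive`, the 55% chance of heads, the Pareto tail index, and the probability that a shared worldview is wrong are not estimates from anyone in this thread.

```python
import random


def prob_net_positive(n_flips=100, p_heads=0.55, tail_alpha=None,
                      p_worldview_wrong=0.0, n_trials=4000, seed=0):
    """Estimate P(sum of all bet outcomes > 0) by simulation.

    tail_alpha: if set, each bet's size is Pareto-distributed (heavy-tailed)
        instead of exactly 1. Smaller values mean heavier tails.
    p_worldview_wrong: chance that a shared assumption is wrong, which flips
        the sign of every bet in that trial at once (a crude correlation).
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    positive_trials = 0
    for _ in range(n_trials):
        # Hypothetical shared factor: one draw per trial, applied to all bets.
        sign = -1 if rng.random() < p_worldview_wrong else 1
        total = 0.0
        for _ in range(n_flips):
            outcome = 1 if rng.random() < p_heads else -1
            size = rng.paretovariate(tail_alpha) if tail_alpha else 1.0
            total += sign * outcome * size
        if total > 0:
            positive_trials += 1
    return positive_trials / n_trials


if __name__ == "__main__":
    print("independent +1/-1 flips:", prob_net_positive())
    print("heavy-tailed bet sizes: ", prob_net_positive(tail_alpha=1.1))
    print("shared worldview risk:  ", prob_net_positive(p_worldview_wrong=0.2))
```

Under these made-up parameters, the independent-flips case comes out clearly more likely to be net positive than either the heavy-tailed or the correlated case, which matches the qualitative point above. The specific numbers depend entirely on the assumed inputs and should not be read as estimates of anything real.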
I’m also interested in thoughts from other people here; I’m sure I’m not the only person who is worried about this type of thing.
(Also please don’t buy my exact probabilities. They are very much not resilient. Like I’m pretty sure if I thought about it for 10 years (without new empirical information) the probability can’t be much higher than 90%, and I’m pretty sure the probabilities are high enough to be non-Pascalian, so not as low as say 50% + 1-in-a-quadrillion, but anywhere in between seems kinda defensible).
“I’m pretty sure that most EAs I know have ~100% confidence that what they’re doing is net positive for the long-term future”
Fwiw, I think this is probably true for very few if any of the EAs I’ve worked with, though that’s a biased sample.
I wonder if the thing giving you this vibe might be that they actually think something like “I’m not that confident that my work is net positive for the LTF but my best guess is that it’s net positive in expectation. If what I’m doing is not positive, there’s no cheap way for me to figure it out, so I am confident (though not ~100%) that my work will keep seeming positive EV to me for the near future.” One informal way to describe this is that they are confident that their work is net positive in expectation/ex ante but not that it will be net positive ex post.
I think this can look a lot like somebody being ~sure that what they’re doing is net positive even if in fact they are pretty uncertain.
One way I think about this is there are just so many weird (positive and negative) feedback loops and indirect effects, so it’s really hard to know if any particular action is good or bad. Let’s say you fund a promising-seeming area of alignment research – just off the top of my head, here are several ways that grant could backfire:
• the research appears promising but turns out not to be, but in the meantime it wastes the time of other alignment researchers who otherwise would’ve gone into other areas
• the research area is promising in general, but the particular framing used by the researcher you funded is confusing, and that leads to slower progress than counterfactually
• the researcher you funded (unbeknownst to you) turns out to be toxic or otherwise have bad judgment, and by funding him, you counterfactually poison the well on this line of research
• the area you fund sees progress and grows, which counterfactually sucks up lots of longtermist money that otherwise would have been invested and had greater effect (say, during crunch time)
• the research is somewhat safety-enhancing, to the point that labs (facing safety-capabilities tradeoffs) decide to push capabilities further than they otherwise would, and safety is hurt on net
• the research is somewhat safety-enhancing, to the point that it prevents a warning shot, and that warning shot would have been the spark that would have inspired humanity to get its game together regarding combatting AI X-risk
• the research advances capabilities, either directly or indirectly
• the research is exciting and draws the attention of other researchers into the field, but one of those researchers happens to have a huge, tail negative effect on the field outweighing all the other benefits (say, that particular researcher has a very extreme version of one of the above bullet points)
• Etcetera – I feel like I could do this all day.
Some of the above are more likely than others, but there are just so many different possible ways that any particular intervention could wind up being net negative (and also, by the same token, could alternatively have indirect positive effects that are similarly large and hard to predict).
Having said that, it seems to me that on the whole, we’re probably better off if we’re funding promising-seeming alignment research (for example), and grant applications should be evaluated within that context. On the specific question of safety-conscious work leading to faster capabilities gains, insofar as we view AI as a race between safety and capabilities, it seems to me that if we never advanced alignment research, capabilities would be almost sure to win the race, and while safety research might bring about misaligned AGI somewhat sooner than it otherwise would occur, I have a hard time seeing how it would predictably increase the chances of misaligned AGI eventually being created.
I’m not sure which of the people “have ties to dangerous organizations such as Anthropic” in the post (besides Shauna Kravec & Nova DasSarma, who work at Anthropic), but of the current fund managers, I suspect that I have the most direct ties to Anthropic and OAI through my work at ARC Evals. I also have done a plurality of grant evaluations in AI Safety in the last month. So I think I should respond to this comment with my thoughts.
I personally empathize significantly with the concerns raised by Linch and Oli. In fact, when I was debating joining Evals last November, my main reservations centered around direct capabilities externalities and safety washing.
I will say the following facts about AI Safety advancing capabilities:
Empirically, when we look at previous capability advancements produced by people working in the name of “AI Safety” from this community, the overwhelming majority were produced by people who were directly aiming to improve capabilities.
That is, they were not capability externalities from safety research, so much as direct capabilities work.
E.g., it definitely was not the case that GPT-3 was a side effect of alignment research, and OAI and Anthropic are both orgs that explicitly focus on scaling and keeping at the frontier of AI development.
I think the sole exception is a few people who started doing applied RLHF research. Yeah, I think the people who made LLMs commercially viable via RLHF did not do a good thing. My main uncertainty is what exactly happened here and how much we contribute to this on the margin.
I generally think that research is significantly more useful when it is targeted (this is a very common view in the community as well). I’m not sure what the exact multiplier is, but I think targeted, non-foundational research is probably 10x more effective than incidentally related research. So the net impact of safety research on capabilities via externalities is probably significantly smaller than the impact of safety research on safety research, or the impact of targeted capabilities research on capabilities research.
I think this point is often overstated or overrated, but the scale of capabilities researchers at this point is really big, and it’s easy to overestimate the impact of one or two particular high profile people.
For what it’s worth, I think that if we are to actually produce good independent alignment research, we need to fund it, and LTFF is basically the only funder in this space. My current guess is a lack of LTFF funding is probably producing more researchers at Anthropic than otherwise, because there just aren’t that many opportunities for people to work on safety or safety-adjacent roles. E.g. I know of people who are interviewing for Anthropic capability teams because idk man, they just want a safety-adjacent job with a minimal amount of security, and it’s what’s available. Having spoken to a bunch of people, I strongly suspect that of the people that I’d want to fund but won’t be funded, at least a good fraction are significantly less likely to join a scaling lab if they were funded, and not more.
(Another possibly helpful datapoint here is that I received an offer from Anthropic last December, and I turned them down.)
I think this is true at the current margin, because we have so little money. But if we receive, say, enough funding to lower the bar to closer to what our early 2023 bar was, I will still want to make skill-up grants to fairly talented/promising people, and I still think they are quite cost-effective. I do expect those grants to have more capabilities externalities (at least in terms of likelihood, maybe in expectation as well) than when we give grants to people who currently could be hired at (eg) Anthropic but choose not to.
It’s possible you (and maybe Oli?) disagree and think we should fund moderate-to-good direct work projects over all (or almost all) skillup grants; in that case this is a substantive disagreement about what we should do in the future.
That feels concerning. Are there any obvious things that would help with this situation, eg: better career planning and reflection resources for people in this situation, AI safety folks being more clear about what they see as the value/disvalue of working in those types of capability roles?
Seems weird for someone to explicitly want a “safety-adjacent” job unless there are weird social dynamics encouraging people to do that even when there isn’t positive impact to be had from such a job.
FWIW, I am also very worried about this and it feels pretty plausible to me. I don’t have any great reassurances, besides me thinking about this a lot and trying somewhat hard to counteract it in my own grant evaluations, but I only do a small minority of grant evaluations on the LTFF these days.
I do want to clarify that I think it’s unlikely that AI Safety is a front for advancing AI capabilities. I think the framing that’s more plausibly true is that AI Safety is a memespace that has undergone regulatory capture by capability companies and people in the EA network to primarily build out their own influence over the world.
Their worldview is of course heavily influenced by concerns about the future of humanity and how it will interact with AI, but in a way that primarily leverages symmetric weapons and does not involve much of any accountability or public reasoning about their risk models, which seem substantially skewed by the fact that people are making billions of dollars off of advances in AI capabilities, and are substantially worried that people they don’t like will get to control AI.
I do also think this is just one framing, and there are a lot of other things going on.
Have you looked at Orthogonal? They’re pretty damn culturally inoculated against doing-capabilities-(even-by-accident), and they’re extremely funding constrained.
UPDATE 2023/09/13:
Including only money that has already landed in our bank account and extremely credible donor promises of funding, LTFF has raised ~1.1M and EAIF has raised ~500K. After Open Phil matching, this means LTFF now has ~3.3M additional funding and EAIF has ~1.5m in additional funding.
We are also aware that other large donors, including both individuals and non-OP institutional donors, are considering donating to us. In addition, while some recurring donors have likely moved up their donations to us because of our recent unusually urgent needs, it is likely that we will still accumulate some recurring donations in the coming months as well. Thus, I think at least some of the less-certain sources of funding will come through. However, I decided to conservatively not include them in the estimate above.
From my (Linch’s) perspective, this means both LTFF and EAIF are no longer very funding constrained for the time period we wanted to raise money for (the next ~6 months); however, both funds are still funding constrained and can productively make good grants with additional funding.
To be more precise, we estimated a good target spend rate for LTFF as 1M/month, and a good target spend rate for EAIF as ~800k/month. The current funds will allow LTFF to spend ~550k/month and EAIF to spend ~250k/month, leaving gaps of roughly 450k/month and 550k/month, respectively. More funding is definitely helpful here, as more money will allow both funds to productively make good grants[1].
Open Phil’s matching is up to 3.5M from OP (or 1.75M from you) for each fund. This means LTFF would need ~650k more before maxing out on OP matching, and EAIF would need ~1.25M more. Given my rough estimate of funding needs above, which is ~6.2M/6 months for LTFF and ~5M/6 months for EAIF, this means LTFF would ideally like to receive 1M above the OP matching.
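The arithmetic in this update can be checked with a short Python sketch. The input figures are the ones quoted above (amounts raised, the 2:1 match capped at 3.5M from Open Phil per fund, the target monthly spend rates, and a 6-month window); the function and variable names are made up for illustration, and the ~6.2M and ~5M six-month needs are taken as stated rather than derived here.

```python
# All amounts in USD. Inputs are the figures quoted in the update above.
MATCH_RATIO = 2                  # Open Phil gives $2 for every $1 donated
MATCH_CAP_FROM_OP = 3_500_000    # per-fund cap on Open Phil's matching
MONTHS = 6                       # fundraising window

funds = {
    "LTFF": {"raised": 1_100_000, "target_per_month": 1_000_000},
    "EAIF": {"raised": 500_000, "target_per_month": 800_000},
}


def summarize(raised, target_per_month):
    """Return matched totals, monthly spend, and remaining gaps for one fund."""
    matched = min(raised * MATCH_RATIO, MATCH_CAP_FROM_OP)
    total = raised + matched
    spend_per_month = total / MONTHS
    # Donations still needed before Open Phil's match is exhausted.
    to_max_match = max(MATCH_CAP_FROM_OP / MATCH_RATIO - raised, 0)
    return {
        "total_with_match": total,
        "spend_per_month": spend_per_month,
        "gap_per_month": target_per_month - spend_per_month,
        "donations_to_max_match": to_max_match,
    }


if __name__ == "__main__":
    for name, inputs in funds.items():
        print(name, summarize(**inputs))
```

With these inputs the sketch reproduces the numbers in the update: ~3.3M and ~1.5M in total additional funding, ~550k and ~250k per month, gaps of ~450k and ~550k per month, and ~650k and ~1.25M still needed to max out the match.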
I appreciate donors’ generosity and commitment to improving the world. I hope the money will be used wisely and cost-effectively.
I plan to write a high-level update and reflections post[2] on the EA Forum (crossposted to LessWrong) after LTFF either a) reaches our estimated funding target or b) decides to deprioritize fundraising, whichever comes earlier.
I’d be happy for you guys to send some grants my way for me to fund via my Manifund pot if it’d be helpful.
Thank you, that would be great!
I am a smaller donor (<$10k/yr) who has given to the LTFF in the past. As a data point, I would be very interested in giving to a dedicated AI Safety fund.
It would help at least me if I could donate to you through every.org, so I can use my standard interface.
That’s helpful feedback; if others would find donating through every.org helpful (which they can signal by agree-voting with the parent comment), I’d be happy to look into this.
I think we can be very flexible for donations over $30k, so if you’re interested in making a donation of that size feel free to dm me and I am sure we can figure something out.
(Why doesn’t this post have the “crossposted to EA Forum” thingie in the comments section?)
The crosspost was manual rather than automatic; for some reason the automatic crossposting doesn’t work for me (Chrome on macOS).
Crossposting has been broken for 1-2 months; the LW and EAF teams know about this.