Stephen McAleese
Computer science master’s student interested in AI and AI safety.
GPT-4 Predictions
An Overview of the AI Safety Funding Situation
Estimating the Current and Future Number of AI Safety Researchers
Summary of “AGI Ruin: A List of Lethalities”
Could We Automate AI Alignment Research?
Retrospective on ‘GPT-4 Predictions’ After the Release of GPT-4
This sounds more or less correct to me. Open Philanthropy (Open Phil) is the largest AI safety grant maker and spent over $70 million on AI safety grants in 2022 whereas LTFF only spent ~$5 million. In 2022, the median Open Phil AI safety grant was $239k whereas the median LTFF AI safety grant was only $19k.
Open Phil and LTFF made 53 and 135 AI safety grants respectively in 2022. This means the average Open Phil AI safety grant in 2022 was ~$1.3 million whereas the average LTFF AI safety grant was only $38k. So the average Open Phil AI safety grant is ~30 times larger than the average LTFF grant.
These calculations imply that Open Phil and LTFF make a similar number of grants (LTFF actually makes more) and that Open Phil spends much more simply because its grants tend to be much larger (~30x). So it seems like funds may be more constrained by their capacity to evaluate and administer grants than by a lack of funding. This is not surprising given that the LTFF grantmakers apparently work part-time.
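As a rough sanity check, here is the arithmetic behind those averages as a minimal sketch; the totals and grant counts are the approximate 2022 figures quoted above, not exact data.

```python
# Back-of-the-envelope check of the average grant sizes discussed above.
# Figures are approximate 2022 totals, rounded.
open_phil_total = 70_000_000   # ~$70M of Open Phil AI safety grants
ltff_total = 5_000_000         # ~$5M of LTFF AI safety grants
open_phil_grants = 53
ltff_grants = 135

open_phil_avg = open_phil_total / open_phil_grants   # ~$1.3M per grant
ltff_avg = ltff_total / ltff_grants                  # ~$37k per grant

print(f"Open Phil average grant: ${open_phil_avg:,.0f}")
print(f"LTFF average grant:      ${ltff_avg:,.0f}")
print(f"Ratio: ~{open_phil_avg / ltff_avg:.0f}x")    # roughly 30-40x depending on rounding
```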
Counterintuitively, it may be easier for an organization (e.g. Redwood Research) to get a $1 million grant from Open Phil than it is for an individual to get a $10k grant from LTFF. The reason is that both grants probably require a similar amount of administrative effort, and a well-known organization is probably more likely than an individual to be trusted to use the money well, so the decision is easier to make. This example illustrates how decision-making and grant-making processes are probably just as important as the total amount of money available.
LTFF specifically could be funding-constrained though given that it only spends ~$5 million per year on AI safety grants. Since ~40% of LTFF’s funding comes from Open Phil and Open Phil has much more money than LTFF, one solution is for LTFF to simply ask for more money from Open Phil.
I don’t know why Open Phil spends so much more on AI safety than LTFF (~14x more). Maybe it’s simply because of some administrative hurdles that LTFF has when requesting money from Open Phil or maybe Open Phil would rather make grants directly.
Here is a spreadsheet comparing how much Open Phil, LTFF, and the Survival and Flourishing Fund (SFF) spend on AI safety per year.
Plug: I recently published a long post on the EA Forum on AI safety funding: An Overview of the AI Safety Funding Situation.
GPT-4 is both the most capable model and the one trained with the most compute, which suggests that compute is the most important factor for capabilities. If that weren’t true, we would expect to see some other company training models with more compute but worse performance, which doesn’t seem to be happening.
LLMs aren’t that useful for alignment experts because it’s a highly specialized field and there isn’t much relevant training data. The AI Safety Chatbot partially solves this problem using retrieval-augmented generation (RAG) on a database of articles from https://aisafety.info. There also seem to be plans to fine-tune it on a dataset of alignment articles.
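As an illustration, here is a minimal sketch of how retrieval-augmented generation over a collection of alignment articles could work. The embedding model, the example documents, and the omitted final LLM call are placeholder assumptions, not the AI Safety Chatbot's actual implementation.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # any embedding model would do

# Hypothetical corpus of alignment articles (in practice, drawn from aisafety.info).
documents = [
    "Instrumental convergence means that many final goals imply similar subgoals...",
    "Reward hacking occurs when an agent exploits flaws in its reward function...",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_embeddings = model.encode(documents, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (cosine similarity)."""
    query_embedding = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_embeddings @ query_embedding
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

def build_prompt(query: str) -> str:
    """Stuff the retrieved articles into the prompt sent to the language model."""
    context = "\n\n".join(retrieve(query))
    return f"Answer using the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

# The prompt would then be passed to an LLM via an API call (omitted here).
print(build_prompt("What is reward hacking?"))
```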
Thanks for the post. I think it’s a valuable exercise to think about how AI safety could be accelerated with unlimited money.
I think the Manhattan Project idea is interesting but I see some problems with the analogy:
The Manhattan Project was originally a military project, and to this day the military is primarily funded and managed by the government. But most progress in AI today is made by companies such as OpenAI and Google and universities like the University of Toronto. I think a more relevant comparison is CERN because it’s more recent and focused on the non-military development of science.
The Manhattan Project happened a long time ago and the world has changed a lot since then. The wealth and influence of tech companies and universities is probably much greater today than it was then.
It’s not obvious that a highly centralized effort is needed. The Alignment Forum, open source developers, and the academic research community (e.g. the ML research community) are examples of decentralized research communities that seem to be highly effective at making progress. This probably wasn’t possible in the past because the internet didn’t exist.
I highly doubt that it’s possible to recreate the Bay Area culture in a top-down way. I’m pretty sure China has tried this and I don’t think they’ve succeeded.
Also, I think your description overemphasizes the importance of geniuses like Von Neumann because 130,000 other people worked on the Manhattan Project too. I think something similar happens at Google today: Jeff Dean is revered, but in reality most progress at Google is made by the tens of thousands of smart but non-genius ‘dark matter’ developers there.
Anyway, let’s assume that we have a giant AI alignment project that would cost billions. To fund this, we could:
Expand EA funding substantially using community building.
Ask the government to fund the project.
The government has a lot of money, but it seems more challenging to convince it to fund AI alignment than to get funding from EA. So maybe some EAs with government expertise could work with the government to increase AI safety investment.
If the AI safety project gets EA funding, I think it needs to be cost-effective. The reality is that only ~12% of Open Phil’s money is spent on AI safety. The reason is that there is a triage situation with other cause areas like biosecurity, farm animal welfare, and global health and development, so the goal is to find cost-effective ways to spend money on AI safety. The project needs to be competitive and have more value on the margin than other proposals.
In my opinion, the government projects that are most likely to succeed are those that build on or are similar to recent successful projects and are in the Overton window. For example:
AI Centres for Doctoral Training in the UK: funding PhD students in the UK to work on AI projects such as AI safety.
The NSF Safe Learning-Enabled Systems program: US government funding for academic research groups and non-profits to work on AI safety.
My guess is that leveraging academia would be effective and scalable because you can build on the pre-existing talent, leadership, culture, and infrastructure. Alternatively, governments could create new regulations or laws to influence the behavior of companies (e.g. GDPR). Or they could found new think tanks or research institutes possibly in collaboration with universities or companies.
As for the school ideas, I’ve heard that Lee Sedol went to a Go school and, as you mentioned, Soviet chess was fueled by Soviet chess programs. China has intensive sports schools, but I doubt these kinds of schools would be considered acceptable in Western countries, which is an important consideration given that most AI safety work happens in Western countries like the US and UK.
In science fiction, there are even more extreme programs like the Spartan program in Halo where children were kidnapped and turned into super soldiers, or Star Wars where clone soldiers were grown and trained in special facilities.
I don’t think these kinds of extreme programs would work. Advanced technologies like human cloning could take decades to develop and are illegal in many countries. Also, they sound highly unethical which is a major barrier to their success in modern developed countries like the US and especially EA-adjacent communities like AI safety.
I think a more realistic idea is something like the Atlas Fellowship or SERI MATS which are voluntary programs for aspiring researchers in their teens or twenties.
The geniuses I know of that were trained from an early age in Western-style countries are Mozart (music), Von Neumann (math), John Stuart Mill (philosophy), and Judit Polgár (chess). In all these cases, they were gifted children who lived in normal nuclear families and had ambitious parents and extra tutoring.
In my opinion, much of the value of interpretability is not related to AI alignment but to AI capabilities evaluations instead.
For example, the Othello paper shows that a transformer trained on next-token prediction of Othello moves learns a world model of the board rather than just surface statistics of the training sequences. This knowledge is useful because it suggests that transformer language models are more capable than they might initially seem.
Thanks for the post! It was a good read. One point I don’t think was brought up is the fact that chess is turn-based whereas real life is continuous.
Consequently, the huge speed advantage that AIs have is not that useful in chess because the AI still has to wait for you to make a move before it can move.
But since real life is continuous, if the AI is much faster than you, it could make 1,000 ‘moves’ for every move you make, so speed is a much bigger advantage in real life.
Great post. I also fear that it may not be socially acceptable for AI researchers to talk about the long-term effects of AI despite the fact that, because of exponential progress, most of the impact of AI will probably occur in the long term.
I think it’s important that AI safety and considerations related to AGI become mainstream in the field of AI because it could be dangerous if the people building AGI are not safety-conscious.
I want a world where the people building AGI are also safety researchers rather than one where the AI researchers aren’t thinking about safety and the safety people are shouting over the wall and asking them to build safe AI.
This idea reminds me of how software development and operations were combined into the DevOps role in software companies.
I agree. GPT-4 is an AGI for the kinds of tasks I care about, such as programming and writing. ChatGPT-4 in its current form (with the ability to write and execute code) seems to be at the expert human level in many technical and quantitative subjects such as statistics and programming.
For example, last year I was amazed when I gave ChatGPT-4 one of my statistics past exam papers and it got all the questions right except for one, which involved interpreting an image of a linear regression graph. The questions typically involve understanding the question, thinking of an appropriate statistical method, and doing calculations to find the right answer. Here’s an example question:
Times (in minutes) for a sample of 8 players are presented in Table 1 below. Using an appropriate test at the 5% significance level, investigate whether there is evidence of a decrease in the players’ mean 5k time after the six weeks of training. State clearly your assumptions and conclusions, and report a p-value for your test statistic.
The solution to this question is a paired sample t-test.
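For example, a paired t-test for this kind of before/after question might look like the following sketch; the times below are made-up placeholder data, not the actual values from Table 1.

```python
from scipy import stats

# Hypothetical 5k times (minutes) for 8 players before and after six weeks of training.
before = [22.1, 24.5, 23.0, 25.2, 21.8, 26.0, 24.1, 23.7]
after  = [21.4, 23.9, 22.8, 24.6, 21.9, 25.1, 23.5, 23.0]

# One-sided paired t-test: H1 is that mean time decreased after training,
# i.e. mean(before - after) > 0. Assumes the paired differences are roughly normal.
t_stat, p_value = stats.ttest_rel(before, after, alternative="greater")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# Reject H0 at the 5% significance level if p < 0.05.
print("Evidence of a decrease" if p_value < 0.05 else "No significant decrease")
```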
Sure, GPT-4 has probably seen similar questions before, but so have students, since they can practice on past papers.
This year, one of my professors designed his optimization assignment to be ChatGPT-proof but I found that it could still solve five out of six questions successfully. The questions involved converting natural language descriptions of optimization problems into mathematical formulations and solving them with a program.
One of the few times I’ve seen GPT-4 genuinely struggle with a task is when I asked it to solve a variant of the Zebra Puzzle, a challenging logical reasoning puzzle that involves updating a table based on limited information and using logical reasoning and a process of elimination to find the correct answer.
Context of the post: funding overhang
The post was written in 2021 and argued that there was a funding overhang in longtermist causes (e.g. AI safety) because the amount of funding had grown faster than the number of people working.
Since 2015, the amount of committed capital increased by ~37% per year and the amount of deployed funds by ~21% per year, whereas the number of engaged EAs only grew by ~14% per year.
The introduction of the FTX Future Fund around 2022 caused a major increase in longtermist funding which further increased the funding overhang.
Benjamin linked a Twitter update in August 2022 saying that the total committed capital was down by half because of a stock market and crypto crash. Then FTX went bankrupt a few months later.
The current situation
The FTX Future Fund no longer exists and Open Phil AI safety spending seems to have been mostly flat for the past 2 years. The post mentions that Open Phil is doing this to evaluate impact and increase capacity before possibly scaling more.
My understanding (based on this spreadsheet) is that the current level of AI safety funding has been roughly the same for the past 2 years whereas the number of AI safety organizations and researchers has been increasing by ~15% and ~30% per year respectively. So the funding overhang could be gone by now or there could even be a funding underhang.
Comparing talent vs funding
The post compares talent and funding in two ways:
The lifetime value of a researcher (e.g. $5 million) vs total committed funding (e.g. $1 billion)
The annual cost of a researcher (e.g. $100k) vs annual deployed funding (e.g. $100 million)
A funding overhang occurs when the total committed funding is greater than the combined lifetime cost of all researchers, or when the amount of funding that could be deployed each year is greater than the combined annual cost of all researchers.
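Using the post's illustrative numbers, the overhang check could be sketched like this; the figures and the researcher count are rough examples, not real totals.

```python
# Illustrative funding-overhang check using the rough example figures above.
lifetime_cost_per_researcher = 5_000_000   # e.g. ~$5M lifetime value/cost per researcher
annual_cost_per_researcher = 100_000       # e.g. ~$100k per researcher per year
total_committed_funding = 1_000_000_000    # e.g. ~$1B committed
annual_deployable_funding = 100_000_000    # e.g. ~$100M deployable per year
num_researchers = 300                      # hypothetical field size

lifetime_overhang = total_committed_funding > lifetime_cost_per_researcher * num_researchers
annual_overhang = annual_deployable_funding > annual_cost_per_researcher * num_researchers

print(f"Lifetime overhang: {lifetime_overhang}")  # $1B vs $1.5B -> False
print(f"Annual overhang:   {annual_overhang}")    # $100M vs $30M -> True
```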
Then the post says:
“Personally, if given the choice between finding an extra person for one of these roles who’s a good fit or someone donating $X million per year, to think the two options were similarly valuable, X would typically need to be over three, and often over 10 (where this hugely depends on fit and the circumstances).”
I forgot to mention that this statement was applied to leadership roles like research leads, entrepreneurs, and grantmakers who can deploy large amounts of funds or have a large impact and therefore can have a large amount of value. Ordinary employees probably have less financial value.
Assuming there is no longer a funding overhang in AI safety, the marginal value of funding relative to additional researchers is higher today than it was when the post was written.
The future
If total AI safety funding does not increase much in the near term, AI safety could continue to be funding-constrained or become even more funding-constrained as the number of people interested in working on AI safety increases.
However, the post explains some arguments for expecting EA funding to increase:
There’s some evidence that Open Philanthropy plans to scale up its spending over the next several years. For example, this post says, “We gave away over $400 million in 2021. We aim to double that number this year, and triple it by 2025”. Though the post was written in 2022 so it could be overoptimistic.
According to Metaculus, there is a ~50% chance of another Good Ventures / Open Philanthropy-sized fund being created by 2026 which could substantially increase funding for AI safety.
My mildly optimistic guess is that as AI safety becomes more mainstream, there will be a symmetrical effect where both more talent and more funding are attracted to the field.
Wow, this is an incredible achievement given that AI safety is still a relatively small field. For example, this post by 80,000 Hours said that $10–$50 million was spent globally on AI safety in 2020, according to The Precipice. Therefore this grant is roughly equivalent to an entire year of global AI safety funding!
No offense but I sense status quo bias in this post.
If you replace “AI” with “industrial revolution” I don’t think the meaning of the text changes much and I expect most people would rather live today than in the Middle Ages.
One thing that might be concerning is that older generations (us in the future) might not have the ability to adapt to a drastically different world in the same way that some old people today struggle to use the internet.
I personally don’t expect to be overly nostalgic in the future because I’m not that impressed by the current state of the world: factory farming, the hedonic treadmill, physical and mental illness, wage slavery, aging, and ignorance are all problems that I hope are solved in the future.