Summary of our Workshop on Post-AGI Outcomes
Last month we held a workshop on Post-AGI outcomes. This post is a list of all the talks, with short summaries, as well as my personal takeaways.
The first keynote was @Joe Carlsmith on “Can Goodness Compete?”. He asked: can anyone compete with “Locusts”, those who want to use all resources to replicate as fast as possible?
Longer version with transcript
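To make the competitive pressure behind that question concrete, here’s a toy replicator sketch of my own (not a model from the talk): two strategies split a shared resource pool, and the one that reinvests everything into replication ends up with nearly all of it, even from a tiny starting share.

```python
# Toy sketch (mine, not Carlsmith's model): two strategies compete for shares
# of a resource pool. "Locusts" reinvest everything into replication;
# "Stewards" spend some resources on other values, so they grow more slowly.
locust_share, steward_share = 0.01, 0.99    # initial shares of resources
locust_growth, steward_growth = 1.10, 1.05  # per-step growth factors

for _ in range(200):
    locust = locust_share * locust_growth
    steward = steward_share * steward_growth
    total = locust + steward
    locust_share, steward_share = locust / total, steward / total

print(f"Locust share after 200 steps: {locust_share:.3f}")  # -> ~0.99
```

The numbers are arbitrary; the point is just how quickly a small growth-rate edge compounds, which is the dynamic the talk asks whether goodness can withstand.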
The second keynote was @Richard_Ngo on “Flourishing in a highly unequal world”. He argued that future beings will vary greatly in power and intelligence, so we should aim for “healthy asymmetric” relations, analogous to those between parent and child.
Morgan MacInnes of U Toronto Political Science spoke on “The history of technologically provoked welfare erosion”. His work with Allan Dafoe argued that competitive pressure sometimes forces states to treat their own citizens badly.
The next talk was a direct rebuttal to Morgan’s talk! Liam Patell of GovAI spoke on “Evolutionary Game Theory and the Structure of States”, arguing that if there are only two states, there is an equilibrium that maintains welfare.
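I don’t have Liam’s actual model, so here is only a minimal sketch of the kind of claim, with made-up payoffs: in a two-state game where mutual welfare erosion leaves everyone worse off, “both maintain welfare” can be a Nash equilibrium, because neither state gains by eroding unilaterally.

```python
# Minimal sketch with made-up payoffs (not the talk's model): two states each
# choose to "maintain" citizen welfare or "erode" it for a competitive edge.
import itertools

ACTIONS = ("maintain", "erode")
# payoffs[(a1, a2)] = (payoff to state 1, payoff to state 2)
payoffs = {
    ("maintain", "maintain"): (3, 3),
    ("maintain", "erode"):    (1, 2),
    ("erode",    "maintain"): (2, 1),
    ("erode",    "erode"):    (0, 0),
}

def is_nash(a1, a2):
    """True if neither state can do better by unilaterally switching action."""
    p1, p2 = payoffs[(a1, a2)]
    best1 = all(payoffs[(alt, a2)][0] <= p1 for alt in ACTIONS)
    best2 = all(payoffs[(a1, alt)][1] <= p2 for alt in ACTIONS)
    return best1 and best2

for a1, a2 in itertools.product(ACTIONS, repeat=2):
    if is_nash(a1, a2):
        print(a1, a2)  # -> maintain maintain
```

The specific numbers are mine; the point is just that with two players, mutual restraint can be self-enforcing.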
Jacob Steinhardt, CEO of Transluce, spoke on “Post-AGI Game Theory”, i.e. how future AIs will influence their own development. He had a concrete proposal: flood the internet with high-quality examples of AI acting on good values. The idea is that this data would make future LLMs more aligned by default. Kind of like reading moral fables to kids?
Anna Yelizarova of the Windfall Trust spoke on “Scenario Planning for Transformative AI’s Economic Impact”, i.e. predicting where wealth might be concentrated and what empirical evidence might tell us about where we’re headed.
@Fazl of Oxford spoke on “Resisting AI-Enabled Authoritarianism”, specifically about which AI capabilities empower states versus citizens.
Ryan Lowe of the Meaning Alignment Institute spoke on “Co-Aligning AI and Institutions”. Their Full-stack Alignment work argues that alignment strategy needs to consider the institutions in which AI is developed and deployed. He also mentioned “Thick Models of Value”, outlining the practical problems of specifying values through unstructured text (e.g. system prompts or constitutions) or preference orderings.
Stephen Casper (aka Cas) of MIT and the UK AISI spoke on “Taking the Proliferation of Highly-Capable AI Seriously”. He discussed the consequences of very capable open-weight models, and what practices could reduce their dangers even when such models are widely distributed.
Tianyi Qiu of Peking University spoke on “LLM-Mediated Cultural Feedback Loops”. They empirically studied “culture lock-in”, where LLM output affects human output and causes a feedback loop that locks in a certain belief, value, or practice.
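As a toy version of that loop (my sketch, not the paper’s model): suppose an LLM fit to current human output slightly over-represents the majority view, and humans then partly imitate the LLM. Even a small initial skew gets amplified until one belief dominates.

```python
# Toy feedback loop (my sketch, not the paper's model). p is the share of
# humans expressing belief A; the rest express belief B.
p = 0.55          # small initial skew toward belief A
imitation = 0.5   # how strongly humans shift toward the LLM's output

for _ in range(30):
    # An LLM fit to current human output that over-represents the majority
    # view (a crude stand-in for mode collapse / distribution sharpening).
    llm = p**2 / (p**2 + (1 - p)**2)
    # Humans partly imitate what the LLM produces.
    p = (1 - imitation) * p + imitation * llm

print(f"Share expressing belief A after 30 rounds: {p:.3f}")  # -> ~1.0
```

Whether real LLM–human loops sharpen anywhere near this hard is exactly the empirical question the talk addressed.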
Beatrice Erkers of Existential Hope covered two near-future scenarios:
1) A tool-AI scenario, based on coordination to limit agentic AGI, and
2) A d/acc scenario, based on decentralized tech development.
Aviv Ovadya of the AI & Democracy Foundation spoke on “Democratic Capabilities for Good AGI Equilibria”, i.e. how to upgrade our institutions to handle pressures from AI development, e.g. by introducing AI delegates or new coordination mechanisms.
Kirthana Singh Khurana of UBC Law spoke on “Corporations as Alignment Mechanism Laboratories”, making the case that we face similar alignment problems in aligning both corporations and AI to the public good. An audience suggestion after the talk was that we should study how corporations become misaligned with their shareholders, and the various mechanisms that have been tried to prevent this.
My Takeaways:
Historical parallels are good for getting people to think about the future in near mode. The world will change massively, but it’s not a total mystery what the main forces affecting the future are likely to be.
I wish we had prompted the speakers to articulate more precise hypotheses about the future, even if any particular one sounds implausible. I think at this stage brainstorming is useful, and speculation by experts is undersupplied (maybe because it looks relatively amateurish). Plus, I think this exercise would make it clearer to outsiders just how undeveloped thinking in this area is in general.
Still, no one has proposed what I’d consider an especially plausible trajectory in which human interests are respected post-AGI! It’s not obvious we’re doomed, but the best plan still seems to be basically “ask AI what to do and hope property rights hold”.
One of my hopes with this workshop was that it would make more people realize that there is still basically no plan or positive vision out there. But I’m worried it’ll have the opposite effect, of making it seem like experts are calmly handling the situation.
On the other hand, I don’t think a conference focused on “there is no plan, should we freak out more?” would be productive, nor would it gather much attention. I’m all ears on how to thread this needle.
What’s next:
We’ll be hosting the next iteration of this workshop on Dec 3rd, co-located with NeurIPS in San Diego. This time it’ll be titled: Post-AGI Culture, Economics, and Governance.
We already have a pretty good speaker lineup, including:
Max Tegmark, MIT & Future of Life Institute
Anton Korinek, University of Virginia (tentative)
Iason Gabriel, DeepMind
Alex Tamkin, Anthropic Societal Impacts Team
Anders Sandberg, Institute for Futures Studies
Ivan Vendrov, Midjourney
Ajeya Cotra, Open Philanthropy (tentative)
Michiel Bakker, DeepMind & MIT
Beren Millidge, Zyphra
Atoosa Kasirzadeh, Carnegie Mellon University
Deger Turan, Metaculus
Wil Cunningham, DeepMind & University of Toronto
We’re all ears for suggestions on what this next event should focus on, advice on field-building, or anything else you’d like to see!