Co-lead PauseAI Germany, Member Torchbearer community, Master Computer Science, PhD candidate bioinformatics, transitioning to AI safety Governance & Strategy
Benjamin Schmidt
A few reasons to stop right now:
Each new model shortens the path to ASI and makes a pause more restrictive and difficult.
There might already be a huge gradual disempowerment overhang, which will continue to grow over the coming months/years. Adoption, changing societal norms, other technologies like robots, drones, self-driving cars and smart glasses and … and unknown unknowns are all lagging behind, and thus GD is lagging behind. If we continue, we might already end up in a bad GD future.
There is little overhang, but the concentration of power and gradual disempowerment might make a pause more difficult.
I don’t trust companies or countries to actually pause in the future.
I am curious what you think in the Gradual Disempowerment (GD) direction.
Edit: Kaj’s answer wasn’t there yet, so some of this is probably redundant.
Prayer seems like a mix of meditation and some other practices to me. Probably works for the same reason as those practices. As for how they work? No idea. People are trying to figure it out but it will take a while.
I think it’s more interesting to first figure out whether they do something at all. For that I would focus on finding the most experienced long-term practitioners and checking the phenomenology they report. These effects are often far from placebo effects.
Ignore all their metaphysical claims and most of their claims about how it works; focus just on their phenomenology and, if they are treating or working with someone, that of their patients. At the same time, record a bunch of data on their bodies etc. You can e.g. look into the studies on cessations of consciousness which have come out over the last 3 years (https://www.biorxiv.org/content/10.64898/2026.02.10.705005v1)
There are probably a lot of people out there praying who get some small benefits, and then there are a few people who can just think of god and enter a state with similar phenomenology to MDMA. I would start with those.
What is your p-doom for developing ASI/AGI with anything like the current methods in the current environment? My assumption is that you just think the technical problem is getting solved.
I think you mentioned most of the things I would have mentioned from my impression of Goenka from the outside: Especially not reasonable according to rationalist/post-rationalist standards, caught up in doctrine. Also, kind of narrow meditation experience/expertise because they are stuck in one tradition with doctrine.
I think the combination of some Vipassana and no good maps, i.e. letting people hardcore meditate for 10 days without telling them about most of the possible negative effects, is pretty irresponsible. The Dark Night of the Soul after A&P seems to be a thing at least for some percentage of people, as do psychosis, trauma reactions …
I am curious: if they mentioned A&P and the Vipassana Nanas, did they also mention the Nanas of suffering (5-10)?
For a good teacher I would e.g. recommend Roger Thisdell; you can check him out through a weekly session from his Patreon.
If you haven’t read it, I can recommend meditationbook.page and/or MCTB.
Awesome idea. I think this is one of the ways AI can improve our society.
I recently tried creating something like this for Astral Codex Ten comments.
There were quite a few commenters who wrote things that were 90% incorrect and annoying to check. For these, this feels very helpful. For more nuanced disagreements, maybe something like providing context, or nothing at all, would be better.
I was struggling with finding the right prompt and how much compute to put into checking things.
> The entire post is investigated in a single agentic call. Rather than extracting claims first and investigating them individually, we send the full post text and ask the model to identify claims, investigate them, and return structured output mapping each verdict to a specific text span. This gives the model full context (a claim’s meaning often depends on surrounding paragraphs) and reduces round-trips.
Right now this approach will give AI slop/bad results on both the false positive and true positive side. If I wanted an AI to do this, I would at least ask it to create a list of statements and use a sub-agent for each statement to check it in the context of the whole post. Then I would have another sub-agent double-check each result, possibly with different prompts and a bias towards false negatives. I know this is more expensive, but the way it is right now the output is essentially not worth looking at. (If you choose to respond to only one part of this comment, this seems like the most important one.)
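A minimal sketch of the per-statement approach described above, under loud assumptions: `extract_statements`, `fact_check`, and the two toy checkers are hypothetical names invented for illustration, and the checkers are plain functions standing in for sub-agent calls (no real LLM API is used). A statement is only flagged when both independent passes agree, which is one simple way to bias towards false negatives.

```python
from dataclasses import dataclass
from typing import Callable, List

# A checker sees one statement plus the whole post (for context) and
# returns True if it judges the statement to be incorrect.
# In practice this would be a sub-agent call; here it is just a function.
Checker = Callable[[str, str], bool]


@dataclass
class Verdict:
    statement: str
    flagged: bool  # True only if both passes judged the statement incorrect


def extract_statements(post: str) -> List[str]:
    """Hypothetical stand-in for claim extraction: one statement per sentence."""
    return [part.strip() for part in post.split(".") if part.strip()]


def fact_check(post: str, first_pass: Checker, second_pass: Checker) -> List[Verdict]:
    """Check every statement twice and flag it only when both passes agree."""
    verdicts = []
    for statement in extract_statements(post):
        # Requiring agreement trades missed errors for fewer wrong corrections.
        flagged = first_pass(statement, post) and second_pass(statement, post)
        verdicts.append(Verdict(statement, flagged))
    return verdicts


# Toy checkers (assumptions for the demo, not real fact checking).
def toy_first(statement: str, post: str) -> bool:
    return "flat" in statement


def toy_second(statement: str, post: str) -> bool:
    return "Earth" in statement


post = "The Earth is flat. Water is wet. A flat tax exists."
results = fact_check(post, toy_first, toy_second)
for verdict in results:
    print(verdict.flagged, "-", verdict.statement)
```

The two passes could use different prompts or models; the only design choice shown here is that disagreement between them defaults to not flagging.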
Which prompt to use? Different prompts will produce very different results, which also hints at underlying issues that still exist.
Try to get a high percentage of true corrections, false positives here can be bad.
To get a good fact check right now you need a lot of compute. Maybe people could choose to run different levels? Maybe people could send money back to the person who originally checked it? There could even be the possibility of making a profit.
It doesn’t work well for cutting-edge ideas, especially since AIs are still often overconfident.
Which writing style? I think everyone reading the same default AI writing style is at least bad, possibly quite bad, so switching writing style between comments might be important.
Bad 2nd order effects of everyone converging to the LLM epistemology or something else.
For someone like me, who uses a sub-agent with a specific prompt to check every comment, it might be good, while the default user will maybe just get whatever xAI or Meta want them to read? I am unsure whether this will go in a good direction for society, or how to get it there. Similarly, Twitter can be great for some people but is terrible for most of society.
PauseAI, ControlAI and Torchbearers are somewhat in contact/cooperating. I don’t know the details for PauseAI Global, and even less for the US.
PauseAI is definitely trying to talk to politicians, and is already doing so. For the German chapter we have at least 3 conversations with German MPs this month, and one of our ex-co-leads works for ControlAI in Germany. At PauseCon in Europe, PauseAI also just did a briefing for several MEPs in the European Parliament and handed over a draft for a resolution. The hosting MEP started the meeting off with a quote from Terminator. It was not a joke.
> The system goes online August 4th, 1997. Human decisions are removed from strategic defense. Skynet begins to learn at a geometric rate. It becomes self-aware at 2:14 a.m. Eastern time, August 29th. In a panic, they try to pull the plug.
Please contact at least your local policymaker and get a meeting with them (your chances there are way higher). Torchbearers also just started a training program for this: https://www.diptraining.org/ or you could join PauseAI.
Offer to anyone reading this: If you give me your available times, name etc. I will literally organize the meeting for you. You just have to go there.
Most people and policymakers, for example, have no idea that AI developers don’t really understand their systems. They also have no idea what scientists or the AI company CEOs think or say, or what the current models can do.
The average person’s model of AI might be something like this: AIs hallucinate a lot, are programmed like a normal program which people perfectly understand, can be controlled, and haven’t gotten much better over the last few years.
You could join Torchbearers (https://www.torchbearer.community/) or PauseAI to inform policymakers about what’s actually going on.
Because of that, people who prevent bad outcomes often get treated as though they’ve done nothing, or even as though they were dramatic for worrying. Which is a pretty fucked up reward structure when you think about it.
This is more about the personal side, for someone who does the work, than about how society should act towards them:
“The master does nothing yet leaves nothing undone.” (Tao Te Ching)
One interpretation: No one knows you saved the world or everyone thinks they did it themselves.
You have a right to perform your prescribed duties, but you are not entitled to the fruits of your actions. Never consider yourself to be the cause of the results of your activities, nor be attached to inaction. (Bhagavad Gita: Chapter 2, Verse 47)
When people trying to save the world are working for the recognition/results, they quickly start Goodharting.
PS: Love the post title.
Awesome work. For anyone reading this: Please try to talk to your local policymakers. As a constituent it is much easier to get a meeting.
Torchbearers (which is a community adjacent to ControlAI) and/or PauseAI are happy to give you advice or coaching!
https://www.diptraining.org/
https://pauseai.info/join
I have goals that benefit from having hundreds of millions to billions of dollars. So do other people. Money is for steering the world. I can use money to hire other people and get them to do things I want.
Where do you stand on pluralism or democracy? There is some tension there with people having billions of dollars of steering influence, but of course there is also tension with taking people’s money away. Money could also be used to steer towards a more pluralistic society etc. …
Is there anything to read which approximately describes your view there?
Well, they’ll probably still exist.
It seems more likely to me that the people of Malawi and everyone else will be killed at some point.
Currently the system still has to consider popular opinion to some degree. Killing all the people of Malawi would not be efficient right now. When that incentive disappears enough, I would expect everyone else to get eliminated (all of this assumes incentive-aligned AI, which I wouldn’t expect). This goes in the direction the author was mentioning: that the reason for moral progress is more about which societal structures are efficient, not actual “moral” progress.
If competition between humans persists, I would expect the last remaining humans to disappear as well, having to transfer all their power to survive the competition.
Overview: AI Safety Outreach Grassroots Orgs
I don’t think that effective politics in this case requires deception and deception often backfires in unexpected ways.
Gabriel and Connor suggest in their interview that radical honesty (genuinely trusting politicians, advisors and average people to understand your argument, and recognizing that they also don’t want to die from ASI) can be remarkably effective. The real problem may be that this approach is not attempted enough. I remember this as a slightly less positive, but still positive, datapoint: https://www.lesswrong.com/posts/2sLwt2cSAag74nsdN/speaking-to-congressional-staffers-about-ai-risk .
> If they have good political instincts, they’d probably have no desire to.
I can see that critique. I can also see something in the opposite direction where there is a giant “ugh field” around politics which we can dissolve for a lot of people who could be active in the space. We can both be honest and effective.
Luckily, we are in a world where most people already don’t like AI, they don’t like transhumanist ideas where they get killed to be replaced by AI, they don’t like to get killed in an ASI race, and they instinctively think intelligence smarter than them is dangerous.
Building the necessary coordination becomes significantly harder when deception is involved. MIRI’s public strategy as a reference:
1. Many other organizations are attempting the coalition-building, horse-trading, pragmatic approach. In private, many of the people who work at those organizations agree with us, but in public, they say the watered-down version of the message. We think there is a void at the candid end of the communication spectrum that we are well positioned to fill.
2. We think audiences are numb to politics as usual. They know when they’re being manipulated. We have opted out of the political theater, the [kayfabe](https://en.wikipedia.org/wiki/Kayfabe), with all its posing and posturing. We are direct and blunt and honest, and we come across as exactly what we are.
3. Probably most importantly, we believe that “pragmatic” political speech won’t get the job done. The political measures we’re asking for are a big deal; nothing but the full unvarnished message will motivate the action that is required.
Why does LW not put much more focus on AI governance and outreach?
Leipzig – ACX Meetups Everywhere Spring 2025
If minks are such a danger, could we just make mink farming illegal?
The dangers absolutely don’t seem worth the gains.
However, part of that was about going from open research to closed.
Because of the strange loopy nature of concepts/language/self/different problems, metaphilosophy seems unsolvable?
Asking “What is good?” already implies that there are the concepts “good”, “what” and “being”, and that there are answers and questions … Now we could ask what concepts or questions to use instead …
Similarly:
> “What are all the things we can do with the things we have and what decision-making process will we use and why use that process if the character of the different processes is the production of different ends; don’t we have to know which end is desired in order to choose the decision-making process that also arrives at that result?”
> Which leads back to desire and knowing what you want without needing a system to tell you what you want.
It’s all empty in the Buddhist sense. It all depends on which concepts or Turing machines or which physical laws you start with.
We haven’t yet figured out how to find a good equilibrium with humans in control, given humans not doing the work (https://gradual-disempowerment.ai/). That is, if humans were ever in control: https://www.lesswrong.com/posts/kbezWvZsMos6TSyfj/the-eldritch-in-the-21st-century .
The biggest problem with gradual disempowerment is that we want it.
Ignoring all of that, I would try to avoid as many signs of immoral mazeness as possible!