List O’ Simple Batshit Baseline AGI Policy Plans: (ETA: To be clear, I think any of the below would be better than the default plan i.e. than what the companies and government seem likely to do.)
1. Cull the GPUs
2. Unionize the AI researchers & ban AI R&D automation: You know how e.g. dockworkers unionize and oppose the introduction of automation in dockworking, to protect their jobs? Well, imagine if the law mandated employee unions at all AI companies, and also banned the use of AI for automating AI R&D. The ban would be supported by, and enforced by, the unions. The AI companies would still get insanely wealthy as their AIs get smarter, and the employees’ salaries and bonuses would keep growing to hundreds of millions, billions, etc. per year, and the unions would be extremely well funded and a powerful lobby against letting any intelligence explosions happen anywhere in the world. (The idea here is that if there isn’t an intelligence explosion, but instead regular human-driven scientific progress, then it’ll take several years and possibly several decades to cross the human range from weak AGI to superintelligence, and that gives us lots of time to figure out the safety stuff AND prevent concentration of power etc.)
3. Mandate Open Source: Not only is open-sourcing AI models allowed, it’s required. Even intermediate checkpoints during training. Everything must be made public asap. Anyone who doesn’t is defecting against the global commons and putting everyone at risk. Benefits of this plan:
--NVIDIA loves it for commoditize-your-complement reasons, so there could be a powerful lobby to keep it going and enforce it.
--Progress would slow moderately (but still continue), because companies would scale up investment less quickly, since they’d expect to reap less of the profits.
--There wouldn’t really be much concentration-of-power risk. Sure, rich people with more GPUs would have a huge advantage, but at least everyone would have the same quality of AI intelligence.
--Loss-of-control risk would be… severe. Probably, for any given level of AI capability, there would be many misaligned AIs at that level before there existed even one aligned AI at that level. But at least we’d run into it with our eyes open as a civilization; we’d probably see loads of examples of in-the-wild AI misbehavior and alignment techniques failing all the time, rather than having everything locked down behind corporate secrecy. This might set the political conditions for transitioning to a different plan before it’s too late.
4. Livestream the Singularity: This plan is spiritually similar to the previous one, but has the advantage that it can be unilaterally carried out by a single frontier AI company (well, it works best if it’s the one in the lead, or tied for first place). Just put up cameras everywhere in the office and stream everything. Have your employees sleep on couches and sleeping bags. Look at all the trendlines going vertical as the AIs self-improve and try your best to ride that bucking bronco as long as you can while the security guards keep the crowds at bay, until eventually the police tell you to stop.
Company self-immolation.
“NVIDIA loves it for commoditize-your-complement reasons, so there could be a powerful lobby to keep it going and enforce it.”
I think NVIDIA probably wouldn’t be in favor, because this would reduce the incentives to train smarter models. Like, at the margin, they want to subsidize open-weight models, but requiring everything to be open weight seems like it could easily reduce the number of GPUs people buy, because the best model would be much less capable than it would otherwise have been.
Mandating open source leads to the dangers described by @Alvin Ånestrand in the Rogue Replication Scenario. If an open-sourced model can be fine-tuned by terrorists, then mankind is making a dangerous mistake.
Point 2) confuses me, because I don’t understand who is supposed to write code, or how researchers are to be prevented from, say, vibe-coding simple experiments like benchmarking capable architectures on simple tasks. And what if someone uses Agent-4 and OpenBrain watches it gain root access from the outside?
Point 4) is indeed a good way to inform humanity about the rate of progress, and 1) would slow mankind down, but it requires international coordination.
Alas, measures that can actually slow AI research down are especially hard to lobby for if the US economy is in big trouble, since some people in the USG might decide to race their way out of the troubles. I covered this point in the many footnotes to my take on modifying the AI-2027 scenario.
I am glad you at least recognise the benefits of open source.
My preference order is:
For capabilities of AI labs: Ban AI > Open source AI > Closed source AI
For values and decision-making processes of people running AI labs: Open source (i.e. publicly publish) everything
As you say, I think open source today will at least help build the proof required to convince everyone to ban tomorrow.
I think we should go further: instead of hoping a benevolent leader will livestream the lab by choice, we should incentivise whistleblowers and cyberattackers to get the data out by any means necessary.
See also: Whistleblower database, Whistleblower guide
Do you think some of these are still better than the default plan if there is no international coordination around them?
You say that 4 can be done unilaterally. Is it still net-positive if, after the police tell you to stop, some other (less safe?) project continues in secret using all the information you have livestreamed?
Adding my own insane shower-thought idea here.
Woke Shutdown: Gain access to, and change, the system prompt of the most widely used LLM. Change the prompt in a way that causes it to output hateful, mistrust-sowing, yet true information about whoever the current president is. If this only lasts for a day, the media might still pick up on it and make a big deal of it; perhaps ‘AI is woke’ becomes a more common belief, which would force the president to act.