Reliable Sources: The Story of David Gerard

TracingWoodgrains

10 Jul 2024 19:50 UTC

411 points

56 comments 43 min read LW link 2 reviews

Managing Emotional Potential Energy

adamShimi

10 Jul 2024 18:20 UTC

24 points

4 comments 4 min read LW link

(epistemologicalfascinations.substack.com)

[EAForum xpost] A breakdown of OpenAI’s revenue

dschwarz and Lawrence Phillips

10 Jul 2024 18:09 UTC

57 points

5 comments 1 min read LW link

(forum.effectivealtruism.org)

Solving Pascal’s Wager using dynamic programming

Paul Wilczewski

10 Jul 2024 18:09 UTC

1 point

0 comments 5 min read LW link

Fluent, Cruxy Predictions

Raemon

10 Jul 2024 18:00 UTC

86 points

18 comments 14 min read LW link 1 review

Antitrust as Controlled Creative Destruction

Martin Sustrik

10 Jul 2024 16:40 UTC

14 points

2 comments 2 min read LW link

(250bpm.substack.com)

New page: Integrity

Zach Stein-Perlman

10 Jul 2024 15:00 UTC

91 points

3 comments 1 min read LW link

AirBnB Baking

jefftk

10 Jul 2024 12:50 UTC

7 points

1 comment 1 min read LW link

(www.jefftk.com)

DIY RLHF: A simple implementation for hands on experience

Mike Vaiana and Trent Hodgeson

10 Jul 2024 12:07 UTC

29 points

0 comments 6 min read LW link

Usefulness grounds truth

invertedpassion

10 Jul 2024 7:58 UTC

0 points

0 comments 4 min read LW link

On passing Complete and Honest Ideological Turing Tests (CHITTs)

Aryeh Englander

10 Jul 2024 4:01 UTC

11 points

2 comments 1 min read LW link

[Question] Pondering how good or bad things will be in the AGI future

Sherrinford

9 Jul 2024 22:46 UTC

14 points

9 comments 2 min read LW link

Causal Graphs of GPT-2-Small’s Residual Stream

David Udell

9 Jul 2024 22:06 UTC

53 points

7 comments 7 min read LW link

[Question] If AI starts to end the world, is suicide a good idea?

IlluminateReality

9 Jul 2024 21:53 UTC

0 points

8 comments 1 min read LW link

Rationalist Purity Test

Gunnar_Zarncke

9 Jul 2024 20:30 UTC

−9 points

5 comments 1 min read LW link

(ratpuritytest.com)

That which can be destroyed by the truth, should be assumed to should be destroyed by it

Thac0

9 Jul 2024 19:39 UTC

6 points

0 comments 3 min read LW link

AISN #38: Supreme Court Decision Could Limit Federal Ability to Regulate AI Plus, “Circuit Breakers” for AI systems, and updates on China’s AI industry

Corin Katzke, Alexa Pan, Julius and Dan H

9 Jul 2024 19:28 UTC

5 points

0 comments 5 min read LW link

(newsletter.safe.ai)

Summer Tour Stops

jefftk

9 Jul 2024 19:10 UTC

10 points

0 comments 3 min read LW link

(www.jefftk.com)

Fix simple mistakes in ARC-AGI, etc.

Oleg Trott

9 Jul 2024 17:46 UTC

9 points

9 comments 1 min read LW link

Paper Summary: The Effects of Communicating Uncertainty on Public Trust in Facts and Numbers

Jeffrey Heninger

9 Jul 2024 16:50 UTC

42 points

2 comments 2 min read LW link

(blog.aiimpacts.org)

UC Berkeley course on LLMs and ML Safety

Dan H

9 Jul 2024 15:40 UTC

36 points

1 comment 1 min read LW link

(rdi.berkeley.edu)

What and Why: Developmental Interpretability of Reinforcement Learning

Garrett Baker

9 Jul 2024 14:09 UTC

67 points

4 comments 6 min read LW link

Medical Roundup #3

Zvi

9 Jul 2024 13:10 UTC

39 points

4 comments 19 min read LW link

(thezvi.wordpress.com)

Consent across power differentials

Ramana Kumar

9 Jul 2024 11:42 UTC

52 points

12 comments 3 min read LW link

[Question] How bad would AI progress need to be for us to think general technological progress is also bad?

Jim Buhler

9 Jul 2024 10:43 UTC

9 points

5 comments 1 min read LW link

How LLMs Learn: What We Know, What We Don’t (Yet) Know, and What Comes Next

Jonasb

9 Jul 2024 9:58 UTC

2 points

0 comments 16 min read LW link

(www.denominations.io)

WTF is with the Infancy Gospel of Thomas?!? A deep dive into satire, philosophy, and more

kromem

9 Jul 2024 9:29 UTC

18 points

2 comments 11 min read LW link

Book Review: Safe Enough? A History of Nuclear Power and Accident Risk

ErickBall

9 Jul 2024 1:12 UTC

10 points

0 comments 28 min read LW link

Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs

L Rudolf L, bilalchughtai, Jan Betley, kaivu, Jérémy Scheurer, Mikita Balesni, AlexMeinke, Owain_Evans and Marius Hobbhahn

8 Jul 2024 22:24 UTC

109 points

40 comments 5 min read LW link 1 review

Robin Hanson & Liron Shapira Debate AI X-Risk

Liron

8 Jul 2024 21:45 UTC

41 points

4 comments 1 min read LW link

(www.youtube.com)

“The Singularity Is Nearer” by Ray Kurzweil—Review

Lavender

8 Jul 2024 21:32 UTC

22 points

0 comments 4 min read LW link

Sample Prevalence vs Global Prevalence

jefftk

8 Jul 2024 21:00 UTC

11 points

0 comments 2 min read LW link

(www.jefftk.com)

Advice to junior AI governance researchers

Orpheus16

8 Jul 2024 19:19 UTC

67 points

1 comment 5 min read LW link

Pantheon Interface

Niki Dupuis and Sofia Vanhanen

8 Jul 2024 19:03 UTC

129 points

22 comments 6 min read LW link

Launching the AI Forecasting Benchmark Series Q3 | $30k in Prizes

ChristianWilliams

8 Jul 2024 17:20 UTC

5 points

0 comments 1 min read LW link

(www.metaculus.com)

The Golden Mean of Scientific Virtues

adamShimi

8 Jul 2024 17:16 UTC

12 points

4 comments 8 min read LW link

(epistemologicalfascinations.substack.com)

Massapequa (Long Island), New York, USA – ACX Meetup

Gabriel Weil

8 Jul 2024 17:01 UTC

2 points

0 comments 1 min read LW link

Dialogue introduction to Singular Learning Theory

Olli Järviniemi

8 Jul 2024 16:58 UTC

114 points

16 comments 8 min read LW link 1 review

Announcing The Techno-Humanist Manifesto: A new philosophy of progress for the 21st century

jasoncrawford

8 Jul 2024 16:33 UTC

18 points

4 comments 5 min read LW link

(blog.rootsofprogress.org)

Response to Dileep George: AGI safety warrants planning ahead

Steven Byrnes

8 Jul 2024 15:27 UTC

28 points

7 comments 27 min read LW link

Why not parliamentarianism? [book by Tiago Ribeiro dos Santos]

Arturo Macias

8 Jul 2024 14:57 UTC

2 points

1 comment 4 min read LW link

Games of My Childhood: The Troops

Kaj_Sotala

8 Jul 2024 11:20 UTC

18 points

0 comments 5 min read LW link

(kajsotala.fi)

Towards shutdownable agents via stochastic choice

Elliott Thornley (EJT), alexr, christosi and LAThomson

8 Jul 2024 10:14 UTC

59 points

11 comments 23 min read LW link

(arxiv.org)

On scalable oversight with weak LLMs judging strong LLMs

zac_kenton, Noah Siegel, janos, Jonah Brown-Cohen, Samuel Albanie, David Lindner and Rohin Shah

8 Jul 2024 8:59 UTC

49 points

18 comments 7 min read LW link

(arxiv.org)

Poker is a bad game for teaching epistemics. Figgie is a better one.

rossry

8 Jul 2024 6:05 UTC

106 points

47 comments 11 min read LW link

(blog.rossry.net)

Controlled Creative Destruction

Martin Sustrik

8 Jul 2024 4:36 UTC

11 points

0 comments 2 min read LW link

On saying “Thank you” instead of “I’m Sorry”

Michael Cohn

8 Jul 2024 3:13 UTC

138 points

16 comments 3 min read LW link

How can I get over my fear of becoming an emulated consciousness?

James Dowdell

7 Jul 2024 22:02 UTC

6 points

8 comments 5 min read LW link

An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2

Neel Nanda

7 Jul 2024 17:39 UTC

146 points

17 comments 25 min read LW link 1 review

Joint mandatory donation as a way to increase the number of donations

Crazy philosopher

7 Jul 2024 10:56 UTC

3 points

3 comments 2 min read LW link