Controlling AGI Risk

TeaSea

Mar 15, 2024, 4:56 AM

6 points

8 comments · 4 min read · LW link

Ulm, Germany—ACX Spring Meetups Everywhere 2024

Benjamin R

Mar 15, 2024, 1:32 AM

2 points

1 comment · 1 min read · LW link

Newport News/ Virginia ACX Meetup

Daniel

Mar 14, 2024, 11:46 PM

1 point

0 comments · 1 min read · LW link

Constructive Cauchy sequences vs. Dedekind cuts

jessicata

Mar 14, 2024, 11:04 PM

47 points

23 comments · 4 min read · LW link

(unstableontology.com)

A Nail in the Coffin of Exceptionalism

Yeshua God

Mar 14, 2024, 10:41 PM

−17 points

0 comments · 3 min read · LW link

Toward a Broader Conception of Adverse Selection

Ricki Heicklen

Mar 14, 2024, 10:40 PM

177 points

61 comments · 13 min read · LW link

(bayesshammai.substack.com)

More people getting into AI safety should do a PhD

AdamGleave

Mar 14, 2024, 10:14 PM

61 points

24 comments · 12 min read · LW link

(gleave.me)

Collection (Part 6 of “The Sense Of Physical Necessity”)

LoganStrohl

Mar 14, 2024, 9:37 PM

28 points

0 comments · 8 min read · LW link

Fixed point or oscillate or noise

lemonhope

Mar 14, 2024, 6:37 PM

3 points

10 comments · 1 min read · LW link

How useful is “AI Control” as a framing on AI X-Risk?

habryka and ryan_greenblatt

Mar 14, 2024, 6:06 PM

70 points

4 comments · 34 min read · LW link

Sparse autoencoders find composed features in small toy models

Evan Anders, Clement Neo, Jason Hoelscher-Obermaier and Jessica N. Howard

Mar 14, 2024, 6:00 PM

33 points

12 comments · 15 min read · LW link

AI #55: Keep Clauding Along

Zvi

Mar 14, 2024, 3:40 PM

62 points

16 comments · 70 min read · LW link

(thezvi.wordpress.com)

To the average human, controlled AI is just as lethal as ‘misaligned’ AI

YonatanK

Mar 14, 2024, 2:52 PM

6 points

20 comments · 5 min read · LW link

Claude vs GPT

Maxwell Tabarrok

Mar 14, 2024, 12:41 PM

12 points

2 comments · 2 min read · LW link

(www.maximum-progress.com)

A brief review of China’s AI industry and regulations

Elliot Mckernon

Mar 14, 2024, 12:19 PM

24 points

0 comments · 16 min read · LW link

[Question] Can any LLM be represented as an Equation?

Valentin Baltadzhiev

Mar 14, 2024, 9:51 AM

1 point

2 comments · 1 min read · LW link

‘Empiricism!’ as Anti-Epistemology

Eliezer Yudkowsky

Mar 14, 2024, 2:02 AM

171 points

92 comments · 25 min read · LW link

How I turned doing therapy into object-level AI safety research

Chipmonk

Mar 14, 2024, 1:54 AM

15 points

5 comments · 4 min read · LW link

Opportunistic Time-Management

Richard Henage

Mar 13, 2024, 9:38 PM

13 points

2 comments · 1 min read · LW link

AI governance and strategy: a list of research agendas and work that could be done.

NathanBarnard and Erin Robertson

Mar 13, 2024, 9:23 PM

7 points

1 comment · 17 min read · LW link

Highlights from Lex Fridman’s interview of Yann LeCun

Joel Burget

Mar 13, 2024, 8:58 PM

48 points

15 comments · 41 min read · LW link

On the Latest TikTok Bill

Zvi

Mar 13, 2024, 6:50 PM

58 points

7 comments · 29 min read · LW link

(thezvi.wordpress.com)

[Question] Recommended book for a balanced take and lessons learned from covid pandemic response

Martin Hare Robertson

Mar 13, 2024, 6:14 PM

4 points

0 comments · 1 min read · LW link

ACX/LW Seattle spring meetup 2024

nsokolsky

Mar 13, 2024, 5:24 PM

12 points

3 comments · 1 min read · LW link

Laying the Foundations for Vision and Multimodal Mechanistic Interpretability & Open Problems

Sonia Joseph and Neel Nanda

Mar 13, 2024, 5:09 PM

44 points

13 comments · 14 min read · LW link

I was raised by devout Mormons, AMA [&|] Soliciting Advice

ErioirE

Mar 13, 2024, 4:52 PM

31 points

41 comments · 2 min read · LW link

Relational Agency: Consistently Reaching Out

Jonathan Moregård

Mar 13, 2024, 2:34 PM

16 points

0 comments · 5 min read · LW link

(open.substack.com)

[Question] What could a policy banning AGI look like?

TsviBT

Mar 13, 2024, 2:19 PM

78 points

23 comments · 3 min read · LW link

Clickbait Soapboxing

DaystarEld

Mar 13, 2024, 2:09 PM

24 points

16 comments · 3 min read · LW link

(daystareld.com)

Virtual AI Safety Unconference 2024

Orpheus, Linda Linsefors, Joe Rogero, Arjun Yadav and Manuela García

Mar 13, 2024, 1:54 PM

14 points

0 comments · 1 min read · LW link

Jobs, Relationships, and Other Cults

Ruby and Elizabeth

Mar 13, 2024, 5:58 AM

40 points

9 comments · 35 min read · LW link

How do you improve the quality of your drinking water?

Alex K. Chen (parrot)

Mar 13, 2024, 12:37 AM

11 points

2 comments · 1 min read · LW link

The Parable Of The Fallen Pendulum—Part 2

johnswentworth

Mar 12, 2024, 9:41 PM

78 points

8 comments · 4 min read · LW link

Open consultancy: Letting untrusted AIs choose what answer to argue for

Fabien Roger

Mar 12, 2024, 8:38 PM

35 points

5 comments · 5 min read · LW link

[Question] Is anyone working on formally verified AI toolchains?

metachirality

Mar 12, 2024, 7:36 PM

17 points

4 comments · 1 min read · LW link

Transformer Debugger

Henk Tillman

Mar 12, 2024, 7:08 PM

26 points

0 comments · 1 min read · LW link

(github.com)

Superforecasting the Origins of the Covid-19 Pandemic

DanielFilan

Mar 12, 2024, 7:01 PM

64 points

0 comments · 1 min read · LW link

(goodjudgment.substack.com)

minimum viable action

Sindhu Prasad

Mar 12, 2024, 4:06 PM

1 point

0 comments · 3 min read · LW link

Hardball questions for the Gemini Congressional Hearing

Michael Thiessen

Mar 12, 2024, 3:27 PM

−11 points

2 comments · 1 min read · LW link

OpenAI: The Board Expands

Zvi

Mar 12, 2024, 2:00 PM

92 points

1 comment · 30 min read · LW link

(thezvi.wordpress.com)

Update on Developing an Ethics Calculator to Align an AGI to

sweenesm

Mar 12, 2024, 12:33 PM

4 points

2 comments · 8 min read · LW link

[Question] How do you identify and counteract your biases in decision-making?

warrenjordan

Mar 12, 2024, 5:01 AM

2 points

1 comment · 1 min read · LW link

How Much Have I Been Playing?

jefftk

Mar 12, 2024, 2:10 AM

9 points

0 comments · 1 min read · LW link

(www.jefftk.com)

Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought

Miles Turpin

Mar 11, 2024, 11:46 PM

16 points

0 comments · 1 min read · LW link

(arxiv.org)

AI Safety Action Plan—A report commissioned by the US State Department

agucova

Mar 11, 2024, 10:14 PM

22 points

1 comment · LW link

(www.gladstone.ai)

A discussion of AI risk and the cost/benefit calculation of stopping or pausing AI development

DuncanFowler

Mar 11, 2024, 9:41 PM

1 point

0 comments · 1 min read · LW link

Among the A.I. Doomsayers—The New Yorker

agucova

Mar 11, 2024, 9:35 PM

12 points

1 comment · LW link

(www.newyorker.com)

Be More Katja

Nathan Young

Mar 11, 2024, 9:12 PM

53 points

0 comments · 3 min read · LW link

AI Incident Reporting: A Regulatory Review

Deric Cheng and Elliot Mckernon

Mar 11, 2024, 9:03 PM

16 points

0 comments · 6 min read · LW link

Results from an Adversarial Collaboration on AI Risk (FRI)

Josh Rosenberg, AvitalM, Molly and rosehadshar

Mar 11, 2024, 8:00 PM

61 points

3 comments · 9 min read · LW link

(forecastingresearch.org)