Research Report: Incorrectness Cascades (Corrected)

Robert_AIZI 9 May 2023 21:54 UTC

9 points

0 comments 9 min read LW link

(aizi.substack.com)

Stopping dangerous AI: Ideal US behavior

Zach Stein-Perlman 9 May 2023 21:00 UTC

17 points

0 comments 3 min read LW link

Stopping dangerous AI: Ideal lab behavior

Zach Stein-Perlman 9 May 2023 21:00 UTC

8 points

0 comments 2 min read LW link

Progress links and tweets, 2023-05-09

jasoncrawford 9 May 2023 20:22 UTC

14 points

0 comments 2 min read LW link

(rootsofprogress.org)

[Question] Have you heard about MIT’s “liquid neural networks”? What do you think about them?

Ppau 9 May 2023 20:16 UTC

35 points

14 comments 1 min read LW link

Respect for Boundaries as non-arbitrary coordination norms

Jonas Hallgren 9 May 2023 19:42 UTC

9 points

3 comments 7 min read LW link

Solving the Mechanistic Interpretability challenges: EIS VII Challenge 1

StefanHex and Marius Hobbhahn

9 May 2023 19:41 UTC

119 points

1 comment 10 min read LW link

Forecasting as a tool for teaching the general public to make better judgements?

Dominik Hajduk | České priority 9 May 2023 17:35 UTC

3 points

0 comments 3 min read LW link

Language models can explain neurons in language models

nz 9 May 2023 17:29 UTC

23 points

0 comments 1 min read LW link

(openai.com)

Asimov on building robots without the First Law

rossry 9 May 2023 16:44 UTC

4 points

1 comment 2 min read LW link

Making Up Baby Signs

jefftk 9 May 2023 16:40 UTC

44 points

6 comments 2 min read LW link

(www.jefftk.com)

Exciting New Interpretability Paper!

research_prime_space 9 May 2023 16:39 UTC

12 points

1 comment 1 min read LW link

Result Of The Bounty/Contest To Explain Infra-Bayes In The Language Of Game Theory

johnswentworth 9 May 2023 16:35 UTC

81 points

0 comments 1 min read LW link

The Bleak Harmony of Diets and Survival: A Glimpse into Nature’s Unforgiving Balance

bardstale 9 May 2023 16:08 UTC

−16 points

0 comments 1 min read LW link

Entropic Abyss

bardstale 9 May 2023 15:59 UTC

−12 points

0 comments 2 min read LW link

AI Safety Newsletter #5: Geoffrey Hinton speaks out on AI risk, the White House meets with AI labs, and Trojan attacks on language models

Dan H and Orpheus16

9 May 2023 15:26 UTC

31 points

1 comment 4 min read LW link

(newsletter.safe.ai)

A Search for More ChatGPT / GPT-3.5 / GPT-4 “Unspeakable” Glitch Tokens

Martin Fell 9 May 2023 14:36 UTC

26 points

9 comments 6 min read LW link

How to Interpret Prediction Market Prices as Probabilities

SimonM 9 May 2023 14:12 UTC

14 points

1 comment 4 min read LW link

Stampy’s AI Safety Info—New Distillations #2 [April 2023]

markov 9 May 2023 13:31 UTC

25 points

1 comment 1 min read LW link

(aisafety.info)

Quote quiz answer

jasoncrawford 9 May 2023 13:27 UTC

19 points

0 comments 4 min read LW link

(rootsofprogress.org)

[Question] Does reversible computation let you compute the complexity class PSPACE as efficiently as normal computers compute the complexity class P?

Noosphere89 9 May 2023 13:18 UTC

6 points

14 comments 1 min read LW link

EconTalk podcast: “Eliezer Yudkowsky on the Dangers of AI”

TekhneMakre 9 May 2023 11:14 UTC

15 points

1 comment 1 min read LW link

(www.econtalk.org)

Most people should probably feel safe most of the time

Kaj_Sotala 9 May 2023 9:35 UTC

96 points

28 comments 10 min read LW link

Summaries of top forum posts (1st to 7th May 2023)

Zoe Williams 9 May 2023 9:30 UTC

21 points

0 comments 11 min read LW link

Focusing on longevity research as a way to avoid the AI apocalypse

Random Trader 9 May 2023 4:47 UTC

14 points

2 comments 2 min read LW link

When is Goodhart catastrophic?

Drake Thomas and Thomas Kwa

9 May 2023 3:59 UTC

190 points

30 comments 8 min read LW link 1 review

Chilean AIS Hackathon Retrospective

agucova 9 May 2023 1:34 UTC

9 points

0 comments 5 min read LW link

Announcing “Key Phenomena in AI Risk” (facilitated reading group)

Nora_Ammann and particlemania

9 May 2023 0:31 UTC

64 points

4 comments 2 min read LW link

Yoshua Bengio argues for tool-AI and to ban “executive-AI”

habryka 9 May 2023 0:13 UTC

53 points

15 comments 7 min read LW link

(yoshuabengio.org)

South Bay ACX/LW Meetup

IS 8 May 2023 23:55 UTC

2 points

0 comments 1 min read LW link

H-JEPA might be technically alignable in a modified form

Roman Leventov 8 May 2023 23:04 UTC

12 points

2 comments 7 min read LW link

All AGI Safety questions welcome (especially basic ones) [May 2023]

steven0461 8 May 2023 22:30 UTC

34 points

44 comments 2 min read LW link

Predictable updating about AI risk

Joe Carlsmith 8 May 2023 21:53 UTC

297 points

25 comments 36 min read LW link 1 review

Annotated reply to Bengio’s “AI Scientists: Safe and Useful AI?”

Roman Leventov 8 May 2023 21:26 UTC

18 points

2 comments 7 min read LW link

(yoshuabengio.org)

Are healthy choices effective for improving life expectancy anymore?

Christopher King 8 May 2023 21:25 UTC

4 points

4 comments 1 min read LW link

LeCun’s “A Path Towards Autonomous Machine Intelligence” has an unsolved technical alignment problem

Steven Byrnes 8 May 2023 19:35 UTC

144 points

38 comments 15 min read LW link

Product Endorsement: Apollo Neuro

Elizabeth 8 May 2023 19:00 UTC

47 points

28 comments 5 min read LW link

(acesounderglass.com)

Acausal trade naturally results in the Nash bargaining solution

Christopher King 8 May 2023 18:13 UTC

3 points

0 comments 4 min read LW link

Inference Speed is Not Unbounded

Onid 8 May 2023 16:24 UTC

35 points

32 comments 16 min read LW link

[Crosspost] Unveiling the American Public Opinion on AI Moratorium and Government Intervention: The Impact of Media Exposure

otto.barten 8 May 2023 14:09 UTC

7 points

0 comments 6 min read LW link

(forum.effectivealtruism.org)

Thriving in the Weird Times: Preparing for the 100X Economy

Lucie Philippon and Charbel-Raphaël

8 May 2023 13:44 UTC

23 points

16 comments 2 min read LW link

Housing and Transit Roundup #4

Zvi 8 May 2023 13:30 UTC

25 points

0 comments 11 min read LW link

(thezvi.wordpress.com)

Dance Profit Sharing

jefftk 8 May 2023 13:10 UTC

11 points

3 comments 2 min read LW link

(www.jefftk.com)

How “AGI” could end up being many different specialized AI’s stitched together

titotal 8 May 2023 12:32 UTC

9 points

2 comments 9 min read LW link

What does it take to ban a thing?

qbolec 8 May 2023 11:00 UTC

66 points

18 comments 5 min read LW link

Solomonoff’s solipsism

Mergimio H. Doefevmil 8 May 2023 6:55 UTC

−13 points

9 comments 1 min read LW link

A technical note on bilinear layers for interpretability

Lee Sharkey 8 May 2023 6:06 UTC

59 points

0 comments 1 min read LW link

(arxiv.org)

[Question] Is EDT correct? Does “EDT” == “logical EDT” == “logical CDT”?

Vivek Hebbar 8 May 2023 2:07 UTC

13 points

2 comments 1 min read LW link

LLM cognition is probably not human-like

Max H 8 May 2023 1:22 UTC

27 points

15 comments 7 min read LW link

[Question] If alignment problem was unsolvable, would that avoid doom?

Kinrany 7 May 2023 22:13 UTC

3 points

3 comments 1 min read LW link