A brief collection of Hinton’s recent comments on AGI risk

Kaj_Sotala · 4 May 2023 23:31 UTC
143 points
9 comments · 11 min read

Robin Hanson and I talk about AI risk

KatjaGrace · 4 May 2023 22:20 UTC
39 points
8 comments · 1 min read
(worldspiritsockpuppet.com)

Who regulates the regulators? We need to go beyond the review-and-approval paradigm

jasoncrawford · 4 May 2023 22:11 UTC
122 points
29 comments · 13 min read
(rootsofprogress.org)

Recursive Middle Manager Hell: AI Edition

VojtaKovarik · 4 May 2023 20:08 UTC
30 points
11 comments · 2 min read

AI risk/reward: A simple model

Nathan Young · 4 May 2023 19:25 UTC
3 points
0 comments · 1 min read

Google “We Have No Moat, And Neither Does OpenAI”

Chris_Leong · 4 May 2023 18:23 UTC
61 points
28 comments · 1 min read
(www.semianalysis.com)

Trying to measure AI deception capabilities using temporary simulation fine-tuning

alenoach · 4 May 2023 17:59 UTC
4 points
0 comments · 7 min read

[Linkpost] Transformer-Based LM Surprisal Predicts Human Reading Times Best with About Two Billion Training Tokens

Curtis Huebner · 4 May 2023 17:16 UTC
10 points
1 comment · 1 min read
(arxiv.org)

Clarifying and predicting AGI

Richard_Ngo · 4 May 2023 15:55 UTC
129 points
42 comments · 4 min read

[Crosspost] AI X-risk in the News: How Effective are Recent Media Items and How is Awareness Changing? Our New Survey Results.

otto.barten · 4 May 2023 14:09 UTC
5 points
0 comments · 9 min read
(forum.effectivealtruism.org)

AI #10: Code Interpreter and Geoff Hinton

Zvi · 4 May 2023 14:00 UTC
80 points
7 comments · 78 min read
(thezvi.wordpress.com)

Advice for interacting with busy people

Severin T. Seehrich · 4 May 2023 13:31 UTC
66 points
4 comments · 4 min read

We don’t need AGI for an amazing future

Karl von Wendt · 4 May 2023 12:10 UTC
18 points
32 comments · 5 min read

Has the Symbol Grounding Problem just gone away?

RussellThor · 4 May 2023 7:46 UTC
12 points
3 comments · 1 min read

Opinion merging for AI control

David Johnston · 4 May 2023 2:43 UTC
6 points
0 comments · 11 min read

Understanding why illusionism does not deny the existence of qualia

Mergimio H. Doefevmil · 4 May 2023 2:13 UTC
0 points
17 comments · 1 min read

[New] Rejected Content Section

4 May 2023 1:43 UTC
65 points
21 comments · 5 min read

How MATS addresses “mass movement building” concerns

Ryan Kidd · 4 May 2023 0:55 UTC
62 points
9 comments · 3 min read

Moving VPS Again

jefftk · 4 May 2023 0:30 UTC
9 points
2 comments · 1 min read
(www.jefftk.com)

Prizes for matrix completion problems

paulfchristiano · 3 May 2023 23:30 UTC
163 points
51 comments · 1 min read
(www.alignment.org)

Alignment Research @ EleutherAI

Curtis Huebner · 3 May 2023 22:45 UTC
40 points
1 comment · 3 min read
(blog.eleuther.ai)

«Boundaries/Membranes» and AI safety compilation

Chipmonk · 3 May 2023 21:41 UTC
53 points
17 comments · 8 min read

[Question] What constraints does deep learning place on alignment plans?

Garrett Baker · 3 May 2023 20:40 UTC
9 points
0 comments · 1 min read

AGI rising: why we are in a new era of acute risk and increasing public awareness, and what to do now

Greg C · 3 May 2023 20:26 UTC
23 points
12 comments · 1 min read

Formalizing the “AI x-risk is unlikely because it is ridiculous” argument

Christopher King · 3 May 2023 18:56 UTC
47 points
17 comments · 3 min read

[Question] List of notable people who believe in AI X-risk?

vlad.proex · 3 May 2023 18:46 UTC
14 points
4 comments · 1 min read

[Question] LessWrong exporting?

axiomAdministrator · 3 May 2023 18:34 UTC
0 points
3 comments · 1 min read

Progress links and tweets, 2023-05-03

jasoncrawford · 3 May 2023 16:23 UTC
13 points
0 comments · 2 min read
(rootsofprogress.org)

Personhood is a Religious Belief

jan Sijan · 3 May 2023 16:16 UTC
−42 points
28 comments · 6 min read

Slowing AI: Crunch time

Zach Stein-Perlman · 3 May 2023 15:00 UTC
11 points
1 comment · 2 min read

Finding Neurons in a Haystack: Case Studies with Sparse Probing

3 May 2023 13:30 UTC
33 points
5 comments · 2 min read
(arxiv.org)

Monthly Roundup #6: May 2023

Zvi · 3 May 2023 12:50 UTC
31 points
12 comments · 24 min read
(thezvi.wordpress.com)

[Question] How much do personal biases in risk assessment affect assessment of AI risks?

Gordon Seidoh Worley · 3 May 2023 6:12 UTC
10 points
8 comments · 1 min read

Communication strategies for autism, with examples

stonefly · 3 May 2023 5:25 UTC
15 points
2 comments · 7 min read

Understand how other people think: a theory of worldviews.

spencerg · 3 May 2023 3:57 UTC
2 points
8 comments · 1 min read

“Copilot” type AI integration could lead to training data needed for AGI

anithite · 3 May 2023 0:57 UTC
6 points
0 comments · 2 min read

Averting Catastrophe: Decision Theory for COVID-19, Climate Change, and Potential Disasters of All Kinds

JakubK · 2 May 2023 22:50 UTC
10 points
0 comments · 1 min read

A Case for the Least Forgiving Take On Alignment

Thane Ruthenis · 2 May 2023 21:34 UTC
99 points
82 comments · 22 min read

Are Emergent Abilities of Large Language Models a Mirage? [linkpost]

Matthew Barnett · 2 May 2023 21:01 UTC
52 points
19 comments · 1 min read
(arxiv.org)

Does descaling a kettle help? Theory and practice

philh · 2 May 2023 20:20 UTC
35 points
25 comments · 8 min read
(reasonableapproximation.net)

Avoiding xrisk from AI doesn’t mean focusing on AI xrisk

Stuart_Armstrong · 2 May 2023 19:27 UTC
64 points
7 comments · 3 min read

AI Safety Newsletter #4: AI and Cybersecurity, Persuasive AIs, Weaponization, and Geoffrey Hinton talks AI risks

2 May 2023 18:41 UTC
32 points
0 comments · 5 min read
(newsletter.safe.ai)

My best system yet: text-based project management

jt · 2 May 2023 17:44 UTC
6 points
6 comments · 5 min read

[Question] What’s the state of AI safety in Japan?

ChristianKl · 2 May 2023 17:06 UTC
5 points
1 comment · 1 min read

Five Worlds of AI (by Scott Aaronson and Boaz Barak)

mishka · 2 May 2023 13:23 UTC
21 points
5 comments · 1 min read
(scottaaronson.blog)

Systems that cannot be unsafe cannot be safe

Davidmanheim · 2 May 2023 8:53 UTC
62 points
27 comments · 2 min read

AGI safety career advice

Richard_Ngo · 2 May 2023 7:36 UTC
131 points
24 comments · 13 min read

An Impossibility Proof Relevant to the Shutdown Problem and Corrigibility

Audere · 2 May 2023 6:52 UTC
65 points
13 comments · 9 min read

Some Thoughts on Virtue Ethics for AIs

peligrietzer · 2 May 2023 5:46 UTC
74 points
7 comments · 4 min read

Technological unemployment as another test for rationalist winning

RomanHauksson · 2 May 2023 4:16 UTC
14 points
5 comments · 1 min read