Prizes for matrix completion problems

paulfchristiano 3 May 2023 23:30 UTC

164 points

52 comments 1 min read LW link

(www.alignment.org)

Alignment Research @ EleutherAI

Curtis Huebner 3 May 2023 22:45 UTC

40 points

1 comment 3 min read LW link

(blog.eleuther.ai)

«Boundaries/Membranes» and AI safety compilation

Chris Lakin 3 May 2023 21:41 UTC

56 points

17 comments 8 min read LW link

[Question] What constraints does deep learning place on alignment plans?

Garrett Baker 3 May 2023 20:40 UTC

9 points

0 comments 1 min read LW link

AGI rising: why we are in a new era of acute risk and increasing public awareness, and what to do now

Greg C 3 May 2023 20:26 UTC

25 points

13 comments 13 min read LW link

Formalizing the “AI x-risk is unlikely because it is ridiculous” argument

Christopher King 3 May 2023 18:56 UTC

48 points

17 comments 3 min read LW link

[Question] List of notable people who believe in AI X-risk?

vlad.proex 3 May 2023 18:46 UTC

14 points

4 comments 1 min read LW link

[Question] LessWrong exporting?

axiomAdministrator 3 May 2023 18:34 UTC

0 points

3 comments 1 min read LW link

Progress links and tweets, 2023-05-03

jasoncrawford 3 May 2023 16:23 UTC

13 points

0 comments 2 min read LW link

(rootsofprogress.org)

Personhood is a Religious Belief

jan Sijan 3 May 2023 16:16 UTC

−41 points

28 comments 6 min read LW link

Slowing AI: Crunch time

Zach Stein-Perlman 3 May 2023 15:00 UTC

11 points

1 comment 2 min read LW link

Finding Neurons in a Haystack: Case Studies with Sparse Probing

wesg and Neel Nanda

3 May 2023 13:30 UTC

33 points

6 comments 2 min read LW link 1 review

(arxiv.org)

Monthly Roundup #6: May 2023

Zvi 3 May 2023 12:50 UTC

31 points

12 comments 24 min read LW link

(thezvi.wordpress.com)

[Question] How much do personal biases in risk assessment affect assessment of AI risks?

Gordon Seidoh Worley 3 May 2023 6:12 UTC

10 points

8 comments 1 min read LW link

Communication strategies for autism, with examples

stonefly 3 May 2023 5:25 UTC

16 points

2 comments 7 min read LW link

Understand how other people think: a theory of worldviews.

spencerg 3 May 2023 3:57 UTC

2 points

8 comments 5 min read LW link

“Copilot” type AI integration could lead to training data needed for AGI

anithite 3 May 2023 0:57 UTC

8 points

0 comments 2 min read LW link

Averting Catastrophe: Decision Theory for COVID-19, Climate Change, and Potential Disasters of All Kinds

JakubK 2 May 2023 22:50 UTC

10 points

0 comments 1 min read LW link

(nyupress.org)

A Case for the Least Forgiving Take On Alignment

Thane Ruthenis 2 May 2023 21:34 UTC

101 points

85 comments 22 min read LW link

Are Emergent Abilities of Large Language Models a Mirage? [linkpost]

Matthew Barnett 2 May 2023 21:01 UTC

53 points

21 comments 1 min read LW link

(arxiv.org)

Does descaling a kettle help? Theory and practice

philh 2 May 2023 20:20 UTC

35 points

28 comments 8 min read LW link

(reasonableapproximation.net)

Avoiding xrisk from AI doesn’t mean focusing on AI xrisk

Stuart_Armstrong 2 May 2023 19:27 UTC

67 points

7 comments 3 min read LW link

AI Safety Newsletter #4: AI and Cybersecurity, Persuasive AIs, Weaponization, and Geoffrey Hinton talks AI risks

ozhang, Dan H and Orpheus16

2 May 2023 18:41 UTC

32 points

0 comments 5 min read LW link

(newsletter.safe.ai)

My best system yet: text-based project management

jt 2 May 2023 17:44 UTC

6 points

8 comments 5 min read LW link

[Question] What’s the state of AI safety in Japan?

ChristianKl 2 May 2023 17:06 UTC

5 points

1 comment 1 min read LW link

Five Worlds of AI (by Scott Aaronson and Boaz Barak)

mishka 2 May 2023 13:23 UTC

22 points

6 comments 1 min read LW link 1 review

(scottaaronson.blog)

Systems that cannot be unsafe cannot be safe

Davidmanheim 2 May 2023 8:53 UTC

63 points

27 comments 2 min read LW link

AGI safety career advice

Richard_Ngo 2 May 2023 7:36 UTC

136 points

24 comments 13 min read LW link

An Impossibility Proof Relevant to the Shutdown Problem and Corrigibility

Audere 2 May 2023 6:52 UTC

66 points

13 comments 9 min read LW link

Some Thoughts on Virtue Ethics for AIs

peligrietzer 2 May 2023 5:46 UTC

88 points

8 comments 4 min read LW link

Technological unemployment as another test for rationalist winning

RomanHauksson 2 May 2023 4:16 UTC

14 points

5 comments 1 min read LW link

The Moral Copernican Principle

Legionnaire 2 May 2023 3:25 UTC

5 points

7 comments 2 min read LW link

Open & Welcome Thread—May 2023

Ruby 2 May 2023 2:58 UTC

22 points

41 comments 1 min read LW link

Summaries of top forum posts (24th − 30th April 2023)

Zoe Williams 2 May 2023 2:30 UTC

12 points

1 comment 10 min read LW link

AXRP Episode 21 - Interpretability for Engineers with Stephen Casper

DanielFilan 2 May 2023 0:50 UTC

12 points

1 comment 66 min read LW link

Getting Your Eyes On

LoganStrohl 2 May 2023 0:33 UTC

65 points

11 comments 14 min read LW link

What 2025 looks like

Ruby 1 May 2023 22:53 UTC

75 points

17 comments 15 min read LW link

[Question] Natural Selection vs Gradient Descent

CuriousApe11 1 May 2023 22:16 UTC

4 points

3 comments 1 min read LW link

A[I] Zombie Apocalypse Is Already Upon Us

NickHarris 1 May 2023 22:02 UTC

−6 points

4 comments 2 min read LW link

Geoff Hinton Quits Google

Adam Shai 1 May 2023 21:03 UTC

98 points

14 comments 1 min read LW link

The Apprentice Thread 2

hath 1 May 2023 20:09 UTC

50 points

19 comments 1 min read LW link

Budapest, Hungary – ACX Meetups Everywhere Spring 2023

Richard Horvath, Timothy Underwood and marta_k

1 May 2023 17:36 UTC

4 points

0 comments 1 min read LW link

In favor of steelmanning

jp 1 May 2023 17:12 UTC

36 points

6 comments 3 min read LW link

Shah (DeepMind) and Leahy (Conjecture) Discuss Alignment Cruxes

Olive Branch, Rohin Shah, Connor Leahy and Andrea_Miotti

1 May 2023 16:47 UTC

96 points

10 comments 30 min read LW link

Distinguishing misuse is difficult and uncomfortable

lemonhope 1 May 2023 16:23 UTC

17 points

3 comments 1 min read LW link

[Question] Does agency necessarily imply self-preservation instinct?

Mislav Jurić 1 May 2023 16:06 UTC

5 points

8 comments 1 min read LW link

What Boston Can Teach Us About What a Woman Is

ymeskhout 1 May 2023 15:34 UTC

18 points

45 comments 12 min read LW link

The Rocket Alignment Problem, Part 2

Zvi 1 May 2023 14:30 UTC

40 points

20 comments 9 min read LW link

(thezvi.wordpress.com)

Socialist Democratic-Republic GAME: 12 Amendments to the Constitutions of the Free World

monkymind 1 May 2023 13:13 UTC

−34 points

0 comments 1 min read LW link

[Question] Where is all this evidence of UFOs?

Logan Zoellner 1 May 2023 12:13 UTC

29 points

42 comments 1 min read LW link