A transparency and interpretability tech tree

evhub · 16 Jun 2022 23:44 UTC
163 points
11 comments · 18 min read · LW link · 1 review

BBC Future covers progress studies

jasoncrawford · 16 Jun 2022 22:44 UTC
21 points
6 comments · 3 min read · LW link
(rootsofprogress.org)

Humans are very reliable agents

alyssavance · 16 Jun 2022 22:02 UTC
266 points
35 comments · 3 min read · LW link

Towards Gears-Level Understanding of Agency

Thane Ruthenis · 16 Jun 2022 22:00 UTC
23 points
4 comments · 18 min read · LW link

A possible AI-inoculation due to early “robot uprising”

shminux · 16 Jun 2022 21:21 UTC
16 points
2 comments · 1 min read · LW link

AI Risk, as Seen on Snapchat

dkirmani · 16 Jun 2022 19:31 UTC
23 points
8 comments · 1 min read · LW link

[Link] “The madness of reduced medical diagnostics” by Dynomight

Kenny · 16 Jun 2022 19:20 UTC
16 points
25 comments · 1 min read · LW link

Breaking Down Goal-Directed Behaviour

Oliver Sourbut · 16 Jun 2022 18:45 UTC
11 points
1 comment · 2 min read · LW link

Perils of optimizing in social contexts

owencb · 16 Jun 2022 17:40 UTC
50 points
1 comment · 2 min read · LW link

Don’t Over-Optimize Things

owencb · 16 Jun 2022 16:33 UTC
27 points
6 comments · 4 min read · LW link

[Question] Security analysis of ‘cloud chemistry labs’?

Kenny · 16 Jun 2022 16:06 UTC
6 points
2 comments · 1 min read · LW link

Covid 6/16/22: Do Not Hand it to Them

Zvi · 16 Jun 2022 14:40 UTC
29 points
5 comments · 7 min read · LW link
(thezvi.wordpress.com)

[Question] Is there a worked example of Georgian taxes?

Dagon · 16 Jun 2022 14:07 UTC
8 points
12 comments · 1 min read · LW link

Against Active Shooter Drills

Zvi · 16 Jun 2022 13:40 UTC
91 points
30 comments · 7 min read · LW link
(thezvi.wordpress.com)

Ten experiments in modularity, which we’d like you to run!

16 Jun 2022 9:17 UTC
62 points
3 comments · 9 min read · LW link

[Question] What if LaMDA is indeed sentient / self-aware / worth having rights?

RomanS · 16 Jun 2022 9:10 UTC
22 points
13 comments · 1 min read · LW link

Lifeguards

Akash · 15 Jun 2022 23:03 UTC
12 points
3 comments · 2 min read · LW link
(forum.effectivealtruism.org)

Rationality Vienna Hike

Laszlo_Treszkai · 15 Jun 2022 22:11 UTC
3 points
0 comments · 1 min read · LW link

Contra Hofstadter on GPT-3 Nonsense

rictic · 15 Jun 2022 21:53 UTC
236 points
24 comments · 2 min read · LW link

Progress links and tweets, 2022-06-13

jasoncrawford · 15 Jun 2022 19:47 UTC
12 points
0 comments · 1 min read · LW link
(rootsofprogress.org)

I applied for a MIRI job in 2020. Here’s what happened next.

ViktoriaMalyasova · 15 Jun 2022 19:37 UTC
82 points
17 comments · 7 min read · LW link

Contextual Evil

ACrackedPot · 15 Jun 2022 19:32 UTC
1 point
12 comments · 2 min read · LW link

Multigate Priors

Adam Jermyn · 15 Jun 2022 19:30 UTC
4 points
0 comments · 3 min read · LW link

FYI: I’m working on a book about the threat of AGI/ASI for a general audience. I hope it will be of value to the cause and the community

Darren McKee · 15 Jun 2022 18:08 UTC
42 points
15 comments · 2 min read · LW link

[Question] What are all the AI Alignment and AI Safety Communication Hubs?

Gunnar_Zarncke · 15 Jun 2022 16:16 UTC
27 points
5 comments · 1 min read · LW link

Georgism, in theory

Stuart_Armstrong · 15 Jun 2022 15:20 UTC
40 points
22 comments · 4 min read · LW link

Berlin AI Safety Open Meetup June 2022

pranomostro · 15 Jun 2022 14:33 UTC
12 points
0 comments · 1 min read · LW link

A central AI alignment problem: capabilities generalization, and the sharp left turn

So8res · 15 Jun 2022 13:10 UTC
279 points
53 comments · 10 min read · LW link · 1 review

Our mental building blocks are more different than I thought

Marius Hobbhahn · 15 Jun 2022 11:07 UTC
44 points
11 comments · 14 min read · LW link

[Question] Has there been any work on attempting to use Pascal’s Mugging to make an AGI behave?

Chris_Leong · 15 Jun 2022 8:33 UTC
7 points
17 comments · 1 min read · LW link

Alignment Risk Doesn’t Require Superintelligence

JustisMills · 15 Jun 2022 3:12 UTC
35 points
4 comments · 2 min read · LW link

A Butterfly’s View of Probability

Gabriel Wu · 15 Jun 2022 2:14 UTC
29 points
17 comments · 11 min read · LW link

[Question] Favourite new AI productivity tools?

Gabe M · 15 Jun 2022 1:08 UTC
14 points
5 comments · 1 min read · LW link

Will vague “AI sentience” concerns do more for AI safety than anything else we might do?

Aryeh Englander · 14 Jun 2022 23:53 UTC
15 points
2 comments · 1 min read · LW link

Yes, AI research will be substantially curtailed if a lab causes a major disaster

lc · 14 Jun 2022 22:17 UTC
103 points
31 comments · 2 min read · LW link

Slow motion videos as AI risk intuition pumps

Andrew_Critch · 14 Jun 2022 19:31 UTC
237 points
41 comments · 2 min read · LW link · 1 review

Cryptographic Life: How to transcend in a sub-lightspeed world via Homomorphic encryption

Golol · 14 Jun 2022 19:22 UTC
1 point
0 comments · 3 min read · LW link

Blake Richards on Why he is Skeptical of Existential Risk from AI

Michaël Trazzi · 14 Jun 2022 19:09 UTC
41 points
12 comments · 4 min read · LW link
(theinsideview.ai)

[Question] How Do You Quantify [Physics Interfacing] Real World Capabilities?

DragonGod · 14 Jun 2022 14:49 UTC
17 points
1 comment · 4 min read · LW link

Was the Industrial Revolution The Industrial Revolution?

Davis Kedrosky · 14 Jun 2022 14:48 UTC
29 points
0 comments · 12 min read · LW link
(daviskedrosky.substack.com)

Investigating causal understanding in LLMs

14 Jun 2022 13:57 UTC
28 points
6 comments · 13 min read · LW link

Why multi-agent safety is important

Akbir Khan · 14 Jun 2022 9:23 UTC
10 points
2 comments · 10 min read · LW link

[Question] Was Eliezer Yudkowsky right to give himself 10% to succeed with HPMoR in 2010?

momom2 · 14 Jun 2022 7:00 UTC
2 points
2 comments · 1 min read · LW link

Resources I send to AI researchers about AI safety

Vael Gates · 14 Jun 2022 2:24 UTC
69 points
12 comments · 1 min read · LW link

Vael Gates: Risks from Advanced AI (June 2022)

Vael Gates · 14 Jun 2022 0:54 UTC
38 points
2 comments · 30 min read · LW link

Cambridge LW Meetup: Personal Finance

Tony Wang · 14 Jun 2022 0:12 UTC
3 points
0 comments · 1 min read · LW link

OpenAI: GPT-based LLMs show an ability to discriminate between their own wrong answers, but an inability to explain how/why they make that discrimination, even as models scale

Aditya Jain · 13 Jun 2022 23:33 UTC
14 points
5 comments · 1 min read · LW link
(openai.com)

[Question] Who said something like “The fact that putting 2 apples next to 2 other apples leads to there being 4 apples there has nothing to do with the fact that 2 + 2 = 4”?

hunterglenn · 13 Jun 2022 22:23 UTC
1 point
2 comments · 1 min read · LW link

Continuity Assumptions

Jan_Kulveit · 13 Jun 2022 21:31 UTC
35 points
13 comments · 4 min read · LW link

Crypto-fed Computation

aaguirre · 13 Jun 2022 21:20 UTC
23 points
7 comments · 7 min read · LW link