List of how people have become more hard-working

Chi Nguyen · Sep 29, 2023, 11:30 AM
69 points
7 comments · LW link

Contra Yudkowsky on Epistemic Conduct for Author Criticism

Zack_M_Davis · Sep 13, 2023, 3:33 PM
69 points
38 comments · 7 min read · LW link

Can I take ducks home from the park?

dynomight · Sep 14, 2023, 9:03 PM
67 points
8 comments · 3 min read · LW link
(dynomight.net)

[Link post] Michael Nielsen’s “Notes on Existential Risk from Artificial Superintelligence”

Joel Becker · Sep 19, 2023, 1:31 PM
67 points
12 comments · LW link
(michaelnotebook.com)

If influence functions are not approximating leave-one-out, how are they supposed to help?

Fabien Roger · Sep 22, 2023, 2:23 PM
66 points
5 comments · 3 min read · LW link

Petrov Day Retrospective, 2023 (re: the most important virtue of Petrov Day & unilaterally promoting it)

Ruby · Sep 28, 2023, 2:48 AM
66 points
73 comments · 6 min read · LW link

GPT-4 for personal productivity: online distraction blocker

Sergii · Sep 26, 2023, 5:41 PM
65 points
13 comments · 2 min read · LW link
(grgv.xyz)

AI #29: Take a Deep Breath

Zvi · Sep 14, 2023, 12:00 PM
65 points
21 comments · 21 min read · LW link
(thezvi.wordpress.com)

a rant on politician-engineer coalitional conflict

bhauth · Sep 4, 2023, 5:15 PM
64 points
12 comments · 4 min read · LW link

Understanding strategic deception and deceptive alignment

Sep 25, 2023, 4:27 PM
64 points
16 comments · 7 min read · LW link
(www.apolloresearch.ai)

Interpretability Externalities Case Study—Hungry Hungry Hippos

Magdalena Wache · Sep 20, 2023, 2:42 PM
64 points
22 comments · 2 min read · LW link

Eugenics Performed By A Blind, Idiot God

omnizoid · Sep 17, 2023, 8:37 PM
63 points
11 comments · 2 min read · LW link

Instrumental Convergence Bounty

Logan Zoellner · Sep 14, 2023, 2:02 PM
62 points
24 comments · 1 min read · LW link

Linkpost for Jan Leike on Self-Exfiltration

Daniel Kokotajlo · Sep 13, 2023, 9:23 PM
59 points
1 comment · 2 min read · LW link
(aligned.substack.com)

Image Hijacks: Adversarial Images can Control Generative Models at Runtime

Sep 20, 2023, 3:23 PM
58 points
9 comments · 1 min read · LW link
(arxiv.org)

Bids To Defer On Value Judgements

johnswentworth · Sep 29, 2023, 5:07 PM
58 points
6 comments · 3 min read · LW link

Protest against Meta’s irreversible proliferation (Sept 29, San Francisco)

Holly_Elmore · Sep 19, 2023, 11:40 PM
54 points
33 comments · LW link

Some reasons why I frequently prefer communicating via text

Adam Zerner · Sep 18, 2023, 9:50 PM
53 points
18 comments · 2 min read · LW link

AI #28: Watching and Waiting

Zvi · Sep 7, 2023, 5:20 PM
52 points
14 comments · 45 min read · LW link
(thezvi.wordpress.com)

Who Has the Best Food?

Zvi · Sep 5, 2023, 1:40 PM
52 points
61 comments · 10 min read · LW link
(thezvi.wordpress.com)

The point of a game is not to win, and you shouldn’t even pretend that it is

mako yass · Sep 28, 2023, 3:54 PM
51 points
27 comments · 4 min read · LW link
(makopool.com)

Is AI Safety dropping the ball on privacy?

markov · Sep 13, 2023, 1:07 PM
50 points
17 comments · 7 min read · LW link

Basic Mathematics of Predictive Coding

Adam Shai · Sep 29, 2023, 2:38 PM
49 points
6 comments · 9 min read · LW link

Competitive, Cooperative, and Cohabitive

Screwtape · Sep 28, 2023, 11:25 PM
49 points
13 comments · 5 min read · LW link · 1 review

Fund Transit With Development

jefftk · 22 Sep 2023 11:10 UTC
47 points
22 comments · 3 min read · LW link
(www.jefftk.com)

Three ways interpretability could be impactful

Arthur Conmy · 18 Sep 2023 1:02 UTC
47 points
8 comments · 4 min read · LW link

Immortality or death by AGI

ImmortalityOrDeathByAGI · 21 Sep 2023 23:59 UTC
47 points
30 comments · 4 min read · LW link
(forum.effectivealtruism.org)

Telopheme, telophore, and telotect

TsviBT · 17 Sep 2023 16:24 UTC
46 points
7 comments · 8 min read · LW link

The goal of physics

Jim Pivarski · 2 Sep 2023 23:08 UTC
46 points
4 comments · 5 min read · LW link

Feedback-loops, Deliberate Practice, and Transfer Learning

7 Sep 2023 1:57 UTC
46 points
5 comments · 1 min read · LW link

[Question] Where might I direct promising-to-me researchers to apply for alignment jobs/grants?

abramdemski · 18 Sep 2023 16:20 UTC
45 points
10 comments · 1 min read · LW link

Jacob on the Precipice

Richard_Ngo · 26 Sep 2023 21:16 UTC
45 points
8 comments · 11 min read · LW link
(narrativeark.substack.com)

Amazon to invest up to $4 billion in Anthropic

Davis_Kingsley · 25 Sep 2023 14:55 UTC
44 points
8 comments · LW link
(twitter.com)

Commonsense Good, Creative Good

jefftk · 27 Sep 2023 19:50 UTC
44 points
11 comments · 3 min read · LW link
(www.jefftk.com)

Recreating the caring drive

Catnee · 7 Sep 2023 10:41 UTC
43 points
15 comments · 10 min read · LW link · 1 review

Sparse Coding, for Mechanistic Interpretability and Activation Engineering

David Udell · 23 Sep 2023 19:16 UTC
42 points
7 comments · 34 min read · LW link

Focus on the Hardest Part First

Johannes C. Mayer · 11 Sep 2023 7:53 UTC
42 points
13 comments · 1 min read · LW link

Deconfusing Regret

Alex Hollow · 15 Sep 2023 11:52 UTC
41 points
32 comments · 2 min read · LW link

Technical AI Safety Research Landscape [Slides]

Magdalena Wache · 18 Sep 2023 13:56 UTC
41 points
0 comments · 4 min read · LW link

What is the optimal frontier for due diligence?

8 Sep 2023 18:20 UTC
41 points
1 comment · 1 min read · LW link

[Question] Strongest real-world examples supporting AI risk claims?

rosehadshar · 5 Sep 2023 15:12 UTC
41 points
7 comments · 1 min read · LW link

ARC Evals: Responsible Scaling Policies

Zach Stein-Perlman · 28 Sep 2023 4:30 UTC
40 points
10 comments · 2 min read · LW link · 1 review
(evals.alignment.org)

Reflexive decision theory is an unsolved problem

Richard_Kennaway · 17 Sep 2023 14:15 UTC
40 points
27 comments · 4 min read · LW link

Luck based medicine: inositol for anxiety and brain fog

Elizabeth · 22 Sep 2023 20:10 UTC
40 points
5 comments · 3 min read · LW link
(acesounderglass.com)

Debate series: should we push for a pause on the development of AI?

Xodarap · 8 Sep 2023 16:29 UTC
39 points
1 comment · LW link

Startup Roundup #1: Happy Demo Day

Zvi · 12 Sep 2023 13:20 UTC
38 points
5 comments · 15 min read · LW link
(thezvi.wordpress.com)

I designed an AI safety course (for a philosophy department)

Eleni Angelou · 23 Sep 2023 22:03 UTC
37 points
15 comments · 2 min read · LW link

A Theory of Laughter—Follow-Up

Steven Byrnes · 14 Sep 2023 15:35 UTC
37 points
3 comments · 8 min read · LW link

Actually, “personal attacks after object-level arguments” is a pretty good rule of epistemic conduct

Max H · 17 Sep 2023 20:25 UTC
37 points
15 comments · 7 min read · LW link

Alignment Workshop talks

Richard_Ngo · 28 Sep 2023 18:26 UTC
37 points
1 comment · 1 min read · LW link
(www.alignment-workshop.com)