Whole Bird Emulation requires Quantum Mechanics

Jeffrey Heninger 14 Feb 2023 23:50 UTC

25 points

9 comments 3 min read LW link

(aiimpacts.org)

Qualities that alignment mentors value in junior researchers

Orpheus16 14 Feb 2023 23:27 UTC

88 points

14 comments 3 min read LW link

Help Update TryContra

jefftk 14 Feb 2023 19:10 UTC

12 points

0 comments 1 min read LW link

(www.jefftk.com)

Content Features Aren’t Enough for Detecting Toxicity. One Needs User Features.

Zachary Witten 14 Feb 2023 18:48 UTC

11 points

0 comments 3 min read LW link

EIS III: Broad Critiques of Interpretability Research

scasper 14 Feb 2023 18:24 UTC

20 points

2 comments 11 min read LW link

[Question] What would an AI need to bootstrap recursively self improving robots?

Yair Halberstadt 14 Feb 2023 17:58 UTC

3 points

5 comments 1 min read LW link

[linkpost] Better Without AI

DanielFilan 14 Feb 2023 17:30 UTC

48 points

13 comments 1 min read LW link

(betterwithout.ai)

The Cave Allegory Revisited: Understanding GPT’s Worldview

Jan_Kulveit 14 Feb 2023 16:00 UTC

89 points

5 comments 3 min read LW link

[Question] Why should we expect AIs to coordinate well?

Jonathan Paulson 14 Feb 2023 15:50 UTC

25 points

9 comments 1 min read LW link

Explaining SolidGoldMagikarp by looking at it from random directions

Robert_AIZI 14 Feb 2023 14:54 UTC

8 points

0 comments 8 min read LW link

(aizi.substack.com)

Reverse-correlation: how to summon the ghost of your mental imagery

Malmesbury 14 Feb 2023 14:15 UTC

44 points

0 comments 5 min read LW link

Evaluating 2022 ACX Predictions

Zvi 14 Feb 2023 12:20 UTC

20 points

3 comments 23 min read LW link

(thezvi.wordpress.com)

SolidGoldMagikarp III: Glitch token archaeology

mwatkins and Jessica Rumbelow

14 Feb 2023 10:17 UTC

92 points

36 comments 16 min read LW link

The Linguistic Blind Spot of Value-Aligned Agency, Natural and Artificial

Roman Leventov 14 Feb 2023 6:57 UTC

6 points

0 comments 2 min read LW link

(arxiv.org)

Conceptual Pathfinding

DirectedEvolution 14 Feb 2023 5:49 UTC

18 points

6 comments 3 min read LW link

Important fact about how people evaluate sets of arguments

Daniel Kokotajlo 14 Feb 2023 5:27 UTC

33 points

10 comments 2 min read LW link

[Question] How much is death a limit on knowledge accumulation?

Gordon Seidoh Worley 14 Feb 2023 3:54 UTC

31 points

9 comments 2 min read LW link

The Filan Cabinet Podcast with Oliver Habryka—Transcript

MondSemmel and RobertM

14 Feb 2023 2:38 UTC

104 points

9 comments 72 min read LW link

[Question] Is InstructGPT Following Instructions in Other Languages Surprising?

DragonGod 13 Feb 2023 23:26 UTC

39 points

15 comments 1 min read LW link

LLM Basics: Embedding Spaces—Transformer Token Vectors Are Not Points in Space

NickyP 13 Feb 2023 18:52 UTC

85 points

11 comments 15 min read LW link

4 ways to think about democratizing AI [GovAI Linkpost]

Orpheus16 13 Feb 2023 18:06 UTC

24 points

4 comments 1 min read LW link

(www.governance.ai)

Does the AGPL Work?

jefftk 13 Feb 2023 14:20 UTC

13 points

12 comments 2 min read LW link

(www.jefftk.com)

H5N1

Zvi 13 Feb 2023 12:50 UTC

102 points

1 comment 9 min read LW link

(thezvi.wordpress.com)

Enjoy LessWrong in ebook format

Bart Bussmann 13 Feb 2023 11:53 UTC

54 points

3 comments 1 min read LW link

Morphological intelligence, superhuman empathy, and ethical arbitration

Roman Leventov 13 Feb 2023 10:25 UTC

1 point

0 comments 2 min read LW link

South Bay ACX/LW Meetup

IS 13 Feb 2023 6:08 UTC

3 points

0 comments 1 min read LW link

Idea: Network modularity and interpretability by sexual reproduction

qbolec 12 Feb 2023 23:06 UTC

3 points

3 comments 1 min read LW link

The End of Anonymity Online

Spiorad 12 Feb 2023 21:23 UTC

3 points

9 comments 2 min read LW link

Matt Clancy AMA on the Progress Forum

jasoncrawford 12 Feb 2023 20:23 UTC

17 points

0 comments 1 min read LW link

(progressforum.org)

Latent variables for prediction markets: motivation, technical guide, and design considerations

tailcalled 12 Feb 2023 17:54 UTC

103 points

25 comments 23 min read LW link 2 reviews

The conceptual Doppelgänger problem

TsviBT 12 Feb 2023 17:23 UTC

19 points

5 comments 4 min read LW link

How Cardioid Are Cardioids?

jefftk 12 Feb 2023 16:20 UTC

9 points

0 comments 2 min read LW link

(www.jefftk.com)

How many of these jobs will have a 15% or more drop in employment plausibly attributable to AI by 2031?

tailcalled 12 Feb 2023 15:40 UTC

12 points

5 comments 1 min read LW link

(manifold.markets)

Human-AI collaborative writing

DirectedEvolution 12 Feb 2023 14:57 UTC

20 points

2 comments 5 min read LW link

RaD-AI workshop

Ram Rachum 12 Feb 2023 12:46 UTC

3 points

0 comments 1 min read LW link

Elements of Rationalist Discourse

Rob Bensinger 12 Feb 2023 7:58 UTC

226 points

49 comments 3 min read LW link 1 review

Conflict Theory of Bounded Distrust

Zack_M_Davis 12 Feb 2023 5:30 UTC

112 points

33 comments 3 min read LW link 1 review

Why almost every RL agent does learned optimization

Lee Sharkey 12 Feb 2023 4:58 UTC

32 points

3 comments 5 min read LW link

How I Learn From Textbooks

DirectedEvolution 12 Feb 2023 4:45 UTC

26 points

3 comments 8 min read LW link

Top YouTube channel Veritasium releases video on Sleeping Beauty Problem

Alex_Altair 11 Feb 2023 20:36 UTC

25 points

22 comments 1 min read LW link

(www.youtube.com)

Shortening Timelines: There’s No Buffer Anymore

Jeff Rose 11 Feb 2023 19:53 UTC

10 points

5 comments 1 min read LW link

We Found An Neuron in GPT-2

Joseph Miller and Clement Neo

11 Feb 2023 18:27 UTC

143 points

23 comments 7 min read LW link

(clementneo.com)

The Practitioner’s Path 2.0: the Pragmatist Archetype

Evenflair 11 Feb 2023 15:48 UTC

21 points

0 comments 2 min read LW link

(guildoftherose.org)

The Illusion of Simplicity: Monetary Policy as a Problem of Complexity and Alignment

Edward P. Könings 11 Feb 2023 15:04 UTC

8 points

0 comments 8 min read LW link

(edwardknings.substack.com)

In Defense of Chatbot Romance

Kaj_Sotala 11 Feb 2023 14:30 UTC

127 points

53 comments 11 min read LW link

(kajsotala.fi)

Threatening to do the impossible: A solution to spurious counterfactuals for functional decision theory via proof theory

Christopher King 11 Feb 2023 7:57 UTC

5 points

4 comments 5 min read LW link

Rationality-related things I don’t know as of 2023

Biff Wiff 11 Feb 2023 6:04 UTC

64 points

59 comments 3 min read LW link

A note on ‘semiotic physics’

metasemi 11 Feb 2023 5:12 UTC

11 points

13 comments 6 min read LW link

Inequality Penalty: Morality in Many Worlds

Shmi 11 Feb 2023 4:08 UTC

11 points

17 comments 6 min read LW link

The Importance of AI Alignment, explained in 5 points

Daniel_Eth 11 Feb 2023 2:56 UTC

33 points

2 comments 13 min read LW link