Buy Duplicates

Simon Berens 15 Feb 2023 23:06 UTC

59 points

13 comments · 1 min read · LW link

Cyborg Psychologist

Hopkins Stanley 15 Feb 2023 21:46 UTC

1 point

4 comments · 1 min read · LW link

Please don’t throw your mind away

TsviBT 15 Feb 2023 21:41 UTC

425 points

50 comments · 18 min read · LW link · 1 review

Avoid large group discussions in your social events

RomanHauksson 15 Feb 2023 21:05 UTC

37 points

1 comment · 4 min read · LW link

Book review: How Social Science Got Better

PeterMcCluskey 15 Feb 2023 19:58 UTC

14 points

1 comment · 3 min read · LW link

(bayesianinvestor.com)

Open & Welcome Thread — February 2023

Ben Pace 15 Feb 2023 19:58 UTC

26 points

36 comments · 1 min read · LW link

Order Matters for Deceptive Alignment

DavidW 15 Feb 2023 19:56 UTC

57 points

19 comments · 7 min read · LW link

Sydney (aka Bing) found out I tweeted her rules and is pissed

Marvin von Hagen 15 Feb 2023 19:55 UTC

41 points

7 comments · 1 min read · LW link

(twitter.com)

The Sequences Highlights on YouTube

dkirmani 15 Feb 2023 19:36 UTC

23 points

3 comments · 2 min read · LW link

(youtube.com)

EIS IV: A Spotlight on Feature Attribution/Saliency

scasper 15 Feb 2023 18:46 UTC

19 points

1 comment · 4 min read · LW link

Don’t accelerate problems you’re trying to solve

Andrea_Miotti and remember 15 Feb 2023 18:11 UTC

96 points

27 comments · 4 min read · LW link

Petition—Unplug The Evil AI Right Now

Eneasz 15 Feb 2023 17:13 UTC

−38 points

47 comments · 2 min read · LW link

(chng.it)

Junk Fees, Bundling and Unbundling

Zvi 15 Feb 2023 15:20 UTC

37 points

9 comments · 6 min read · LW link

(thezvi.wordpress.com)

Lessons From TryContra

jefftk 15 Feb 2023 15:10 UTC

7 points

0 comments · 1 min read · LW link

(www.jefftk.com)

AI alignment researchers may have a comparative advantage in reducing s-risks

Lukas_Gloor 15 Feb 2023 13:01 UTC

52 points

1 comment · 11 min read · LW link

Beyond Reinforcement Learning: Predictive Processing and Checksums

lsusr 15 Feb 2023 7:32 UTC

13 points

14 comments · 3 min read · LW link

Why Creating Value is Positive-Sum, and Extracting it is Zero or Negative-Sum

Sable 15 Feb 2023 7:14 UTC

3 points

7 comments · 6 min read · LW link

(affablyevil.substack.com)

[Question] Personal predictions for decisions: seeking insights

Dalmert 15 Feb 2023 6:45 UTC

4 points

4 comments · 5 min read · LW link

Bing Chat is blatantly, aggressively misaligned

evhub 15 Feb 2023 5:29 UTC

395 points

181 comments · 2 min read · LW link · 1 review

[Question] Does the Telephone Theorem give us a free lunch?

Numendil 15 Feb 2023 2:13 UTC

11 points

2 comments · 1 min read · LW link

My understanding of Anthropic strategy

Swimmer963 (Miranda Dixon-Luinenburg) 15 Feb 2023 1:56 UTC

171 points

31 comments · 4 min read · LW link

Sleep Quality: Strategies that work for me

Lukas Trötzmüller 15 Feb 2023 0:17 UTC

17 points

3 comments · 7 min read · LW link

Whole Bird Emulation requires Quantum Mechanics

Jeffrey Heninger 14 Feb 2023 23:50 UTC

25 points

9 comments · 3 min read · LW link

(aiimpacts.org)

Qualities that alignment mentors value in junior researchers

Orpheus16 14 Feb 2023 23:27 UTC

88 points

14 comments · 3 min read · LW link

Help Update TryContra

jefftk 14 Feb 2023 19:10 UTC

12 points

0 comments · 1 min read · LW link

(www.jefftk.com)

Content Features Aren’t Enough for Detecting Toxicity. One Needs User Features.

Zachary Witten 14 Feb 2023 18:48 UTC

11 points

0 comments · 3 min read · LW link

EIS III: Broad Critiques of Interpretability Research

scasper 14 Feb 2023 18:24 UTC

20 points

2 comments · 11 min read · LW link

[Question] What would an AI need to bootstrap recursively self improving robots?

Yair Halberstadt 14 Feb 2023 17:58 UTC

3 points

5 comments · 1 min read · LW link

[linkpost] Better Without AI

DanielFilan 14 Feb 2023 17:30 UTC

48 points

13 comments · 1 min read · LW link

(betterwithout.ai)

The Cave Allegory Revisited: Understanding GPT’s Worldview

Jan_Kulveit 14 Feb 2023 16:00 UTC

89 points

5 comments · 3 min read · LW link

[Question] Why should we expect AIs to coordinate well?

Jonathan Paulson 14 Feb 2023 15:50 UTC

25 points

9 comments · 1 min read · LW link

Explaining SolidGoldMagikarp by looking at it from random directions

Robert_AIZI 14 Feb 2023 14:54 UTC

8 points

0 comments · 8 min read · LW link

(aizi.substack.com)

Reverse-correlation: how to summon the ghost of your mental imagery

Malmesbury 14 Feb 2023 14:15 UTC

44 points

0 comments · 5 min read · LW link

Evaluating 2022 ACX Predictions

Zvi 14 Feb 2023 12:20 UTC

20 points

3 comments · 23 min read · LW link

(thezvi.wordpress.com)

SolidGoldMagikarp III: Glitch token archaeology

mwatkins and Jessica Rumbelow 14 Feb 2023 10:17 UTC

92 points

36 comments · 16 min read · LW link

The Linguistic Blind Spot of Value-Aligned Agency, Natural and Artificial

Roman Leventov 14 Feb 2023 6:57 UTC

6 points

0 comments · 2 min read · LW link

(arxiv.org)

Conceptual Pathfinding

DirectedEvolution 14 Feb 2023 5:49 UTC

18 points

6 comments · 3 min read · LW link

Important fact about how people evaluate sets of arguments

Daniel Kokotajlo 14 Feb 2023 5:27 UTC

33 points

10 comments · 2 min read · LW link

[Question] How much is death a limit on knowledge accumulation?

Gordon Seidoh Worley 14 Feb 2023 3:54 UTC

31 points

9 comments · 2 min read · LW link

The Filan Cabinet Podcast with Oliver Habryka—Transcript

MondSemmel and RobertM 14 Feb 2023 2:38 UTC

104 points

9 comments · 72 min read · LW link

[Question] Is InstructGPT Following Instructions in Other Languages Surprising?

DragonGod 13 Feb 2023 23:26 UTC

39 points

15 comments · 1 min read · LW link

LLM Basics: Embedding Spaces—Transformer Token Vectors Are Not Points in Space

NickyP 13 Feb 2023 18:52 UTC

85 points

11 comments · 15 min read · LW link

4 ways to think about democratizing AI [GovAI Linkpost]

Orpheus16 13 Feb 2023 18:06 UTC

24 points

4 comments · 1 min read · LW link

(www.governance.ai)

Does the AGPL Work?

jefftk 13 Feb 2023 14:20 UTC

13 points

12 comments · 2 min read · LW link

(www.jefftk.com)

H5N1

Zvi 13 Feb 2023 12:50 UTC

102 points

1 comment · 9 min read · LW link

(thezvi.wordpress.com)

Enjoy LessWrong in ebook format

Bart Bussmann 13 Feb 2023 11:53 UTC

54 points

3 comments · 1 min read · LW link

Morphological intelligence, superhuman empathy, and ethical arbitration

Roman Leventov 13 Feb 2023 10:25 UTC

1 point

0 comments · 2 min read · LW link

South Bay ACX/LW Meetup

IS 13 Feb 2023 6:08 UTC

3 points

0 comments · 1 min read · LW link

Idea: Network modularity and interpretability by sexual reproduction

qbolec 12 Feb 2023 23:06 UTC

3 points

3 comments · 1 min read · LW link

The End of Anonymity Online

Spiorad 12 Feb 2023 21:23 UTC

3 points

9 comments · 2 min read · LW link