Best-Responding Is Not Always the Best Response

StrivingForLegibility Jan 4, 2024, 11:30 PM

10 points

0 comments 3 min read LW link

Safety Data Sheets for Optimization Processes

StrivingForLegibility Jan 4, 2024, 11:30 PM

15 points

1 comment 4 min read LW link

The Gears of Argmax

StrivingForLegibility Jan 4, 2024, 11:30 PM

11 points

0 comments 3 min read LW link

Cellular reprogramming, pneumatic launch systems, and terraforming Mars: Some things I learned about at Foresight Vision Weekend

jasoncrawford Jan 4, 2024, 7:33 PM

28 points

0 comments 8 min read LW link

(rootsofprogress.org)

Deep atheism and AI risk

Joe Carlsmith Jan 4, 2024, 6:58 PM

153 points

22 comments 27 min read LW link

Some Vacation Photos

johnswentworth Jan 4, 2024, 5:15 PM

83 points

0 comments 1 min read LW link

AISN #29: Progress on the EU AI Act Plus, the NY Times sues OpenAI for Copyright Infringement, and Congressional Questions about Research Standards in AI Safety

Dan H and Corin Katzke

Jan 4, 2024, 4:09 PM

8 points

0 comments 6 min read LW link

(newsletter.safe.ai)

EAG Bay Area Satellite event: AI Institution Design Hackathon 2024

beatrice@foresight.org Jan 4, 2024, 3:02 PM

1 point

0 comments 1 min read LW link

AI #45: To Be Determined

Zvi Jan 4, 2024, 3:00 PM

52 points

4 comments 31 min read LW link

(thezvi.wordpress.com)

Screen-supported Portable Monitor

jefftk Jan 4, 2024, 1:50 PM

16 points

10 comments 1 min read LW link

(www.jefftk.com)

[Question] Which investments for aligned-AI outcomes?

tailcalled Jan 4, 2024, 1:28 PM

8 points

9 comments 2 min read LW link

Non-alignment project ideas for making transformative AI go well

Lukas Finnveden Jan 4, 2024, 7:23 AM

44 points

1 comment LW link

(www.forethought.org)

Fact Checking and Retaliation Against Sources

jefftk Jan 4, 2024, 12:41 AM

7 points

2 comments 4 min read LW link

(www.jefftk.com)

Investigating Alternative Futures: Human and Superintelligence Interaction Scenarios

Hiroshi Yamakawa Jan 3, 2024, 11:46 PM

1 point

0 comments 17 min read LW link

“Attitudes Toward Artificial General Intelligence: Results from American Adults 2021 and 2023”—call for reviewers (Seeds of Science)

rogersbacon Jan 3, 2024, 8:11 PM

4 points

0 comments 1 min read LW link

What’s up with LLMs representing XORs of arbitrary features?

Sam Marks Jan 3, 2024, 7:44 PM

158 points

63 comments 16 min read LW link

Spirit Airlines Merger Play

sapphire Jan 3, 2024, 7:25 PM

5 points

12 comments 1 min read LW link

$300 for the best sci-fi prompt: the results

RomanS Jan 3, 2024, 7:10 PM

16 points

19 comments 7 min read LW link

Agent membranes/boundaries and formalizing “safety”

Chipmonk Jan 3, 2024, 5:55 PM

26 points

46 comments 3 min read LW link

Safety First: safety before full alignment. The deontic sufficiency hypothesis.

Chipmonk Jan 3, 2024, 5:55 PM

48 points

3 comments 3 min read LW link

Practically A Book Review: Appendix to “Nonlinear’s Evidence: Debunking False and Misleading Claims” (ThingOfThings)

tailcalled Jan 3, 2024, 5:07 PM

111 points

25 comments 2 min read LW link

(thingofthings.substack.com)

Trivial Mathematics as a Path Forward

ACrackedPot Jan 3, 2024, 4:41 PM

−4 points

2 comments 2 min read LW link

Copyright Confrontation #1

Zvi Jan 3, 2024, 3:50 PM

34 points

7 comments 18 min read LW link

(thezvi.wordpress.com)

[Question] Theoretically, could we balance the budget painlessly?

Logan Zoellner Jan 3, 2024, 2:46 PM

4 points

12 comments 1 min read LW link

Johannes’ Biography

Johannes C. Mayer Jan 3, 2024, 1:27 PM

24 points

0 comments 10 min read LW link

What Helped Me—Kale, Blood, CPAP, X-tiamine, Methylphenidate

Johannes C. Mayer Jan 3, 2024, 1:22 PM

35 points

12 comments 2 min read LW link

[Question] Does LessWrong make a difference when it comes to AI alignment?

PhilosophicalSoul Jan 3, 2024, 12:21 PM

18 points

13 comments 1 min read LW link

[Question] Terminology: <something>-ware for ML?

Oliver Sourbut Jan 3, 2024, 11:42 AM

17 points

27 comments 1 min read LW link

Trading off Lives

jefftk Jan 3, 2024, 3:40 AM

53 points

12 comments 2 min read LW link

(www.jefftk.com)

MonoPoly Restricted Trust

ymeskhout Jan 2, 2024, 11:02 PM

42 points

37 comments 9 min read LW link

Agent membranes and causal distance

Chipmonk Jan 2, 2024, 10:43 PM

20 points

3 comments 3 min read LW link

Focusing on Mal-Alignment

John Fisher Jan 2, 2024, 7:51 PM

1 point

0 comments 1 min read LW link

Gentleness and the artificial Other

Joe Carlsmith Jan 2, 2024, 6:21 PM

313 points

33 comments 11 min read LW link

Otherness and control in the age of AGI

Joe Carlsmith Jan 2, 2024, 6:15 PM

43 points

0 comments 7 min read LW link

Apologizing is a Core Rationalist Skill

johnswentworth Jan 2, 2024, 5:47 PM

156 points

42 comments 5 min read LW link

Cortés, AI Risk, and the Dynamics of Competing Conquerors

James_Miller Jan 2, 2024, 4:37 PM

14 points

2 comments 3 min read LW link

OpenAI’s Preparedness Framework: Praise & Recommendations

Orpheus16 Jan 2, 2024, 4:20 PM

66 points

1 comment 7 min read LW link

Dating Roundup #2: If At First You Don’t Succeed

Zvi Jan 2, 2024, 4:00 PM

54 points

29 comments 47 min read LW link

(thezvi.wordpress.com)

Looking for Reading Recommendations: Content Moderation, Power & Censorship

Joerg Weiss Jan 2, 2024, 11:37 AM

2 points

7 comments 1 min read LW link

AI Is Not Software

Davidmanheim Jan 2, 2024, 7:58 AM

58 points

29 comments 5 min read LW link

Are Metaculus AI Timelines Inconsistent?

Chris_Leong Jan 2, 2024, 6:47 AM

17 points

7 comments 2 min read LW link

Boston Solstice 2023 Retrospective

jefftk Jan 2, 2024, 3:10 AM

33 points

0 comments 6 min read LW link

(www.jefftk.com)

Steering Llama-2 with contrastive activation additions

Nina Panickssery, Wuschel Schulz, NickGabs, Meg, evhub and TurnTrout

2 Jan 2024 0:47 UTC

125 points

29 comments 8 min read LW link

(arxiv.org)

Twin Cities ACX Meetup—January 2024

Timothy M. 1 Jan 2024 21:13 UTC

1 point

2 comments 1 min read LW link

San Francisco ACX Meetup “First Saturday”

guenael 1 Jan 2024 20:58 UTC

1 point

1 comment 1 min read LW link

Mech Interp Challenge: January—Deciphering the Caesar Cipher Model

CallumMcDougall 1 Jan 2024 18:03 UTC

17 points

0 comments 3 min read LW link

Aldix and the Book of Life

ville 1 Jan 2024 17:23 UTC

1 point

0 comments 4 min read LW link

(medium.com)

Metaculus Hosts ACX 2024 Prediction Contest

ChristianWilliams 1 Jan 2024 16:38 UTC

4 points

0 comments LW link

(www.metaculus.com)

The Act Itself: Exceptionless Moral Norms

SebastianG 1 Jan 2024 16:06 UTC

5 points

11 comments 6 min read LW link

Deception Chess

Chris Land 1 Jan 2024 15:40 UTC

7 points

2 comments 4 min read LW link