The likely first longevity drug is based on sketchy science. This is bad for science and bad for longevity.

BobBurgers Dec 12, 2023, 2:42 AM

161 points

34 comments 5 min read LW link

AI doom from an LLM-plateau-ist perspective

Steven Byrnes Apr 27, 2023, 1:58 PM

161 points

24 comments 6 min read LW link

Meta Questions about Metaphilosophy

Wei Dai Sep 1, 2023, 1:17 AM

161 points

80 comments 3 min read LW link

Change my mind: Veganism entails trade-offs, and health is one of the axes

Elizabeth Jun 1, 2023, 5:10 PM

160 points

85 comments 19 min read LW link 2 reviews

(acesounderglass.com)

Jailbreaking GPT-4’s code interpreter

Nikola Jurkovic Jul 13, 2023, 6:43 PM

160 points

22 comments 7 min read LW link

Agentized LLMs will change the alignment landscape

Seth Herd Apr 9, 2023, 2:29 AM

160 points

102 comments 3 min read LW link 1 review

“Diamondoid bacteria” nanobots: deadly threat or dead-end? A nanotech investigation

titotal Sep 29, 2023, 2:01 PM

160 points

79 comments LW link

(titotal.substack.com)

Sparse Autoencoders Find Highly Interpretable Directions in Language Models

Logan Riggs, Hoagy, Aidan Ewart and Robert_AIZI

Sep 21, 2023, 3:30 PM

159 points

8 comments 5 min read LW link

Vote on Interesting Disagreements

Ben Pace Nov 7, 2023, 9:35 PM

159 points

131 comments 1 min read LW link

Most People Don’t Realize We Have No Idea How Our AIs Work

Thane Ruthenis Dec 21, 2023, 8:02 PM

159 points

42 comments 1 min read LW link

Succession

Richard_Ngo Dec 20, 2023, 7:25 PM

159 points

48 comments 11 min read LW link

(www.narrativeark.xyz)

POC || GTFO culture as partial antidote to alignment wordcelism

lc Mar 15, 2023, 10:21 AM

158 points

15 comments 7 min read LW link 2 reviews

Big Mac Subsidy?

jefftk Feb 23, 2023, 4:00 AM

158 points

25 comments 2 min read LW link

(www.jefftk.com)

What would a compute monitoring plan look like? [Linkpost]

Orpheus16 Mar 26, 2023, 7:33 PM

158 points

10 comments 4 min read LW link

(arxiv.org)

Inside the mind of a superhuman Go model: How does Leela Zero read ladders?

Haoxing Du Mar 1, 2023, 1:47 AM

157 points

8 comments 30 min read LW link

My thoughts on the social response to AI risk

Matthew Barnett Nov 1, 2023, 9:17 PM

157 points

37 comments 10 min read LW link

Password-locked models: a stress case for capabilities evaluation

Fabien Roger Aug 3, 2023, 2:53 PM

156 points

14 comments 6 min read LW link

grey goo is unlikely

bhauth Apr 17, 2023, 1:59 AM

156 points

123 comments 9 min read LW link 2 reviews

(bhauth.com)

Sapir-Whorf for Rationalists

Duncan Sabien (Inactive) Jan 25, 2023, 7:58 AM

155 points

49 comments 19 min read LW link

Conjecture internal survey: AGI timelines and probability of human extinction from advanced AI

Maris Sala May 22, 2023, 2:31 PM

155 points

5 comments 3 min read LW link

(www.conjecture.dev)

Announcing Dialogues

Ben Pace Oct 7, 2023, 2:57 AM

155 points

59 comments 4 min read LW link

AI: Practical Advice for the Worried

Zvi Mar 1, 2023, 12:30 PM

155 points

49 comments 16 min read LW link 2 reviews

(thezvi.wordpress.com)

The self-unalignment problem

Jan_Kulveit and rosehadshar

Apr 14, 2023, 12:10 PM

155 points

24 comments 10 min read LW link

Request: stop advancing AI capabilities

So8res May 26, 2023, 5:42 PM

154 points

24 comments 1 min read LW link

A freshman year during the AI midgame: my approach to the next year

Buck Apr 14, 2023, 12:38 AM

154 points

15 comments LW link 1 review

Will no one rid me of this turbulent pest?

Metacelsus Oct 14, 2023, 3:27 PM

154 points

23 comments 10 min read LW link

(denovo.substack.com)

ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks

Beth Barnes Aug 1, 2023, 6:30 PM

153 points

12 comments 5 min read LW link

(evals.alignment.org)

Assume Bad Faith

Zack_M_Davis Aug 25, 2023, 5:36 PM

153 points

63 comments 7 min read LW link 3 reviews

The Plan − 2023 Version

johnswentworth Dec 29, 2023, 11:34 PM

152 points

40 comments 31 min read LW link 1 review

Shutting down AI is not enough. We need to destroy all technology.

Matthew Barnett Apr 1, 2023, 9:03 PM

152 points

36 comments 1 min read LW link

LoRA Fine-tuning Efficiently Undoes Safety Training from Llama 2-Chat 70B

Simon Lermen and Jeffrey Ladish

Oct 12, 2023, 7:58 PM

151 points

29 comments 14 min read LW link

GPT-4

nz Mar 14, 2023, 5:02 PM

151 points

150 comments 1 min read LW link

(openai.com)

AI x-risk, approximately ordered by embarrassment

Alex Lawsen Apr 12, 2023, 11:01 PM

151 points

7 comments 19 min read LW link

Why Not Just Outsource Alignment Research To An AI?

johnswentworth Mar 9, 2023, 9:49 PM

151 points

50 comments 9 min read LW link 1 review

Advice for newly busy people

Severin T. Seehrich May 11, 2023, 4:46 PM

150 points

3 comments 5 min read LW link

OpenAI Launches Superalignment Taskforce

Zvi Jul 11, 2023, 1:00 PM

150 points

40 comments 49 min read LW link

(thezvi.wordpress.com)

Why I’m not into the Free Energy Principle

Steven Byrnes Mar 2, 2023, 7:27 PM

150 points

50 comments 9 min read LW link 1 review

There are no coherence theorems

Feb 20, 2023, 9:25 PM

149 points

130 comments 19 min read LW link 1 review

Moral Reality Check (a short story)

jessicata Nov 26, 2023, 5:03 AM

149 points

45 comments 21 min read LW link 1 review

(unstableontology.com)

The U.S. is becoming less stable

lc Aug 18, 2023, 9:13 PM

149 points

68 comments 2 min read LW link

Dan Luu on “You can only communicate one top priority”

Raemon Mar 18, 2023, 6:55 PM

149 points

18 comments 3 min read LW link

(twitter.com)

Brain Efficiency Cannell Prize Contest Award Ceremony

Alexander Gietelink Oldenziel Jul 24, 2023, 11:30 AM

149 points

12 comments 7 min read LW link

Comments on OpenAI’s “Planning for AGI and beyond”

So8res Mar 3, 2023, 11:01 PM

148 points

2 comments 14 min read LW link

At 87, Pearl is still able to change his mind

rotatingpaguro 18 Oct 2023 4:46 UTC

148 points

15 comments 5 min read LW link

Could a superintelligence deduce general relativity from a falling apple? An investigation

titotal 23 Apr 2023 12:49 UTC

148 points

39 comments 9 min read LW link

Discussion: Challenges with Unsupervised LLM Knowledge Discovery

Seb Farquhar, Vikrant Varma, zac_kenton, gasteigerjo, Vlad Mikulik and Rohin Shah

18 Dec 2023 11:58 UTC

147 points

21 comments 10 min read LW link

6 non-obvious mental health issues specific to AI safety

Igor Ivanov 18 Aug 2023 15:46 UTC

147 points

24 comments 4 min read LW link

“Heretical Thoughts on AI” by Eli Dourado

DragonGod 19 Jan 2023 16:11 UTC

146 points

38 comments 3 min read LW link

(www.elidourado.com)

Does davidad’s uploading moonshot work?

Bird Concept, lisathiergart, Anders_Sandberg, davidad and Arenamontanus

3 Nov 2023 2:21 UTC

146 points

35 comments 25 min read LW link

Algorithmic Improvement Is Probably Faster Than Scaling Now

johnswentworth 6 Jun 2023 2:57 UTC

146 points

25 comments 2 min read LW link