LessWrong Archive: July 2023 (Page 1)
Douglas Hofstadter changes his mind on Deep Learning & AI risk (June 2023)? · gwern · Jul 3, 2023, 12:48 AM · 426 points · 54 comments · 7 min read · LW link (www.youtube.com)
Alignment Grantmaking is Funding-Limited Right Now · johnswentworth · Jul 19, 2023, 4:49 PM · 312 points · 68 comments · 1 min read · LW link
Accidentally Load Bearing · jefftk · Jul 13, 2023, 4:10 PM · 287 points · 18 comments · 1 min read · LW link (www.jefftk.com) · 1 review
Yes, It’s Subjective, But Why All The Crabs? · johnswentworth · Jul 28, 2023, 7:35 PM · 250 points · 15 comments · 6 min read · LW link
Cultivating a state of mind where new ideas are born · Henrik Karlsson · Jul 27, 2023, 9:16 AM · 244 points · 21 comments · 14 min read · LW link (www.henrikkarlsson.xyz) · 2 reviews
Self-driving car bets · paulfchristiano · Jul 29, 2023, 6:10 PM · 236 points · 44 comments · 5 min read · LW link (sideways-view.com)
Ways I Expect AI Regulation To Increase Extinction Risk · 1a3orn · Jul 4, 2023, 5:32 PM · 226 points · 32 comments · 7 min read · LW link
Consciousness as a conflationary alliance term for intrinsically valued internal experiences · Andrew_Critch · Jul 10, 2023, 8:09 AM · 214 points · 54 comments · 11 min read · LW link · 2 reviews
My “2.9 trauma limit” · Raemon · Jul 1, 2023, 7:32 PM · 196 points · 31 comments · 7 min read · LW link
Towards Developmental Interpretability · Jesse Hoogland, Alexander Gietelink Oldenziel, Daniel Murfet and Stan van Wingerden · Jul 12, 2023, 7:33 PM · 192 points · 10 comments · 9 min read · LW link · 1 review
Grant applications and grand narratives · Elizabeth · Jul 2, 2023, 12:16 AM · 191 points · 22 comments · 6 min read · LW link
Cryonics and Regret · MvB · Jul 24, 2023, 9:16 AM · 190 points · 35 comments · 2 min read · LW link · 1 review
[Linkpost] Introducing Superalignment · beren · Jul 5, 2023, 6:23 PM · 175 points · 69 comments · 1 min read · LW link (openai.com)
Rationality !== Winning · Raemon · Jul 24, 2023, 2:53 AM · 170 points · 51 comments · 4 min read · LW link
Why it’s so hard to talk about Consciousness · Rafael Harth · Jul 2, 2023, 3:56 PM · 167 points · 215 comments · 9 min read · LW link · 3 reviews
When can we trust model evaluations? · evhub · Jul 28, 2023, 7:42 PM · 166 points · 10 comments · 10 min read · LW link · 1 review
Jailbreaking GPT-4’s code interpreter · Nikola Jurkovic · Jul 13, 2023, 6:43 PM · 160 points · 22 comments · 7 min read · LW link
OpenAI Launches Superalignment Taskforce · Zvi · Jul 11, 2023, 1:00 PM · 150 points · 40 comments · 49 min read · LW link (thezvi.wordpress.com)
Brain Efficiency Cannell Prize Contest Award Ceremony · Alexander Gietelink Oldenziel · Jul 24, 2023, 11:30 AM · 149 points · 12 comments · 7 min read · LW link
The Goddess of Everything Else—The Animation · Writer · Jul 13, 2023, 4:26 PM · 142 points · 4 comments · 1 min read · LW link (youtu.be)
The Seeker’s Game – Vignettes from the Bay · Yulia · Jul 9, 2023, 7:32 PM · 141 points · 19 comments · 16 min read · LW link
Going Crazy and Getting Better Again · Evenstar · Jul 2, 2023, 6:55 PM · 139 points · 13 comments · 7 min read · LW link · 1 review
Ten Levels of AI Alignment Difficulty · Sammy Martin · Jul 3, 2023, 8:20 PM · 138 points · 24 comments · 12 min read · LW link · 1 review
Neuronpedia · Johnny Lin · Jul 26, 2023, 4:29 PM · 135 points · 51 comments · 2 min read · LW link (neuronpedia.org)
How LLMs are and are not myopic · janus · Jul 25, 2023, 2:19 AM · 135 points · 16 comments · 8 min read · LW link
Views on when AGI comes and on strategy to reduce existential risk · TsviBT · Jul 8, 2023, 9:00 AM · 133 points · 61 comments · 14 min read · LW link · 1 review
Introducing Fatebook: the fastest way to make and track predictions · Adam B and Sage Future · Jul 11, 2023, 3:28 PM · 132 points · 41 comments · 1 min read · LW link (fatebook.io) · 2 reviews
Even Superhuman Go AIs Have Surprising Failure Modes · AdamGleave, EuanMcLean, Tony Wang, Kellin Pelrine, Tom Tseng, Yawen Duan, Joseph Miller and MichaelDennis · Jul 20, 2023, 5:31 PM · 130 points · 22 comments · 10 min read · LW link (far.ai)
Reducing sycophancy and improving honesty via activation steering · Nina Panickssery · Jul 28, 2023, 2:46 AM · 122 points · 18 comments · 9 min read · LW link · 1 review
Why was the AI Alignment community so unprepared for this moment? · Ras1513 · Jul 15, 2023, 12:26 AM · 121 points · 65 comments · 2 min read · LW link
“Reframing Superintelligence” + LLMs + 4 years · Eric Drexler · Jul 10, 2023, 1:42 PM · 118 points · 9 comments · 12 min read · LW link
Winners of AI Alignment Awards Research Contest · Orpheus16 and OliviaJ · Jul 13, 2023, 4:14 PM · 115 points · 4 comments · 12 min read · LW link (alignmentawards.com)
Introducing bayescalc.io · Adele Lopez · Jul 7, 2023, 4:11 PM · 115 points · 29 comments · 1 min read · LW link (bayescalc.io)
QAPR 5: grokking is maybe not *that* big a deal? · Quintin Pope · Jul 23, 2023, 8:14 PM · 114 points · 15 comments · 9 min read · LW link
Measuring and Improving the Faithfulness of Model-Generated Reasoning · Ansh Radhakrishnan, tamera, karinanguyen, Sam Bowman and Ethan Perez · Jul 18, 2023, 4:36 PM · 111 points · 15 comments · 6 min read · LW link · 1 review
Priorities for the UK Foundation Models Taskforce · Andrea_Miotti · Jul 21, 2023, 3:23 PM · 105 points · 4 comments · 5 min read · LW link (www.conjecture.dev)
Consider Joining the UK Foundation Model Taskforce · Zvi · Jul 10, 2023, 1:50 PM · 105 points · 12 comments · 1 min read · LW link (thezvi.wordpress.com)
A transcript of the TED talk by Eliezer Yudkowsky · Mikhail Samin · Jul 12, 2023, 12:12 PM · 105 points · 13 comments · 4 min read · LW link
Anthropic Observations · Zvi · Jul 25, 2023, 12:50 PM · 104 points · 1 comment · 10 min read · LW link (thezvi.wordpress.com)
Fixed Point: a love story · Richard_Ngo · Jul 8, 2023, 1:56 PM · 99 points · 2 comments · 7 min read · LW link
Meta-level adversarial evaluation of oversight techniques might allow robust measurement of their adequacy · Buck and ryan_greenblatt · Jul 26, 2023, 5:02 PM · 99 points · 19 comments · 1 min read · LW link · 1 review
When Someone Tells You They’re Lying, Believe Them · ymeskhout · Jul 14, 2023, 12:31 AM · 95 points · 3 comments · 3 min read · LW link
“Justice, Cherryl.” · Zack_M_Davis · Jul 23, 2023, 4:16 PM · 91 points · 21 comments · 9 min read · LW link · 1 review
BCIs and the ecosystem of modular minds · beren · Jul 21, 2023, 3:58 PM · 88 points · 14 comments · 11 min read · LW link
Apollo Neuro Results · Elizabeth · Jul 30, 2023, 6:40 PM · 85 points · 17 comments · 3 min read · LW link (acesounderglass.com)
[Question] What Does LessWrong/EA Think of Human Intelligence Augmentation as of mid-2023? · lukemarks · Jul 8, 2023, 11:42 AM · 84 points · 28 comments · 2 min read · LW link
Underwater Torture Chambers: The Horror Of Fish Farming · omnizoid · Jul 26, 2023, 12:27 AM · 83 points · 50 comments · 10 min read · LW link · 1 review
Sapient Algorithms · Valentine · Jul 17, 2023, 4:30 PM · 82 points · 15 comments · 5 min read · LW link
A $10k retroactive grant for VaccinateCA · Austin Chen · Jul 27, 2023, 6:14 PM · 82 points · 0 comments · LW link (manifund.org)
Compute Thresholds: proposed rules to mitigate risk of a “lab leak” accident during AI training runs · davidad · Jul 22, 2023, 6:09 PM · 80 points · 2 comments · 2 min read · LW link