Koan: divining alien datastructures from RAM activations

TsviBT 5 Apr 2024 18:04 UTC

65 points

10 comments 21 min read LW link

On the 2nd CWT with Jonathan Haidt

Zvi 5 Apr 2024 17:30 UTC

27 points

3 comments 33 min read LW link

(thezvi.wordpress.com)

End-to-end hacking with language models

tchauvin 5 Apr 2024 15:06 UTC

29 points

0 comments 8 min read LW link

Partial value takeover without world takeover

KatjaGrace 5 Apr 2024 6:20 UTC

92 points

26 comments 3 min read LW link 1 review

(worldspiritsockpuppet.com)

On Complexity Science

Garrett Baker 5 Apr 2024 2:24 UTC

54 points

21 comments 4 min read LW link

Using game theory to elect a centrist in the 2024 US Presidential Election

Ebenezer Dukakis 5 Apr 2024 0:46 UTC

−1 points

0 comments 8 min read LW link

New report: A review of the empirical evidence for existential risk from AI via misaligned power-seeking

Harlan and rosehadshar

4 Apr 2024 23:41 UTC

31 points

5 comments 1 min read LW link

(blog.aiimpacts.org)

Quick evidence review of bulking & cutting

jp 4 Apr 2024 21:43 UTC

41 points

9 comments 4 min read LW link 2 reviews

LLMs for Alignment Research: a safety priority?

abramdemski 4 Apr 2024 20:03 UTC

148 points

25 comments 11 min read LW link

On Leif Wenar’s Absurdly Unconvincing Critique Of Effective Altruism

Bentham's Bulldog 4 Apr 2024 19:01 UTC

8 points

2 comments 14 min read LW link

Run evals on base models too!

orthonormal 4 Apr 2024 18:43 UTC

51 points

6 comments 1 min read LW link

Let’s Fund: Impact of our $1M crowdfunded grant to the Center for Clean Energy Innovation

Hauke Hillebrandt 4 Apr 2024 16:28 UTC

5 points

0 comments 5 min read LW link

(lets-fund.org)

The Buckling World Hypothesis—Visualising Vulnerable Worlds

Rosco-Hunter 4 Apr 2024 15:51 UTC

−5 points

2 comments 4 min read LW link

Can AI Transform the Electorate into a Citizen’s Assembly?

Rosco-Hunter 4 Apr 2024 15:45 UTC

−6 points

0 comments 4 min read LW link

AI Discrimination Requirements: A Regulatory Review

Deric Cheng and Elliot Mckernon

4 Apr 2024 15:43 UTC

7 points

0 comments 6 min read LW link

Trying to Do More Good

jefftk 4 Apr 2024 14:20 UTC

18 points

0 comments 12 min read LW link

(www.jefftk.com)

Language and Capabilities: Testing LLM Mathematical Abilities Across Languages

Ethan Edwards 4 Apr 2024 13:18 UTC

24 points

2 comments 36 min read LW link

AI #58: Stargate AGI

Zvi 4 Apr 2024 13:10 UTC

49 points

9 comments 60 min read LW link

(thezvi.wordpress.com)

Cult of equilibrium

Templarrr 4 Apr 2024 9:19 UTC

13 points

2 comments 1 min read LW link

[Question] Should you refuse this bet in Technicolor Sleeping Beauty?

Ape in the coat 4 Apr 2024 8:55 UTC

16 points

15 comments 1 min read LW link

[Question] What’s with all the bans recently?

Gerald Monroe 4 Apr 2024 6:16 UTC

63 points

83 comments 4 min read LW link

Best in Class Life Improvement

sapphire 4 Apr 2024 1:51 UTC

77 points

20 comments 1 min read LW link

[Question] What is the purpose and application of AI Debate?

VojtaKovarik 4 Apr 2024 0:38 UTC

13 points

9 comments 1 min read LW link

Concrete empirical research projects in mechanistic anomaly detection

Erik Jenner, Viktor Rehnberg and Oliver Daniels

3 Apr 2024 23:07 UTC

43 points

3 comments 10 min read LW link

A gentle introduction to mechanistic anomaly detection

Erik Jenner 3 Apr 2024 23:06 UTC

74 points

2 comments 11 min read LW link

$250K in Prizes: SafeBench Competition Announcement

ozhang 3 Apr 2024 22:07 UTC

26 points

0 comments 1 min read LW link

The Case for Predictive Models

Rubi J. Hudson 3 Apr 2024 18:22 UTC

43 points

7 comments 8 min read LW link

Book Review (mini): Co-Intelligence by Ethan Mollick

Darren McKee 3 Apr 2024 17:33 UTC

9 points

2 comments 1 min read LW link

Sparsify: A mechanistic interpretability research agenda

Lee Sharkey 3 Apr 2024 12:34 UTC

97 points

23 comments 22 min read LW link

Just because 2 things are opposites, doesn’t mean they’re just the same but flipped

Alok Singh 3 Apr 2024 8:59 UTC

20 points

18 comments 2 min read LW link

(alok.github.io)

Falling fertility explanations and Israel

Yair Halberstadt 3 Apr 2024 3:27 UTC

31 points

5 comments 2 min read LW link

Nature is an infinite sphere whose center is everywhere and circumference is nowhere

Alok Singh 3 Apr 2024 2:24 UTC

11 points

2 comments 3 min read LW link

The Rationalist Haggadot Collection

maia 2 Apr 2024 20:02 UTC

28 points

1 comment 1 min read LW link

(tigrennatenn.neocities.org)

[Question] How Often Does ¬Correlation ⇏ ¬Causation?

niplav 2 Apr 2024 17:58 UTC

19 points

17 comments 2 min read LW link

[EA xpost] The Rationale-Shaped Hole At The Heart Of Forecasting

dschwarz 2 Apr 2024 17:40 UTC

23 points

2 comments 2 min read LW link

(forum.effectivealtruism.org)

Religion = Cult + Culture

Eneasz 2 Apr 2024 16:44 UTC

17 points

9 comments 4 min read LW link

(deathisbad.substack.com)

BIDA Election Thoughts

jefftk 2 Apr 2024 15:30 UTC

9 points

0 comments 1 min read LW link

(www.jefftk.com)

Fertility Roundup #3

Zvi 2 Apr 2024 14:50 UTC

19 points

11 comments 31 min read LW link

(thezvi.wordpress.com)

What can we learn about childrearing from J. S. Mill?

Adam Scherlis 2 Apr 2024 6:06 UTC

11 points

2 comments 1 min read LW link

OMMC Announces RIP

Adam Scholl and aysja

1 Apr 2024 23:20 UTC

194 points

6 comments 2 min read LW link 1 review

Coherence of Caches and Agents

johnswentworth 1 Apr 2024 23:04 UTC

80 points

13 comments 11 min read LW link

LessWrong: After Dark, a new side of LessWrong

So8res 1 Apr 2024 22:44 UTC

36 points

6 comments 1 min read LW link

Gradient Descent on the Human Brain

Jozdien and gaspode

1 Apr 2024 22:39 UTC

61 points

5 comments 2 min read LW link

[Question] Do I count as e/acc for exclusion purposes?

denyeverywhere 1 Apr 2024 21:18 UTC

1 point

31 comments 1 min read LW link

Self Explaining Neural Networks, the interpretability technique no one seems to be talking about.

f3mi 1 Apr 2024 20:52 UTC

6 points

0 comments 4 min read LW link

Death with Awesomeness

osmarks 1 Apr 2024 20:24 UTC

6 points

2 comments 2 min read LW link

[GPT-4] On the Gradual Emergence of Mechanized Intellect: A Treatise from the Year 1924

tailcalled 1 Apr 2024 19:14 UTC

11 points

0 comments 2 min read LW link

Notes on Dwarkesh Patel’s Podcast with Sholto Douglas and Trenton Bricken

Zvi 1 Apr 2024 19:10 UTC

41 points

1 comment 16 min read LW link

(thezvi.wordpress.com)

So You Created a Sociopath—New Book Announcement!

Garrett Baker 1 Apr 2024 18:02 UTC

53 points

3 comments 1 min read LW link

Announcing Suffering For Good

Garrett Baker 1 Apr 2024 17:08 UTC

76 points

5 comments 1 min read LW link