Announcing the AI Forecasting Benchmark Series | July 8, $120k in Prizes

ChristianWilliams 2 Jul 2024 22:33 UTC

15 points

0 comments 5 min read LW link

(www.metaculus.com)

Open Sourcing Metaculus

ChristianWilliams 2 Jul 2024 22:30 UTC

44 points

0 comments 2 min read LW link

(www.metaculus.com)

[Question] Why Can’t Sub-AGI Solve AI Alignment? Or: Why Would Sub-AGI AI Not be Aligned?

MrThink 2 Jul 2024 20:13 UTC

4 points

23 comments 1 min read LW link

[Question] Why haven’t there been assassination attempts against high profile AI accelerationists like sam altman yet?

louisTrem 2 Jul 2024 18:16 UTC

−13 points

4 comments 2 min read LW link

How ARENA course material gets made

CallumMcDougall 2 Jul 2024 18:04 UTC

41 points

2 comments 7 min read LW link

An AI Race With China Can Be Better Than Not Racing

niplav 2 Jul 2024 17:57 UTC

68 points

36 comments 11 min read LW link

List of Collective Intelligence Projects

Chris Lakin 2 Jul 2024 14:10 UTC

42 points

9 comments 2 min read LW link

(chrislakin.blog)

Decomposing the QK circuit with Bilinear Sparse Dictionary Learning

keith_wynroe and Lee Sharkey

2 Jul 2024 13:17 UTC

87 points

7 comments 12 min read LW link

Economics Roundup #2

Zvi 2 Jul 2024 12:40 UTC

35 points

5 comments 23 min read LW link

(thezvi.wordpress.com)

How Congressional Offices Process Constituent Communication

T_W 2 Jul 2024 12:38 UTC

30 points

1 comment 6 min read LW link 1 review

OthelloGPT learned a bag of heuristics

Jennifer Lin, JackS, Adam Karvonen and Can

2 Jul 2024 9:12 UTC

111 points

10 comments 9 min read LW link

Covert Malicious Finetuning

Tony Wang and dannyhalawi

2 Jul 2024 2:41 UTC

103 points

4 comments 3 min read LW link

Interpreting Preference Models w/ Sparse Autoencoders

Logan Riggs and Jannik Brinkmann

1 Jul 2024 21:35 UTC

75 points

12 comments 9 min read LW link

Honest science is spirituality

pchvykov 1 Jul 2024 20:33 UTC

−1 points

10 comments 4 min read LW link

New Executive Team & Board — PIBBSS

Nora_Ammann 1 Jul 2024 19:30 UTC

43 points

1 comment 1 min read LW link

Uncursing Civilization

Lorec 1 Jul 2024 18:44 UTC

0 points

3 comments 5 min read LW link

[Question] Self-censoring on AI x-risk discussions?

Decaeneus 1 Jul 2024 18:24 UTC

17 points

2 comments 1 min read LW link

Rationalists As People Who Build Piles Of Rocks

Sable 1 Jul 2024 10:32 UTC

11 points

0 comments 5 min read LW link

(affablyevil.substack.com)

How good are LLMs at doing ML on an unknown dataset?

Håvard Tveit Ihle 1 Jul 2024 9:04 UTC

33 points

4 comments 13 min read LW link

Whirlwind Tour of Chain of Thought Literature Relevant to Automating Alignment Research.

sevdeawesome 1 Jul 2024 5:50 UTC

25 points

0 comments 17 min read LW link

Probabilistic Logic ⇔ Oracles?

yudhister 1 Jul 2024 5:36 UTC

23 points

0 comments 4 min read LW link

Important open problems in voting

Closed Limelike Curves 1 Jul 2024 2:53 UTC

33 points

1 comment 1 min read LW link

In Defense of Lawyers Playing Their Part

Isaac King 1 Jul 2024 1:32 UTC

32 points

9 comments 9 min read LW link

Review of METR’s public evaluation protocol

nahoj and JaimeRV

30 Jun 2024 22:03 UTC

10 points

0 comments 5 min read LW link

Superposition, Self-Modeling, and the Path to AGI: A New Perspective

Peterpiper 30 Jun 2024 17:20 UTC

−13 points

0 comments 2 min read LW link

The Xerox Parc/ARPA version of the intellectual Turing test: Class 1 vs Class 2 disagreement

hamishtodd1 30 Jun 2024 15:34 UTC

6 points

3 comments 1 min read LW link

LLMs Universally Learn a Feature Representing Token Frequency / Rarity

Sean Osier 30 Jun 2024 2:48 UTC

13 points

5 comments 6 min read LW link

(github.com)

My 5-step program for losing weight

nsokolsky 30 Jun 2024 1:05 UTC

22 points

20 comments 5 min read LW link

(nsokolsky.substack.com)

Datasets that change the odds you exist

dynomight 29 Jun 2024 18:45 UTC

56 points

4 comments 6 min read LW link

(dynomight.net)

A “Scaling Monosemanticity” Explainer

latterframe and TheoR

29 Jun 2024 17:50 UTC

10 points

0 comments 3 min read LW link

Analysis of key AI analogies

Kevin Kohler 29 Jun 2024 10:55 UTC

10 points

2 comments 15 min read LW link

Georgism Crash Course

Zero Contradictions 29 Jun 2024 6:18 UTC

9 points

5 comments 1 min read LW link

(zerocontradictions.net)

Activation Pattern SVD: A proposal for SAE Interpretability

Daniel Tan 28 Jun 2024 22:12 UTC

15 points

2 comments 2 min read LW link

Podcast: Elizabeth & Austin on “What Manifold was allowed to do”

Austin Chen 28 Jun 2024 22:10 UTC

20 points

0 comments 24 min read LW link

(share.descript.com)

The Incredible Fentanyl-Detecting Machine

sarahconstantin 28 Jun 2024 22:10 UTC

158 points

26 comments 7 min read LW link

(sarahconstantin.substack.com)

Saving Lives Reduces Over-Population—A Counter-Intuitive Non-Zero-Sum Game

James Stephen Brown 28 Jun 2024 19:29 UTC

6 points

0 comments 5 min read LW link

(nonzerosum.games)

Mentorship in AGI Safety: Applications for mentorship are open!

Valentin2026 and Joe Rogero

28 Jun 2024 14:49 UTC

5 points

0 comments 1 min read LW link

Contra Acemoglu on AI

Maxwell Tabarrok 28 Jun 2024 13:13 UTC

48 points

0 comments 5 min read LW link

(www.maximum-progress.com)

Five toy worlds to think about heritability

David Hugh-Jones 28 Jun 2024 13:11 UTC

13 points

0 comments 9 min read LW link

(wyclif.substack.com)

[Question] How do natural sciences prove causation?

Kongo Landwalker 28 Jun 2024 11:58 UTC

1 point

3 comments 1 min read LW link

LessWrong/ACX meetup Transilvanya tour—Sibiu

Marius Adrian Nicoară 28 Jun 2024 11:41 UTC

1 point

1 comment 1 min read LW link

Bayes’ Theorem: In Search of Gold (Lesson 1)

bayesyatina 28 Jun 2024 8:39 UTC

3 points

0 comments 3 min read LW link

How a chip is designed

YM 28 Jun 2024 8:04 UTC

65 points

4 comments 5 min read LW link

The Wisdom of Living for 200 Years

Martin Sustrik 28 Jun 2024 4:44 UTC

25 points

3 comments 4 min read LW link

A Generally Intelligent Game

snerx 28 Jun 2024 1:31 UTC

−1 points

1 comment 4 min read LW link

Corrigibility = Tool-ness?

johnswentworth and David Lorell

28 Jun 2024 1:19 UTC

85 points

8 comments 9 min read LW link

Situational Awareness

PeterMcCluskey 28 Jun 2024 1:08 UTC

11 points

0 comments 12 min read LW link

(bayesianinvestor.com)

Toward a taxonomy of cognitive benchmarks for agentic AGIs

Ben Smith 27 Jun 2024 23:50 UTC

15 points

0 comments 5 min read LW link

How Big a Deal are MatMul-Free Transformers?

JustisMills 27 Jun 2024 22:28 UTC

19 points

6 comments 5 min read LW link

(justismills.substack.com)

Secondary forces of debt

KatjaGrace 27 Jun 2024 21:10 UTC

81 points

21 comments 2 min read LW link

(worldspiritsockpuppet.com)