Making up statistics to establish priority on Land Value Tax vs Earned Income Tax Credit vs Social Media Dynamic Regulation

Canucklug

14 Jan 2024 23:57 UTC

−5 points

2 comments · 7 min read · LW link

Is the universe all there is? ‘Evidence’ for objects outside the universe...

JonathanHall

14 Jan 2024 23:56 UTC

−4 points

27 comments · 11 min read · LW link

[Question] What is the minimum amount of time travel and resources needed to secure the future?

Perhaps

14 Jan 2024 22:01 UTC

−3 points

5 comments · 1 min read · LW link

Gothenburg LW / ACX meetup

Stefan

14 Jan 2024 21:21 UTC

1 point

0 comments · 1 min read · LW link

Gothenburg LW / ACX meetup

Stefan

14 Jan 2024 21:20 UTC

1 point

1 comment · 1 min read · LW link

D&D.Sci Hypersphere Analysis Part 2: Nonlinear Effects & Interactions

aphyer

14 Jan 2024 19:59 UTC

24 points

0 comments · 7 min read · LW link

Gender Exploration

sapphire

14 Jan 2024 18:57 UTC

122 points

27 comments · 5 min read · LW link · 1 review

(open.substack.com)

List of projects that seem impactful for AI Governance

JaimeRV and Teun van der Weij

14 Jan 2024 16:53 UTC

14 points

0 comments · 13 min read · LW link

The Leeroy Jenkins principle: How faulty AI could guarantee “warning shots”

titotal

14 Jan 2024 15:03 UTC

48 points

6 comments · 21 min read · LW link

(titotal.substack.com)

Notice When People Are Directionally Correct

Chris_Leong

14 Jan 2024 14:12 UTC

157 points

15 comments · 2 min read · LW link

Corrosive Mnemonics

Epirito

14 Jan 2024 12:44 UTC

7 points

0 comments · 2 min read · LW link

Against most, but not all, AI risk analogies

Matthew Barnett

14 Jan 2024 3:36 UTC

63 points

41 comments · 7 min read · LW link

Vote With Your Face

jefftk

14 Jan 2024 3:30 UTC

11 points

0 comments · 1 min read · LW link

(www.jefftk.com)

Case Studies in Reverse-Engineering Sparse Autoencoder Features by Using MLP Linearization

Jacob Dunefsky, Philippe Chlenski, Senthooran Rajamanoharan and Neel Nanda

14 Jan 2024 2:06 UTC

24 points

0 comments · 42 min read · LW link

D&D.Sci Hypersphere Analysis Part 1: Datafields & Preliminary Analysis

aphyer

13 Jan 2024 20:16 UTC

29 points

1 comment · 5 min read · LW link

Some additional SAE thoughts

Hoagy

13 Jan 2024 19:31 UTC

31 points

4 comments · 13 min read · LW link

AI #47: Meet the New Year

Zvi

13 Jan 2024 16:20 UTC

36 points

7 comments · 57 min read · LW link

(thezvi.wordpress.com)

Takeaways from the NeurIPS 2023 Trojan Detection Competition

mikes

13 Jan 2024 12:35 UTC

20 points

2 comments · 1 min read · LW link

(confirmlabs.org)

[Question] Why do so many think deception in AI is important?

Prometheus

13 Jan 2024 8:14 UTC

24 points

12 comments · 1 min read · LW link

Eliminating Cookie Banners is Hard

jefftk

13 Jan 2024 3:00 UTC

23 points

15 comments · 3 min read · LW link

(www.jefftk.com)

Introducing Alignment Stress-Testing at Anthropic

evhub

12 Jan 2024 23:51 UTC

182 points

23 comments · 2 min read · LW link

D&D.Sci(-fi): Colonizing the SuperHyperSphere

abstractapplic

12 Jan 2024 23:36 UTC

48 points

23 comments · 2 min read · LW link

Commonwealth Fusion Systems is the Same Scale as OpenAI

Jeffrey Heninger

12 Jan 2024 21:43 UTC

22 points

13 comments · 2 min read · LW link

Throughput vs. Latency

alkjash and Ruby

12 Jan 2024 21:37 UTC

31 points

2 comments · 13 min read · LW link

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

evhub, Carson Denison, Meg, Monte M, David Duvenaud, Nicholas Schiefer and Ethan Perez

12 Jan 2024 19:51 UTC

310 points

95 comments · 3 min read · LW link

(arxiv.org)

METAPHILOSOPHY—A Philosophizing through logical consequences

Seremonia

12 Jan 2024 18:47 UTC

−7 points

7 comments · 1 min read · LW link

Idealism, Realistic & Pragmatic

Seremonia

12 Jan 2024 18:16 UTC

−7 points

3 comments · 1 min read · LW link

The existential threat of humans.

Spiritus Dei

12 Jan 2024 17:50 UTC

−24 points

0 comments · 3 min read · LW link

[Question] Concrete examples of doing agentic things?

Jacob G-W

12 Jan 2024 15:59 UTC

13 points

10 comments · 1 min read · LW link

Land Reclamation is in the 9th Circle of Stagnation Hell

Maxwell Tabarrok

12 Jan 2024 13:36 UTC

54 points

6 comments · 2 min read · LW link

(maximumprogress.substack.com)

What good is G-factor if you’re dumped in the woods? A field report from a camp counselor.

Hastings

12 Jan 2024 13:17 UTC

156 points

24 comments · 1 min read · LW link

A Chinese Room Containing a Stack of Stochastic Parrots

RogerDearnaley

12 Jan 2024 6:29 UTC

21 points

4 comments · 5 min read · LW link · 1 review

Decent plan prize announcement (1 paragraph, $1k)

lemonhope

12 Jan 2024 6:27 UTC

25 points

19 comments · 1 min read · LW link

introduction to solid oxide electrolytes

bhauth

12 Jan 2024 5:35 UTC

17 points

0 comments · 4 min read · LW link

(www.bhauth.com)

Apply to the 2024 PIBBSS Summer Research Fellowship

Nora_Ammann, DusanDNesic and Lucas Teixeira

12 Jan 2024 4:06 UTC

39 points

1 comment · 2 min read · LW link

A Benchmark for Decision Theories

StrivingForLegibility

11 Jan 2024 18:54 UTC

15 points

0 comments · 2 min read · LW link

An even deeper atheism

Joe Carlsmith

11 Jan 2024 17:28 UTC

125 points

48 comments · 15 min read · LW link · 1 review

Motivating Alignment of LLM-Powered Agents: Easy for AGI, Hard for ASI?

RogerDearnaley

11 Jan 2024 12:56 UTC

37 points

4 comments · 39 min read · LW link

Reprograming the Mind: Meditation as a Tool for Cognitive Optimization

Jonas Hallgren

11 Jan 2024 12:03 UTC

34 points

3 comments · 11 min read · LW link

AI-Generated Music for Learning

nomagicpill

11 Jan 2024 4:11 UTC

9 points

1 comment · 1 min read · LW link

(210ethan.github.io)

Introduce a Speed Maximum

jefftk

11 Jan 2024 2:50 UTC

42 points

28 comments · 2 min read · LW link

(www.jefftk.com)

[Question] Prediction markets are consistently underconfident. Why?

Sinclair Chen

11 Jan 2024 2:44 UTC

11 points

4 comments · 1 min read · LW link

Trying to align humans with inclusive genetic fitness

peterbarnett

11 Jan 2024 0:13 UTC

23 points

5 comments · 10 min read · LW link

Universal Love Integration Test: Hitler

Raemon

10 Jan 2024 23:55 UTC

77 points

65 comments · 9 min read · LW link

The Perceptron Controversy

Yuxi_Liu

10 Jan 2024 23:07 UTC

65 points

18 comments · 1 min read · LW link

(yuxi-liu-wired.github.io)

The Aspiring Rationalist Congregation

maia

10 Jan 2024 22:52 UTC

91 points

25 comments · 10 min read · LW link

An Actually Intuitive Explanation of the Oberth Effect

Isaac King

10 Jan 2024 20:23 UTC

62 points

37 comments · 6 min read · LW link

Beware the suboptimal routine

jwfiredragon

10 Jan 2024 19:02 UTC

13 points

3 comments · 3 min read · LW link

The true cost of fences

pleiotroth

10 Jan 2024 19:01 UTC

3 points

2 comments · 4 min read · LW link

“Dark Constitution” for constraining some superintelligences

Valentine

10 Jan 2024 16:02 UTC

3 points

9 comments · 1 min read · LW link

(www.anarchonomicon.com)