D&D.Sci Hypersphere Analysis Part 1: Datafields & Preliminary Analysis

aphyer 13 Jan 2024 20:16 UTC

29 points

1 comment 5 min read LW link

Some additional SAE thoughts

Hoagy 13 Jan 2024 19:31 UTC

31 points

4 comments 13 min read LW link

AI #47: Meet the New Year

Zvi 13 Jan 2024 16:20 UTC

36 points

7 comments 57 min read LW link

(thezvi.wordpress.com)

Takeaways from the NeurIPS 2023 Trojan Detection Competition

mikes 13 Jan 2024 12:35 UTC

20 points

2 comments 1 min read LW link

(confirmlabs.org)

[Question] Why do so many think deception in AI is important?

Prometheus 13 Jan 2024 8:14 UTC

24 points

12 comments 1 min read LW link

Eliminating Cookie Banners is Hard

jefftk 13 Jan 2024 3:00 UTC

23 points

15 comments 3 min read LW link

(www.jefftk.com)

Introducing Alignment Stress-Testing at Anthropic

evhub 12 Jan 2024 23:51 UTC

182 points

23 comments 2 min read LW link

D&D.Sci(-fi): Colonizing the SuperHyperSphere

abstractapplic 12 Jan 2024 23:36 UTC

48 points

23 comments 2 min read LW link

Commonwealth Fusion Systems is the Same Scale as OpenAI

Jeffrey Heninger 12 Jan 2024 21:43 UTC

22 points

13 comments 2 min read LW link

Throughput vs. Latency

alkjash and Ruby

12 Jan 2024 21:37 UTC

31 points

2 comments 13 min read LW link

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

evhub, Carson Denison, Meg, Monte M, David Duvenaud, Nicholas Schiefer and Ethan Perez

12 Jan 2024 19:51 UTC

310 points

95 comments 3 min read LW link

(arxiv.org)

METAPHILOSOPHY—A Philosophizing through logical consequences

Seremonia 12 Jan 2024 18:47 UTC

−7 points

7 comments 1 min read LW link

Idealism, Realistic & Pragmatic

Seremonia 12 Jan 2024 18:16 UTC

−7 points

3 comments 1 min read LW link

The existential threat of humans.

Spiritus Dei 12 Jan 2024 17:50 UTC

−24 points

0 comments 3 min read LW link

[Question] Concrete examples of doing agentic things?

Jacob G-W 12 Jan 2024 15:59 UTC

13 points

10 comments 1 min read LW link

Land Reclamation is in the 9th Circle of Stagnation Hell

Maxwell Tabarrok 12 Jan 2024 13:36 UTC

54 points

6 comments 2 min read LW link

(maximumprogress.substack.com)

What good is G-factor if you’re dumped in the woods? A field report from a camp counselor.

Hastings 12 Jan 2024 13:17 UTC

156 points

24 comments 1 min read LW link

A Chinese Room Containing a Stack of Stochastic Parrots

RogerDearnaley 12 Jan 2024 6:29 UTC

20 points

4 comments 5 min read LW link 1 review

Decent plan prize announcement (1 paragraph, $1k)

lemonhope 12 Jan 2024 6:27 UTC

25 points

19 comments 1 min read LW link

introduction to solid oxide electrolytes

bhauth 12 Jan 2024 5:35 UTC

17 points

0 comments 4 min read LW link

(www.bhauth.com)

Apply to the 2024 PIBBSS Summer Research Fellowship

Nora_Ammann, DusanDNesic and Lucas Teixeira

12 Jan 2024 4:06 UTC

39 points

1 comment 2 min read LW link

A Benchmark for Decision Theories

StrivingForLegibility 11 Jan 2024 18:54 UTC

15 points

0 comments 2 min read LW link

An even deeper atheism

Joe Carlsmith 11 Jan 2024 17:28 UTC

125 points

48 comments 15 min read LW link 1 review

Motivating Alignment of LLM-Powered Agents: Easy for AGI, Hard for ASI?

RogerDearnaley 11 Jan 2024 12:56 UTC

36 points

4 comments 39 min read LW link

Reprograming the Mind: Meditation as a Tool for Cognitive Optimization

Jonas Hallgren 11 Jan 2024 12:03 UTC

34 points

3 comments 11 min read LW link

AI-Generated Music for Learning

nomagicpill 11 Jan 2024 4:11 UTC

9 points

1 comment 1 min read LW link

(210ethan.github.io)

Introduce a Speed Maximum

jefftk 11 Jan 2024 2:50 UTC

42 points

28 comments 2 min read LW link

(www.jefftk.com)

[Question] Prediction markets are consistently underconfident. Why?

Sinclair Chen 11 Jan 2024 2:44 UTC

11 points

4 comments 1 min read LW link

Trying to align humans with inclusive genetic fitness

peterbarnett 11 Jan 2024 0:13 UTC

23 points

5 comments 10 min read LW link

Universal Love Integration Test: Hitler

Raemon 10 Jan 2024 23:55 UTC

77 points

65 comments 9 min read LW link

The Perceptron Controversy

Yuxi_Liu 10 Jan 2024 23:07 UTC

65 points

18 comments 1 min read LW link

(yuxi-liu-wired.github.io)

The Aspiring Rationalist Congregation

maia 10 Jan 2024 22:52 UTC

91 points

23 comments 10 min read LW link

An Actually Intuitive Explanation of the Oberth Effect

Isaac King 10 Jan 2024 20:23 UTC

62 points

37 comments 6 min read LW link

Beware the suboptimal routine

jwfiredragon 10 Jan 2024 19:02 UTC

13 points

3 comments 3 min read LW link

The true cost of fences

pleiotroth 10 Jan 2024 19:01 UTC

3 points

2 comments 4 min read LW link

“Dark Constitution” for constraining some superintelligences

Valentine 10 Jan 2024 16:02 UTC

3 points

9 comments 1 min read LW link

(www.anarchonomicon.com)

[Question] rabbit (a new AI company) and Large Action Model (LAM)

MiguelDev 10 Jan 2024 13:57 UTC

17 points

3 comments 1 min read LW link

Saving the world sucks

Defective Altruism 10 Jan 2024 5:55 UTC

49 points

29 comments 3 min read LW link

[Question] Questions about Solomonoff induction

mukashi 10 Jan 2024 1:16 UTC

7 points

11 comments 1 min read LW link

AI as a natural disaster

Neil 10 Jan 2024 0:42 UTC

11 points

1 comment 7 min read LW link

Stop being surprised by the passage of time

duck_master and 00aleae

10 Jan 2024 0:36 UTC

−2 points

1 comment 3 min read LW link

A discussion of normative ethics

Gordon Seidoh Worley and Adam Zerner

9 Jan 2024 23:29 UTC

10 points

6 comments 25 min read LW link

On the Contrary, Steelmanning Is Normal; ITT-Passing Is Niche

Zack_M_Davis 9 Jan 2024 23:12 UTC

41 points

35 comments 4 min read LW link 3 reviews

[Question] What’s the protocol for if a novice has ML ideas that are unlikely to work, but might improve capabilities if they do work?

drocta 9 Jan 2024 22:51 UTC

6 points

2 comments 2 min read LW link

Goodbye, Shoggoth: The Stage, its Animatronics, & the Puppeteer – a New Metaphor

RogerDearnaley 9 Jan 2024 20:42 UTC

48 points

8 comments 36 min read LW link

Bent or Blunt Hoods?

jefftk 9 Jan 2024 20:10 UTC

23 points

0 comments 1 min read LW link

(www.jefftk.com)

2024 ACX Predictions: Blind/Buy/Sell/Hold

Zvi 9 Jan 2024 19:30 UTC

33 points

2 comments 31 min read LW link

(thezvi.wordpress.com)

Announcing the Double Crux Bot

sanyer, Sofia Vanhanen and sarah.bluhm

9 Jan 2024 18:54 UTC

53 points

11 comments 3 min read LW link

Does AI risk “other” the AIs?

Joe Carlsmith 9 Jan 2024 17:51 UTC

60 points

3 comments 8 min read LW link

AI demands unprecedented reliability

Jono 9 Jan 2024 16:30 UTC

22 points

5 comments 2 min read LW link