When AI 10x’s AI R&D, What Do We Do?

Logan Riggs 21 Dec 2024 23:56 UTC

72 points

17 comments 4 min read LW link

AI as systems, not just models

Andy Arditi 21 Dec 2024 23:19 UTC

29 points

0 comments 7 min read LW link

(andyrdt.com)

Towards a Unified Interpretability of Artificial and Biological Neural Networks

Jan Bauer 21 Dec 2024 23:10 UTC

2 points

0 comments 1 min read LW link

Robbin’s Farm Sledding Route

jefftk 21 Dec 2024 22:10 UTC

13 points

1 comment 1 min read LW link

(www.jefftk.com)

AGI with RL is Bad News for Safety

Nadav Brandes 21 Dec 2024 19:36 UTC

19 points

22 comments 2 min read LW link

Better difference-making views

MichaelStJules 21 Dec 2024 18:27 UTC

9 points

0 comments 14 min read LW link

Review: Good Strategy, Bad Strategy

L Rudolf L 21 Dec 2024 17:17 UTC

44 points

0 comments 23 min read LW link

(nosetgauge.substack.com)

Last Line of Defense: Minimum Viable Shelters for Mirror Bacteria

Ulrik Horn 21 Dec 2024 8:28 UTC

16 points

26 comments 21 min read LW link

Elon Musk and Solar Futurism

transhumanist_atom_understander 21 Dec 2024 2:55 UTC

33 points

27 comments 5 min read LW link

Good Reasons for Alts

jefftk 21 Dec 2024 1:30 UTC

24 points

2 comments 1 min read LW link

(www.jefftk.com)

Updating on Bad Arguments

Guive 21 Dec 2024 1:19 UTC

11 points

2 comments 2 min read LW link

(guive.substack.com)

Bird’s eye view: An interactive representation to see large collection of text “from above”.

Alexandre Variengien 21 Dec 2024 0:15 UTC

12 points

4 comments 5 min read LW link

(alexandrevariengien.com)

The nihilism of NeurIPS

charlieoneill 20 Dec 2024 23:58 UTC

108 points

6 comments 4 min read LW link

Forecast 2025 With Vox’s Future Perfect Team — $2,500 Prize Pool

ChristianWilliams 20 Dec 2024 23:00 UTC

19 points

0 comments 1 min read LW link

(www.metaculus.com)

[Question] How do we quantify non-philanthropic contributions from Buffet and Soros?

Philosophistry 20 Dec 2024 22:50 UTC

3 points

0 comments 1 min read LW link

Anthropic leadership conversation

Zach Stein-Perlman 20 Dec 2024 22:00 UTC

69 points

17 comments 6 min read LW link

(www.youtube.com)

As We May Align

Gilbert C 20 Dec 2024 19:02 UTC

−1 points

0 comments 6 min read LW link

o3 is not being released to the public. First they are only giving access to external safety testers. You can apply to get early access to do safety testing

KatWoods 20 Dec 2024 18:30 UTC

16 points

0 comments 1 min read LW link

(openai.com)

o3

Zach Stein-Perlman 20 Dec 2024 18:30 UTC

154 points

164 comments 1 min read LW link

What Goes Without Saying

sarahconstantin 20 Dec 2024 18:00 UTC

356 points

29 comments 5 min read LW link 1 review

(sarahconstantin.substack.com)

Retrospective: PIBBSS Fellowship 2024

DusanDNesic, clem_acs and Lucas Teixeira

20 Dec 2024 15:55 UTC

64 points

1 comment 4 min read LW link

Compositionality and Ambiguity: Latent Co-occurrence and Interpretable Subspaces

Matthew A. Clarke, hrdkbhatnagar and Joseph Bloom

20 Dec 2024 15:16 UTC

36 points

0 comments 37 min read LW link

🇫🇷 Announcing CeSIA: The French Center for AI Safety

Charbel-Raphaël 20 Dec 2024 14:17 UTC

102 points

2 comments 8 min read LW link

Moderately Skeptical of “Risks of Mirror Biology”

Davidmanheim 20 Dec 2024 12:57 UTC

31 points

3 comments 9 min read LW link

(substack.com)

Doing Sport Reliably via Dancing

Johannes C. Mayer 20 Dec 2024 12:06 UTC

16 points

0 comments 2 min read LW link

You can validly be seen and validated by a chatbot

Kaj_Sotala 20 Dec 2024 12:00 UTC

30 points

3 comments 8 min read LW link

(kajsotala.fi)

What I expected from this site: A LessWrong review

Nathan Young 20 Dec 2024 11:27 UTC

31 points

5 comments 3 min read LW link

(nathanpmyoung.substack.com)

Algophobes and Algoverses: The New Enemies of Progress

Wenitte Apiou 20 Dec 2024 10:01 UTC

−24 points

0 comments 2 min read LW link

“Alignment Faking” frame is somewhat fake

Jan_Kulveit 20 Dec 2024 9:51 UTC

169 points

16 comments 6 min read LW link 1 review

No Internally-Crispy Mac and Cheese

jefftk 20 Dec 2024 3:20 UTC

12 points

5 comments 1 min read LW link

(www.jefftk.com)

Apply to be a TA for TARA

yanni kyriacos 20 Dec 2024 2:25 UTC

10 points

0 comments 1 min read LW link

Announcing the Q1 2025 Long-Term Future Fund grant round

Linch, habryka and calebp99

20 Dec 2024 2:20 UTC

36 points

2 comments 2 min read LW link

(forum.effectivealtruism.org)

Reminder: AI Safety is Also a Behavioral Economics Problem

zoop 20 Dec 2024 1:40 UTC

2 points

0 comments 1 min read LW link

Replaceable Axioms give more credence than irreplaceable axioms

Yoav Ravid 20 Dec 2024 0:51 UTC

14 points

9 comments 2 min read LW link 1 review

Mid-Generation Self-Correction: A Simple Tool for Safer AI

MrThink 19 Dec 2024 23:41 UTC

13 points

0 comments 1 min read LW link

Apply now to SPAR!

agucova 19 Dec 2024 22:29 UTC

11 points

0 comments 1 min read LW link

How to replicate and extend our alignment faking demo

Fabien Roger 19 Dec 2024 21:44 UTC

114 points

5 comments 2 min read LW link

(alignment.anthropic.com)

The Genesis Project

mannatvjain 19 Dec 2024 21:26 UTC

15 points

0 comments 1 min read LW link

(genesis-embodied-ai.github.io)

Measuring whether AIs can statelessly strategize to subvert security measures

Alex Mallen and Buck

19 Dec 2024 21:25 UTC

65 points

0 comments 11 min read LW link

Claude’s Constitutional Consequentialism?

1a3orn 19 Dec 2024 19:53 UTC

44 points

6 comments 6 min read LW link

A short critique of Omohundro’s “Basic AI Drives”

Soumyadeep Bose 19 Dec 2024 19:19 UTC

6 points

0 comments 4 min read LW link

When Is Insurance Worth It?

kqr 19 Dec 2024 19:07 UTC

182 points

73 comments 4 min read LW link 1 review

(entropicthoughts.com)

Launching Third Opinion: Anonymous Expert Consultation for AI Professionals

karl 19 Dec 2024 19:06 UTC

3 points

0 comments 5 min read LW link

Using LLM Search to Augment (Mathematics) Research

kaleb 19 Dec 2024 18:59 UTC

5 points

0 comments 6 min read LW link

A progress policy agenda

jasoncrawford 19 Dec 2024 18:42 UTC

31 points

1 comment 5 min read LW link

(newsletter.rootsofprogress.org)

building character isn’t about willpower or sacrifice

dhruvmethi 19 Dec 2024 18:17 UTC

1 point

0 comments 4 min read LW link

AISN #45: Center for AI Safety 2024 Year in Review

Corin Katzke and Dan H

19 Dec 2024 18:15 UTC

13 points

0 comments 4 min read LW link

(newsletter.safe.ai)

Learning Multi-Level Features with Matryoshka SAEs

Bart Bussmann, Patrick Leask and Neel Nanda

19 Dec 2024 15:59 UTC

46 points

6 comments 11 min read LW link

Simple Steganographic Computation Eval—gpt-4o and gemini-exp-1206 can’t solve it yet

Filip Sondej 19 Dec 2024 15:47 UTC

13 points

2 comments 3 min read LW link

AI #95: o1 Joins the API

Zvi 19 Dec 2024 15:10 UTC

58 points

1 comment 41 min read LW link

(thezvi.wordpress.com)