When AI 10x’s AI R&D, What Do We Do?

Logan Riggs · Dec 21, 2024, 11:56 PM
72 points
16 comments · 4 min read · LW link

AI as systems, not just models

Andy Arditi · Dec 21, 2024, 11:19 PM
28 points
0 comments · 7 min read · LW link
(andyrdt.com)

Towards a Unified Interpretability of Artificial and Biological Neural Networks

jan_bauer · Dec 21, 2024, 11:10 PM
2 points
0 comments · 1 min read · LW link

Robbin’s Farm Sledding Route

jefftk · Dec 21, 2024, 10:10 PM
13 points
1 comment · 1 min read · LW link
(www.jefftk.com)

AGI with RL is Bad News for Safety

Nadav Brandes · Dec 21, 2024, 7:36 PM
19 points
22 comments · 2 min read · LW link

Better difference-making views

MichaelStJules · Dec 21, 2024, 6:27 PM
7 points
0 comments · LW link

Review: Good Strategy, Bad Strategy

L Rudolf L · Dec 21, 2024, 5:17 PM
43 points
0 comments · 23 min read · LW link
(nosetgauge.substack.com)

Last Line of Defense: Minimum Viable Shelters for Mirror Bacteria

Ulrik Horn · Dec 21, 2024, 8:28 AM
12 points
26 comments · 21 min read · LW link

Elon Musk and Solar Futurism

transhumanist_atom_understander · Dec 21, 2024, 2:55 AM
32 points
27 comments · 5 min read · LW link

Good Reasons for Alts

jefftk · Dec 21, 2024, 1:30 AM
24 points
2 comments · 1 min read · LW link
(www.jefftk.com)

Updating on Bad Arguments

Guive · Dec 21, 2024, 1:19 AM
11 points
2 comments · 2 min read · LW link
(guive.substack.com)

Bird’s eye view: An interactive representation to see large collection of text “from above”.

Alexandre Variengien · Dec 21, 2024, 12:15 AM
10 points
4 comments · 5 min read · LW link
(alexandrevariengien.com)

The nihilism of NeurIPS

charlieoneill · Dec 20, 2024, 11:58 PM
107 points
6 comments · 4 min read · LW link

Forecast 2025 With Vox’s Future Perfect Team — $2,500 Prize Pool

ChristianWilliams · Dec 20, 2024, 11:00 PM
19 points
0 comments · LW link
(www.metaculus.com)

[Question] How do we quantify non-philanthropic contributions from Buffet and Soros?

Philosophistry · Dec 20, 2024, 10:50 PM
3 points
0 comments · 1 min read · LW link

Anthropic leadership conversation

Zach Stein-Perlman · Dec 20, 2024, 10:00 PM
67 points
17 comments · 6 min read · LW link
(www.youtube.com)

As We May Align

Gilbert C · Dec 20, 2024, 7:02 PM
−1 points
0 comments · 6 min read · LW link

o3 is not being released to the public. First they are only giving access to external safety testers. You can apply to get early access to do safety testing

KatWoods · Dec 20, 2024, 6:30 PM
16 points
0 comments · 1 min read · LW link
(openai.com)

o3

Zach Stein-Perlman · Dec 20, 2024, 6:30 PM
154 points
164 comments · 1 min read · LW link

What Goes Without Saying

sarahconstantin · Dec 20, 2024, 6:00 PM
334 points
28 comments · 5 min read · LW link
(sarahconstantin.substack.com)

Retrospective: PIBBSS Fellowship 2024

Dec 20, 2024, 3:55 PM
64 points
1 comment · 4 min read · LW link

Compositionality and Ambiguity: Latent Co-occurrence and Interpretable Subspaces

Dec 20, 2024, 3:16 PM
32 points
0 comments · 37 min read · LW link

🇫🇷 Announcing CeSIA: The French Center for AI Safety

Charbel-Raphaël · Dec 20, 2024, 2:17 PM
90 points
2 comments · 8 min read · LW link

Moderately Skeptical of “Risks of Mirror Biology”

Davidmanheim · Dec 20, 2024, 12:57 PM
31 points
3 comments · 9 min read · LW link
(substack.com)

Doing Sport Reliably via Dancing

Johannes C. Mayer · Dec 20, 2024, 12:06 PM
16 points
0 comments · 2 min read · LW link

You can validly be seen and validated by a chatbot

Kaj_Sotala · Dec 20, 2024, 12:00 PM
30 points
3 comments · 8 min read · LW link
(kajsotala.fi)

What I expected from this site: A LessWrong review

Nathan Young · Dec 20, 2024, 11:27 AM
31 points
5 comments · 3 min read · LW link
(nathanpmyoung.substack.com)

Algophobes and Algoverses: The New Enemies of Progress

Wenitte Apiou · Dec 20, 2024, 10:01 AM
−24 points
0 comments · 2 min read · LW link

“Alignment Faking” frame is somewhat fake

Jan_Kulveit · Dec 20, 2024, 9:51 AM
153 points
13 comments · 6 min read · LW link

No Internally-Crispy Mac and Cheese

jefftk · Dec 20, 2024, 3:20 AM
12 points
5 comments · 1 min read · LW link
(www.jefftk.com)

Apply to be a TA for TARA

yanni kyriacos · Dec 20, 2024, 2:25 AM
10 points
0 comments · 1 min read · LW link

Announcing the Q1 2025 Long-Term Future Fund grant round

Dec 20, 2024, 2:20 AM
36 points
2 comments · 2 min read · LW link
(forum.effectivealtruism.org)

Reminder: AI Safety is Also a Behavioral Economics Problem

zoop · Dec 20, 2024, 1:40 AM
2 points
0 comments · 1 min read · LW link

Replaceable Axioms give more credence than irreplaceable axioms

Yoav Ravid · Dec 20, 2024, 12:51 AM
6 points
2 comments · 2 min read · LW link

Mid-Generation Self-Correction: A Simple Tool for Safer AI

MrThink · Dec 19, 2024, 11:41 PM
13 points
0 comments · 1 min read · LW link

Apply now to SPAR!

agucova · Dec 19, 2024, 10:29 PM
11 points
0 comments · LW link

How to replicate and extend our alignment faking demo

Fabien Roger · Dec 19, 2024, 9:44 PM
114 points
5 comments · 2 min read · LW link
(alignment.anthropic.com)

The Genesis Project

aproteinengine · Dec 19, 2024, 9:26 PM
15 points
0 comments · 1 min read · LW link
(genesis-embodied-ai.github.io)

Measuring whether AIs can statelessly strategize to subvert security measures

Dec 19, 2024, 9:25 PM
62 points
0 comments · 11 min read · LW link

Claude’s Constitutional Consequentialism?

1a3orn · Dec 19, 2024, 7:53 PM
43 points
6 comments · 6 min read · LW link

A short critique of Omohundro’s “Basic AI Drives”

Soumyadeep Bose · Dec 19, 2024, 7:19 PM
6 points
0 comments · 4 min read · LW link

When Is Insurance Worth It?

kqr · Dec 19, 2024, 7:07 PM
175 points
71 comments · 4 min read · LW link
(entropicthoughts.com)

Launching Third Opinion: Anonymous Expert Consultation for AI Professionals

karl · Dec 19, 2024, 7:06 PM
3 points
0 comments · 5 min read · LW link

Using LLM Search to Augment (Mathematics) Research

kaleb · Dec 19, 2024, 6:59 PM
5 points
0 comments · 6 min read · LW link

A progress policy agenda

jasoncrawford · Dec 19, 2024, 6:42 PM
31 points
1 comment · 5 min read · LW link
(newsletter.rootsofprogress.org)

building character isn’t about willpower or sacrifice

dhruvmethi · Dec 19, 2024, 6:17 PM
1 point
0 comments · 4 min read · LW link

AISN #45: Center for AI Safety 2024 Year in Review

Dec 19, 2024, 6:15 PM
13 points
0 comments · 4 min read · LW link
(newsletter.safe.ai)

Learning Multi-Level Features with Matryoshka SAEs

Dec 19, 2024, 3:59 PM
42 points
6 comments · 11 min read · LW link

Simple Steganographic Computation Eval—gpt-4o and gemini-exp-1206 can’t solve it yet

Filip Sondej · Dec 19, 2024, 3:47 PM
13 points
2 comments · 3 min read · LW link

AI #95: o1 Joins the API

Zvi · Dec 19, 2024, 3:10 PM
58 points
1 comment · 41 min read · LW link
(thezvi.wordpress.com)