Can Reasoning Models Avoid the Most Forbidden Technique?

Brendan Long · 17 May 2025 23:26 UTC
9 points
9 comments · 3 min read · LW link
(www.brendanlong.com)

What OpenAI Told California’s Attorney General

garrison · 17 May 2025 23:14 UTC
108 points
3 comments · 8 min read · LW link
(www.obsolete.pub)

Multipolar AI is Underrated

Allison Duettmann · 17 May 2025 22:03 UTC
19 points
1 comment · 16 min read · LW link

[Question] Will we survive if AI solves engineering before deception?

Knight Lee · 17 May 2025 19:22 UTC
21 points
13 comments · 1 min read · LW link

Seven ways to Improve the Internal Model Principle

Alfred Harwood · 17 May 2025 16:38 UTC
15 points
0 comments · 13 min read · LW link

D&D.Sci: The Choosing Ones

abstractapplic · 17 May 2025 15:26 UTC
48 points
17 comments · 1 min read · LW link

The absent-minded variations

dr_s · 17 May 2025 6:57 UTC
24 points
13 comments · 9 min read · LW link

Book Review: The Art of Happiness

Screwtape · 17 May 2025 4:56 UTC
37 points
23 comments · 11 min read · LW link

Management is the Near Future

jefftk · 17 May 2025 2:50 UTC
53 points
10 comments · 2 min read · LW link
(www.jefftk.com)

Proof Section to an Introduction to Reinforcement Learning for Understanding Infra-Bayesianism

Brittany Gelb · 17 May 2025 2:36 UTC
3 points
0 comments · 9 min read · LW link

An Introduction to Reinforcement Learning for Understanding Infra-Bayesianism

Brittany Gelb · 17 May 2025 2:34 UTC
21 points
0 comments · 20 min read · LW link

Memory Decoding Journal Club: “Synaptic architecture of a memory engram in the mouse hippocampus.”

Devin Ward · 16 May 2025 23:55 UTC
3 points
0 comments · 1 min read · LW link

Social Anxiety Isn’t About Being Liked

Chris Lakin · 16 May 2025 22:26 UTC
145 points
21 comments · 2 min read · LW link
(chrislakin.blog)

Events: Debate & Fiction Project

abramdemski · 16 May 2025 21:51 UTC
39 points
1 comment · 1 min read · LW link

How Fast Can Algorithms Advance Capabilities? | Epoch Gradient Update

henryj · 16 May 2025 21:38 UTC
39 points
10 comments · 6 min read · LW link
(epoch.ai)

P-Values Know When You’re Cheating

Eggs · 16 May 2025 20:34 UTC
21 points
2 comments · 2 min read · LW link

Minds are magic

k64 · 16 May 2025 19:10 UTC
0 points
1 comment · 2 min read · LW link

US-China trade talks should pave way for AI safety treaty [SCMP crosspost]

otto.barten · 16 May 2025 16:55 UTC
10 points
0 comments · 3 min read · LW link

Direct Realism is probably false

TerriLeaf · 16 May 2025 16:36 UTC
−3 points
19 comments · 3 min read · LW link

Regarding South Africa

Zvi · 16 May 2025 16:10 UTC
71 points
5 comments · 11 min read · LW link
(thezvi.wordpress.com)

Notes on Consciousness

CSDD · 16 May 2025 14:17 UTC
3 points
3 comments · 1 min read · LW link

reflecting on criticism

Vadim Golub · 16 May 2025 11:59 UTC
4 points
5 comments · 10 min read · LW link

Generating the Funniest Joke with RL (according to GPT-4.1)

agg · 16 May 2025 5:09 UTC
103 points
22 comments · 4 min read · LW link

Interpretable Fine Tuning Research Update and Working Prototype

Matthew Khoriaty · 16 May 2025 3:44 UTC
10 points
0 comments · 4 min read · LW link

It Is Untenable That Near-Future AI Scenario Models Like “AI 2027” Don’t Include Open Source AI

Andrew Dickson · 16 May 2025 2:20 UTC
37 points
17 comments · 5 min read · LW link

Apply to Visiting Fellows at Constellation, due June 13

Ella Markianos · 16 May 2025 2:20 UTC
1 point
0 comments · 2 min read · LW link

Paranoid Debating

DresdenHeart · 16 May 2025 2:20 UTC
1 point
0 comments · 1 min read · LW link

Bay Area Summer Solstice

16 May 2025 0:20 UTC
20 points
0 comments · 1 min read · LW link

Staying in a Capsule Hotel

jefftk · 16 May 2025 0:20 UTC
25 points
2 comments · 1 min read · LW link
(www.jefftk.com)

Researching Synthetic Consciousness: sound appealing?

Brad Dunn · 15 May 2025 22:29 UTC
10 points
1 comment · 1 min read · LW link

Starting Over: What to tell Sarah, at the edge of professional oblivion.

Brad Dunn · 15 May 2025 21:34 UTC
11 points
1 comment · 20 min read · LW link

Tax-Optimized Risk in Portfolio Allocation

Brendan Long · 15 May 2025 18:53 UTC
6 points
0 comments · 1 min read · LW link
(www.brendanlong.com)

AI Safety Thursdays: Understanding The Self-Other Overlap Approach

Juliana Eberschlag · 15 May 2025 18:41 UTC
2 points
0 comments · 1 min read · LW link

Some skepticism about skepticism about efficacy of pausing AI

extinction-bounties · 15 May 2025 18:15 UTC
5 points
1 comment · 2 min read · LW link

time is event based

thiccythot · 15 May 2025 18:07 UTC
65 points
1 comment · 4 min read · LW link

Consider Others’ Cost Tolerances

nomagicpill · 15 May 2025 17:43 UTC
24 points
2 comments · 4 min read · LW link
(nomagicpill.github.io)

Problems with instruction-following as an alignment target

Seth Herd · 15 May 2025 15:41 UTC
51 points
14 comments · 10 min read · LW link

AI #116: If Anyone Builds It, Everyone Dies

Zvi · 15 May 2025 15:10 UTC
47 points
5 comments · 42 min read · LW link
(thezvi.wordpress.com)

Counter-considerations on AI arms races

15 May 2025 14:54 UTC
23 points
0 comments · 18 min read · LW link

AlphaEvolve

mannatvjain · 15 May 2025 14:14 UTC
29 points
0 comments · 5 min read · LW link
(deepmind.google)

From Comments on Accountability Sinks

Martin Sustrik · 15 May 2025 10:20 UTC
15 points
2 comments · 7 min read · LW link
(250bpm.substack.com)

What Does It Mean to “Write Like You Talk”?

Arjun Panickssery · 15 May 2025 9:49 UTC
68 points
8 comments · 5 min read · LW link
(arjunpanickssery.substack.com)

What if Agent-4 breaks out?

Alvin Ånestrand · 15 May 2025 9:15 UTC
12 points
0 comments · 6 min read · LW link

Memory Decoding Journal Club: Synaptic architecture of a memory engram in the mouse hippocampus

Devin Ward · 15 May 2025 4:14 UTC
1 point
0 comments · 1 min read · LW link

[Question] Why OpenAI projects only $174B of revenue by 2030?

becausecurious · 15 May 2025 2:50 UTC
28 points
6 comments · 1 min read · LW link

Elastomeric Fitting Session

jefftk · 15 May 2025 1:50 UTC
14 points
4 comments · 2 min read · LW link
(www.jefftk.com)

Re SMTM: negative feedback on negative feedback

Steven Byrnes · 14 May 2025 19:50 UTC
56 points
1 comment · 22 min read · LW link

Curate your space

Logan Kieller · 14 May 2025 19:35 UTC
23 points
0 comments · 3 min read · LW link
(agenticconjectures.substack.com)

Eliezer and I wrote a book: If Anyone Builds It, Everyone Dies

So8res · 14 May 2025 19:00 UTC
648 points
114 comments · 2 min read · LW link

Notes on Life

CSDD · 14 May 2025 18:46 UTC
−1 points
0 comments · 5 min read · LW link