OpenAI Alums, Nobel Laureates Urge Regulators to Save Company’s Nonprofit Structure

garrison · 23 Apr 2025 23:01 UTC
66 points
0 comments · 8 min read · LW link
(garrisonlovely.substack.com)

What AI safety plans are there?

MichaelDickens · 23 Apr 2025 22:58 UTC
16 points
3 comments · 1 min read · LW link

o3 Is a Lying Liar

Zvi · 23 Apr 2025 20:00 UTC
84 points
26 comments · 9 min read · LW link
(thezvi.wordpress.com)

Putting up Bumpers

Sam Bowman · 23 Apr 2025 16:05 UTC
54 points
14 comments · 2 min read · LW link

The AI Belief-Consistency Letter

Knight Lee · 23 Apr 2025 12:01 UTC
−6 points
15 comments · 4 min read · LW link

Jaan Tallinn’s 2024 Philanthropy Overview

jaan · 23 Apr 2025 11:06 UTC
227 points
8 comments · 1 min read · LW link
(jaan.info)

[Question] Are we “being poisoned”?

Tigerlily · 23 Apr 2025 5:11 UTC
16 points
2 comments · 2 min read · LW link

To Understand History, Keep Former Population Distributions In Mind

Arjun Panickssery · 23 Apr 2025 4:51 UTC
240 points
13 comments · 2 min read · LW link
(arjunpanickssery.substack.com)

Fish and Faces

Eggs · 23 Apr 2025 3:35 UTC
8 points
6 comments · 2 min read · LW link

Is alignment reducible to becoming more coherent?

Cole Wyeth · 22 Apr 2025 23:47 UTC
19 points
0 comments · 3 min read · LW link

The EU Is Asking for Feedback on Frontier AI Regulation (Open to Global Experts)—This Post Breaks Down What’s at Stake for AI Safety

Katalina Hernandez · 22 Apr 2025 20:39 UTC
62 points
13 comments · 9 min read · LW link

Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games

22 Apr 2025 19:25 UTC
24 points
3 comments · 5 min read · LW link

Alignment from equivariance II—language equivariance as a way of figuring out what an AI “means”

hamishtodd1 · 22 Apr 2025 19:04 UTC
5 points
0 comments · 3 min read · LW link

There is no Red Line

Tachikoma · 22 Apr 2025 18:28 UTC
−13 points
1 comment · 3 min read · LW link

Manifund 2025 Regrants

Austin Chen · 22 Apr 2025 17:36 UTC
21 points
0 comments · 5 min read · LW link
(manifund.substack.com)

AISN #52: An Expert Virology Benchmark

22 Apr 2025 17:08 UTC
6 points
0 comments · 4 min read · LW link
(newsletter.safe.ai)

Intuition in AI

Priyanka Bharadwaj · 22 Apr 2025 15:15 UTC
−1 points
2 comments · 2 min read · LW link

Problems with Bayesianism: A Socratic Dialogue

B Jacobs · 22 Apr 2025 14:09 UTC
3 points
1 comment · 14 min read · LW link
(bobjacobs.substack.com)

Societal and technological progress as sewing an ever-growing, ever-changing, patchy, and polychrome quilt

22 Apr 2025 13:21 UTC
47 points
24 comments · 25 min read · LW link

You Better Mechanize

Zvi · 22 Apr 2025 13:10 UTC
76 points
6 comments · 20 min read · LW link
(thezvi.wordpress.com)

Experimental testing: can I treat myself as a random sample?

avturchin · 22 Apr 2025 12:34 UTC
9 points
41 comments · 4 min read · LW link

Family-line selection optimizer

lemonhope · 22 Apr 2025 7:16 UTC
2 points
0 comments · 1 min read · LW link

Accountability Sinks

Martin Sustrik · 22 Apr 2025 5:00 UTC
440 points
57 comments · 15 min read · LW link
(250bpm.substack.com)

Most AI value will come from broad automation, not from R&D

Matthew Barnett · 22 Apr 2025 3:22 UTC
10 points
6 comments · 2 min read · LW link
(epoch.ai)

Estimat (8 Latent Values)

P. João · 22 Apr 2025 2:42 UTC
4 points
0 comments · 3 min read · LW link

A Letter to His Highness Louis XV, the King of France

testingthewaters · 22 Apr 2025 0:51 UTC
2 points
0 comments · 1 min read · LW link
(aclevername.substack.com)

10 Principles for Real Alignment

Adriaan · 21 Apr 2025 22:18 UTC
−7 points
0 comments · 7 min read · LW link

AE Studio is hiring!

Trent Hodgeson · 21 Apr 2025 20:35 UTC
20 points
2 comments · 2 min read · LW link

$500 Bounty Problem: Are (Approximately) Deterministic Natural Latents All You Need?

21 Apr 2025 20:19 UTC
92 points
24 comments · 3 min read · LW link

More Than Just A, T, C, and G: Screening for Hidden Dangers in DNA Sequences

sgd · 21 Apr 2025 20:12 UTC
1 point
0 comments · 11 min read · LW link

The US Executive vs Supreme Court Deportations Clash

NunoSempere · 21 Apr 2025 19:56 UTC
44 points
12 comments · 7 min read · LW link
(blog.sentinel-team.org)

Podcast on “AI tools for existential security” — transcript

21 Apr 2025 19:26 UTC
11 points
0 comments · 43 min read · LW link
(pnc.st)

Implications for the likelihood of human extinction from the recent discovery of possible microbial life

Mvolz · 21 Apr 2025 19:15 UTC
1 point
2 comments · 1 min read · LW link

Key event tracker for AI2027

MarkelKori · 21 Apr 2025 19:02 UTC
1 point
0 comments · 1 min read · LW link

Load Bearing Magic

winstonBosan · 21 Apr 2025 18:53 UTC
8 points
2 comments · 3 min read · LW link

The Uses of Complacency

sarahconstantin · 21 Apr 2025 18:50 UTC
88 points
5 comments · 8 min read · LW link
(sarahconstantin.substack.com)

Feature-Based Analysis of Safety-Relevant Multi-Agent Behavior

21 Apr 2025 18:12 UTC
10 points
0 comments · 5 min read · LW link

Crime and Punishment #1

Zvi · 21 Apr 2025 15:30 UTC
39 points
10 comments · 39 min read · LW link
(thezvi.wordpress.com)

Improving CNNs with Klein Networks: A Topological Approach to AI

Gunnar Carlsson · 21 Apr 2025 15:21 UTC
18 points
4 comments · 5 min read · LW link

Eulogy to the Obits

21 Apr 2025 14:10 UTC
5 points
1 comment · 10 min read · LW link

Research Notes: Running Claude 3.7, Gemini 2.5 Pro, and o3 on Pokémon Red

Julian Bradshaw · 21 Apr 2025 3:52 UTC
123 points
20 comments · 14 min read · LW link

Not All Beliefs Are Created Equal: Diagnosing Toxic Ideologies

Big_friendly_kiwi · 21 Apr 2025 3:18 UTC
23 points
7 comments · 9 min read · LW link

AI 2027 is a Bet Against Amdahl’s Law

snewman · 21 Apr 2025 3:09 UTC
126 points
56 comments · 9 min read · LW link

Severance and the Ethics of the Conscious Agents

Crissman · 21 Apr 2025 2:21 UTC
4 points
0 comments · 1 min read · LW link

March-April 2025 Progress in Guaran­teed Safe AI

Quinn · 20 Apr 2025 19:00 UTC
6 points
0 comments · 4 min read · LW link
(gsai.substack.com)

How to end credentialism

Yair Halberstadt · 20 Apr 2025 18:50 UTC
13 points
15 comments · 8 min read · LW link

Spending on Ourselves

jefftk · 20 Apr 2025 18:40 UTC
23 points
0 comments · 3 min read · LW link
(www.jefftk.com)

Interesting ACX 2024 Book Review Entries

jenn · 20 Apr 2025 18:10 UTC
24 points
1 comment · 4 min read · LW link

[Question] To what ethics is an AGI actually safely alignable?

StanislavKrym · 20 Apr 2025 17:09 UTC
1 point
6 comments · 4 min read · LW link

Evaluating Oversight Robustness with Incentivized Reward Hacking

20 Apr 2025 16:53 UTC
7 points
2 comments · 15 min read · LW link