How Can Average People Contribute to AI Safety?

Stephen McAleese Mar 6, 2025, 10:50 PM
16 points
4 comments 8 min read LW link

Anthropic’s Recommendations to OSTP for the U.S. AI Action Plan

UnofficialLinkpostBot Mar 6, 2025, 10:38 PM
11 points
2 comments 2 min read LW link
(www.anthropic.com)

Lots of brief thoughts on Software Engineering

Yair Halberstadt Mar 6, 2025, 7:50 PM
47 points
17 comments 10 min read LW link

What the Headlines Miss About the Latest Decision in the Musk vs. OpenAI Lawsuit

garrison Mar 6, 2025, 7:49 PM
98 points
0 comments LW link
(garrisonlovely.substack.com)

The optimizer won’t just guess your intended semantics

Thomas Kehrenberg Mar 6, 2025, 7:42 PM
20 points
1 comment 6 min read LW link

AISN #49: Superintelligence Strategy

Mar 6, 2025, 5:46 PM
6 points
1 comment 5 min read LW link
(newsletter.safe.ai)

Decision-Relevance of worlds and ADT implementations

Maxime Riché Mar 6, 2025, 4:57 PM
9 points
0 comments 15 min read LW link

AI #106: Not so Fast

Zvi Mar 6, 2025, 3:40 PM
34 points
5 comments 38 min read LW link
(thezvi.wordpress.com)

Can a finite physical device be Turing equivalent?

Noosphere89 Mar 6, 2025, 3:02 PM
0 points
10 comments 2 min read LW link
(lifeiscomputation.com)

We should start looking for scheming “in the wild”

Marius Hobbhahn Mar 6, 2025, 1:49 PM
89 points
4 comments 5 min read LW link

Bounded AI might be viable

Mar 6, 2025, 12:55 PM
24 points
4 comments 20 min read LW link

Publish your genomic data

samuelshadrach Mar 6, 2025, 12:39 PM
1 point
0 comments 1 min read LW link

Which meat to eat: CO₂ vs Animal suffering

B Jacobs Mar 6, 2025, 12:37 PM
2 points
2 comments 3 min read LW link
(bobjacobs.substack.com)

Musings on Scenario Forecasting and AI

Alvin Ånestrand Mar 6, 2025, 12:28 PM
10 points
0 comments 11 min read LW link
(forecastingaifutures.substack.com)

Minor interpretability exploration #2: Extending superposition to different activation functions

Rareș Baron Mar 6, 2025, 11:22 AM
1 point
0 comments 4 min read LW link

What is Lock-In?

alamerton Mar 6, 2025, 11:09 AM
5 points
0 comments 9 min read LW link

ASI Game Theory: The Cosmic Dark Forest Deterrent

tavurth Mar 6, 2025, 10:28 AM
1 point
4 comments 1 min read LW link

The Hidden Cost of Our Lies to AI

Nicholas Andresen Mar 6, 2025, 5:03 AM
144 points
18 comments 7 min read LW link
(substack.com)

Camps Should List Bands

jefftk Mar 6, 2025, 3:00 AM
7 points
0 comments 1 min read LW link
(www.jefftk.com)

Give Neo a Chance

ank Mar 6, 2025, 1:48 AM
3 points
7 comments 7 min read LW link

[Question] Sparks of Original Thought?

Annapurna Mar 6, 2025, 12:53 AM
6 points
4 comments 1 min read LW link

Social Dilemmas — public goods, free riders, and exploitation

James Stephen Brown Mar 5, 2025, 11:31 PM
6 points
0 comments 3 min read LW link
(nonzerosum.games)

Introducing MASK: A Benchmark for Measuring Honesty in AI Systems

Mar 5, 2025, 10:56 PM
35 points
5 comments 2 min read LW link
(www.mask-benchmark.ai)

The Hardware-Software Framework: A New Perspective on Economic Growth with AI

Jakub Growiec Mar 5, 2025, 7:59 PM
3 points
0 comments 3 min read LW link

NY State Has a New Frontier Model Bill (+quick takes)

henryj Mar 5, 2025, 7:29 PM
9 points
0 comments 1 min read LW link
(www.henryjosephson.com)

The old memories tree

Yair Halberstadt Mar 5, 2025, 7:03 PM
7 points
1 comment 1 min read LW link

Reply to Vitalik on d/acc

samuelshadrach Mar 5, 2025, 6:55 PM
8 points
0 comments 3 min read LW link
(samuelshadrach.com)

A Bear Case: My Predictions Regarding AI Progress

Thane Ruthenis Mar 5, 2025, 4:41 PM
362 points
157 comments 9 min read LW link

On the Rationality of Deterring ASI

Dan H Mar 5, 2025, 4:11 PM
166 points
34 comments 4 min read LW link
(nationalsecurity.ai)

On OpenAI’s Safety and Alignment Philosophy

Zvi Mar 5, 2025, 2:00 PM
58 points
5 comments 17 min read LW link
(thezvi.wordpress.com)

The Alignment Imperative: Act Now or Lose Everything

racinkc1 Mar 5, 2025, 5:49 AM
−14 points
0 comments 1 min read LW link

Contra Dance Pay and Inflation

jefftk Mar 5, 2025, 2:40 AM
12 points
0 comments 2 min read LW link
(www.jefftk.com)

*NYT Op-Ed* The Government Knows A.G.I. Is Coming

worse Mar 5, 2025, 1:53 AM
11 points
12 comments 2 min read LW link
(www.nytimes.com)

Could this be an unusually good time to Earn To Give?

TomGardiner Mar 4, 2025, 9:51 PM
−1 points
0 comments 3 min read LW link
(forum.effectivealtruism.org)

What is the best / most proper definition of “Feeling the AGI” there is?

Annapurna Mar 4, 2025, 8:13 PM
8 points
5 comments 1 min read LW link

Energy Markets Temporal Arbitrage with Batteries

NickyP Mar 4, 2025, 5:37 PM
21 points
3 comments 16 min read LW link

Distillation of Meta’s Large Concept Models Paper

NickyP Mar 4, 2025, 5:33 PM
19 points
3 comments 4 min read LW link

Top AI safety newsletters, books, podcasts, etc – new AISafety.com resource

Mar 4, 2025, 5:01 PM
32 points
2 comments 1 min read LW link

2028 Should Not Be AI Safety’s First Foray Into Politics

Jesse Richardson Mar 4, 2025, 4:46 PM
5 points
0 comments 2 min read LW link

[Question] How Much Are LLMs Actually Boosting Real-World Programmer Productivity?

Thane Ruthenis Mar 4, 2025, 4:23 PM
137 points
52 comments 3 min read LW link

Validating against a misalignment detector is very different to training against one

mattmacdermott Mar 4, 2025, 3:41 PM
33 points
4 comments 4 min read LW link

For scheming, we should first focus on detection and then on prevention

Marius Hobbhahn Mar 4, 2025, 3:22 PM
47 points
7 comments 5 min read LW link

Progress links and short notes, 2025-03-03

jasoncrawford Mar 4, 2025, 3:20 PM
8 points
0 comments 6 min read LW link
(newsletter.rootsofprogress.org)

Formation Research: Organisation Overview

alamerton Mar 4, 2025, 3:03 PM
5 points
0 comments 11 min read LW link

On Writing #1

Zvi Mar 4, 2025, 1:30 PM
37 points
2 comments 15 min read LW link
(thezvi.wordpress.com)

The Semi-Rational Militar Firefighter

P. João Mar 4, 2025, 12:23 PM
72 points
10 comments 2 min read LW link

Observations About LLM Inference Pricing

Aaron_Scher Mar 4, 2025, 3:03 AM
28 points
2 comments 9 min read LW link
(techgov.intelligence.org)

[Question] How much should I worry about the Atlanta Fed’s GDP estimates?

Brendan Long Mar 4, 2025, 2:03 AM
16 points
2 comments 1 min read LW link

[Question] shouldn’t we try to get media attention?

KvmanThinking Mar 4, 2025, 1:39 AM
6 points
1 comment 1 min read LW link

The Milton Friedman Model of Policy Change

JohnofCharleston Mar 4, 2025, 12:38 AM
136 points
17 comments 4 min read LW link