Why I Transitioned: A Case Study

Fiora Starlight · 1 Nov 2025 22:58 UTC
324 points
80 comments · 10 min read

Economics and Transformative AI (by Tom Cunningham)

reallyeli · 1 Nov 2025 22:42 UTC
19 points
0 comments · 1 min read
(tecunningham.github.io)

Decision theory when you can’t make decisions

Nina Panickssery · 1 Nov 2025 22:36 UTC
11 points
25 comments · 7 min read
(blog.ninapanickssery.com)

You’re always stressed, your mind is always busy, you never have enough time

mingyuan · 1 Nov 2025 22:07 UTC
226 points
6 comments · 3 min read
(mingyuan.substack.com)

Re-rolling environment

Raemon · 1 Nov 2025 21:46 UTC
140 points
2 comments · 2 min read

Why Is Printing So Bad?

johnswentworth · 1 Nov 2025 21:37 UTC
50 points
25 comments · 2 min read

Some Good Meetups (2025 Q2)

jenn · 1 Nov 2025 18:28 UTC
11 points
0 comments · 6 min read

[Question] Shouldn’t taking over the world be easier than recursively self-improving, as an AI?

KvmanThinking · 1 Nov 2025 17:26 UTC
6 points
18 comments · 1 min read

ACX Atlanta November Meetup

Steve French · 1 Nov 2025 16:32 UTC
2 points
0 comments · 1 min read

Seattle Secular Solstice 2025 – Dec 20th

datawitch · 1 Nov 2025 16:03 UTC
10 points
0 comments · 2 min read

Fermi Paradox, Ethics and Astronomical waste

StanislavKrym · 1 Nov 2025 15:24 UTC
6 points
0 comments · 1 min read

LLM-generated text is not testimony

TsviBT · 1 Nov 2025 14:47 UTC
104 points
89 comments · 11 min read

Apply to the Cooperative AI PhD Fellowship by November 16th!

Lewis Hammond · 1 Nov 2025 12:15 UTC
7 points
0 comments · 1 min read

Vaccination against ASI

dscft · 1 Nov 2025 10:58 UTC
−21 points
3 comments · 1 min read

What’s So Good About Ender’s Game?

Joy2 · 1 Nov 2025 9:34 UTC
2 points
3 comments · 1 min read

Automated Circuit Interpretation via Probe Prompting

Giuseppe Birardi · 1 Nov 2025 7:57 UTC
18 points
0 comments · 27 min read

Strategy-Stealing Argument Against AI Dealmaking

Cleo Nardo · 1 Nov 2025 4:39 UTC
17 points
3 comments · 2 min read

Evidence on language model consciousness

dsj · 1 Nov 2025 4:01 UTC
19 points
0 comments · 2 min read
(thedavidsj.substack.com)

Asking AI What Writing Advice Paul Fussell Would Give

Taylor G. Lunt · 1 Nov 2025 3:37 UTC
7 points
2 comments · 8 min read

Freewriting in my head, and overcoming the “twinge of starting”

ParrotRobot · 1 Nov 2025 1:12 UTC
23 points
1 comment · 6 min read

2025 NYC Secular Solstice & East Coast Rationalist Megameetup

Screwtape · 1 Nov 2025 1:06 UTC
13 points
0 comments · 1 min read

Supervillain Monologues Are Unrealistic

Algon · 31 Oct 2025 23:58 UTC
82 points
18 comments · 2 min read

Secretly Loyal AIs: Threat Vectors and Mitigation Strategies

Dave Banerjee · 31 Oct 2025 23:31 UTC
8 points
0 comments · 19 min read
(substack.com)

Ink without haven

Dentosal · 31 Oct 2025 22:50 UTC
4 points
0 comments · 2 min read

Apply to the Cambridge ERA:AI Winter 2026 Fellowship

Kyle O’Brien · 31 Oct 2025 22:26 UTC
5 points
3 comments · 1 min read

FAQ: Expert Survey on Progress in AI methodology

KatjaGrace · 31 Oct 2025 16:51 UTC
14 points
0 comments · 19 min read
(blog.aiimpacts.org)

Social media feeds ‘misaligned’ when viewed through AI safety framework, show researchers

Mordechai Rorvig · 31 Oct 2025 16:40 UTC
13 points
3 comments · 1 min read
(www.foommagazine.org)

Crossword Halloween 2025: Manmade Horrors

jchan · 31 Oct 2025 16:19 UTC
7 points
0 comments · 1 min read

Debugging Despair ~> A bet about Satisfaction and Values

P. João · 31 Oct 2025 14:00 UTC
2 points
0 comments · 2 min read

Halfhaven Digest #3

Taylor G. Lunt · 31 Oct 2025 13:41 UTC
7 points
0 comments · 2 min read

OpenAI Moves To Complete Potentially The Largest Theft In Human History

Zvi · 31 Oct 2025 13:20 UTC
76 points
12 comments · 19 min read
(thezvi.wordpress.com)

A (bad) Definition of AGI

spookyuser · 31 Oct 2025 7:55 UTC
4 points
0 comments · 5 min read

Modelling, Measuring, and Intervening on Goal-directed Behaviour in AI Systems

31 Oct 2025 1:28 UTC
14 points
0 comments · 8 min read

Resampling Conserves Redundancy & Mediation (Approximately) Under the Jensen-Shannon Divergence

David Lorell · 31 Oct 2025 1:07 UTC
41 points
7 comments · 4 min read

Centralization begets stagnation

Algon · 30 Oct 2025 23:49 UTC
6 points
0 comments · 2 min read

Summary and Comments on Anthropic’s Pilot Sabotage Risk Report

GradientDissenter · 30 Oct 2025 20:19 UTC
29 points
0 comments · 5 min read

Critical Fallibilism and Theory of Constraints in One Analyzed Paragraph

Elliot Temple · 30 Oct 2025 20:06 UTC
2 points
0 comments · 28 min read

AI #140: Trying To Hold The Line

Zvi · 30 Oct 2025 18:30 UTC
26 points
1 comment · 52 min read
(thezvi.wordpress.com)

Anthropic’s Pilot Sabotage Risk Report

dmz · 30 Oct 2025 17:50 UTC
32 points
2 comments · 3 min read
(alignment.anthropic.com)

AISLE discovered three new OpenSSL vulnerabilities

Jan_Kulveit · 30 Oct 2025 16:32 UTC
64 points
7 comments · 1 min read
(aisle.com)

Sonnet 4.5’s eval gaming seriously undermines alignment evals, and this seems caused by training on alignment evals

30 Oct 2025 15:34 UTC
144 points
21 comments · 14 min read

Steering Evaluation-Aware Models to Act Like They Are Deployed

30 Oct 2025 15:03 UTC
61 points
12 comments · 18 min read

On The Conservation of Rights

Roman Maksimovich · 30 Oct 2025 13:48 UTC
−2 points
2 comments · 8 min read

When “HDMI-1” Lies To You

Gunnar_Zarncke · 30 Oct 2025 12:23 UTC
18 points
0 comments · 1 min read

[Question] Why there is still one instance of Eliezer Yudkowsky?

RomanS · 30 Oct 2025 12:00 UTC
−9 points
8 comments · 1 min read

Interview on the Hengshui Model High School

L.M.Sherlock · 30 Oct 2025 10:26 UTC
21 points
2 comments · 30 min read
(lmsherlock.substack.com)

Transcendental Argumentation and the Epistemics of Discourse

0xA · 30 Oct 2025 6:37 UTC
1 point
2 comments · 3 min read

Emergent Introspective Awareness in Large Language Models

Drake Thomas · 30 Oct 2025 4:42 UTC
130 points
19 comments · 1 min read
(transformer-circuits.pub)

Introducing Aeonisk: an Open Source Game and Dataset with Graded Outcome Tiers of Counterfactual Reasoning

threeriversainexus · 30 Oct 2025 3:02 UTC
1 point
0 comments · 4 min read

ImpossibleBench: Measuring Reward Hacking in LLM Coding Agents

Ziqian Zhong · 30 Oct 2025 2:52 UTC
60 points
5 comments · 3 min read
(arxiv.org)