Anthropic is (probably) not meeting its RSP security commitments

habryka 18 Nov 2025 23:34 UTC
129 points
22 comments · 5 min read · LW link

Considerations for setting the FLOP thresholds in our example international AI agreement

18 Nov 2025 23:31 UTC
54 points
5 comments · 7 min read · LW link

Jailbreaking AI models to Phish Elderly Victims

18 Nov 2025 23:17 UTC
17 points
0 comments · 2 min read · LW link
(simonlermen.substack.com)

Beren’s Essay on Obedience and Alignment

StanislavKrym 18 Nov 2025 22:50 UTC
33 points
0 comments · 9 min read · LW link
(www.beren.io)

Towards A Unified Theory Of Alignment

kenneth myers 18 Nov 2025 22:03 UTC
4 points
3 comments · 4 min read · LW link

[Question] Why are FICO scores effective?

Hruss 18 Nov 2025 21:53 UTC
8 points
3 comments · 2 min read · LW link

Bologna December Meetup

Luca Petrolati 18 Nov 2025 20:19 UTC
3 points
0 comments · 1 min read · LW link

The Aura of a Dark Lord

Dentosal 18 Nov 2025 20:07 UTC
25 points
0 comments · 3 min read · LW link

Reading LLM chain of thought makes me more rational

Michael Steele 18 Nov 2025 19:53 UTC
1 point
0 comments · 1 min read · LW link

New Report: An International Agreement to Prevent the Premature Creation of Artificial Superintelligence

18 Nov 2025 19:09 UTC
223 points
23 comments · 3 min read · LW link

Sign language as a generally-useful means of communication (even if you have good hearing)

beyarkay (Boyd Kane) 18 Nov 2025 18:34 UTC
7 points
2 comments · 1 min read · LW link
(boydkane.com)

Victor Taelin’s notes on Gemini 3

Gunnar_Zarncke 18 Nov 2025 18:30 UTC
32 points
1 comment · 3 min read · LW link
(x.com)

On Writing #2

Zvi 18 Nov 2025 17:30 UTC
46 points
4 comments · 14 min read · LW link
(thezvi.wordpress.com)

GPT 5.1 Follows Custom Instructions and Glazes

Zvi 18 Nov 2025 17:30 UTC
28 points
1 comment · 20 min read · LW link
(thezvi.wordpress.com)

ARC progress update: Competing with sampling

18 Nov 2025 17:22 UTC
131 points
11 comments · 21 min read · LW link

Status Is The Game Of The Losers’ Bracket

johnswentworth 18 Nov 2025 17:08 UTC
94 points
48 comments · 4 min read · LW link

Kairos is the new home for the Global Challenges Project, and we’re hiring for a GCP Director

18 Nov 2025 13:54 UTC
6 points
0 comments · 1 min read · LW link

Reconstellation: construct a flywheel for personal change

teebarnett 18 Nov 2025 12:30 UTC
13 points
2 comments · 12 min read · LW link

The Illegible Chain-of-Thought Menagerie

Artem Karpov 18 Nov 2025 12:01 UTC
3 points
0 comments · 8 min read · LW link

A Call for Better Risk Modelling

18 Nov 2025 9:08 UTC
20 points
0 comments · 4 min read · LW link

Eat The Richtext

dreeves 18 Nov 2025 7:57 UTC
46 points
1 comment · 2 min read · LW link

Memories of a British Boarding School #1

Ben Pace 18 Nov 2025 7:57 UTC
36 points
0 comments · 5 min read · LW link

Preference Weighting and the Abilene Paradox

Screwtape 18 Nov 2025 7:56 UTC
29 points
1 comment · 8 min read · LW link

Don’t grow your org fast

Ruby 18 Nov 2025 7:47 UTC
19 points
2 comments · 9 min read · LW link

Continuity

abramdemski 18 Nov 2025 5:59 UTC
27 points
4 comments · 3 min read · LW link

How Colds Spread

RobertM 18 Nov 2025 5:25 UTC
248 points
32 comments · 10 min read · LW link

Aim for single piece flow

habryka 18 Nov 2025 5:22 UTC
123 points
21 comments · 5 min read · LW link

I store some memories spatially and I don’t know why

Alex_Altair 18 Nov 2025 2:54 UTC
11 points
3 comments · 2 min read · LW link
(namelessvirtue.com)

An Analogue Of Set Relationships For Distributions

18 Nov 2025 1:03 UTC
53 points
4 comments · 3 min read · LW link

No One Reads the Original Work

Algon 18 Nov 2025 0:00 UTC
51 points
10 comments · 2 min read · LW link

Middlemen Are Eating the World (And That’s Good, Actually)

Linch 17 Nov 2025 22:26 UTC
48 points
4 comments · 4 min read · LW link
(inchpin.substack.com)

[Question] Are there examples of communities where AI is making epistemics better now?

Ben Goldhaber 17 Nov 2025 21:47 UTC
18 points
0 comments · 2 min read · LW link

Generalisation Hacking: a first look at adversarial generalisation failures in deliberative alignment

17 Nov 2025 21:44 UTC
54 points
2 comments · 8 min read · LW link

Varieties Of Doom

jdp 17 Nov 2025 21:36 UTC
173 points
70 comments · 57 min read · LW link
(minihf.com)

Omniscience one bit at a time: Chapter 5

Dentosal 17 Nov 2025 21:31 UTC
9 points
1 comment · 2 min read · LW link

The Barriers to Your Unemployment

claywren 17 Nov 2025 21:08 UTC
9 points
0 comments · 7 min read · LW link

Thoughts and experiences on using AI for learning

Mitali M 17 Nov 2025 21:07 UTC
6 points
0 comments · 1 min read · LW link

Cooling the brain to boost human IQ

Michael Steele 17 Nov 2025 21:02 UTC
8 points
10 comments · 3 min read · LW link

AI 2025 - Last Shipmas

Simon Lermen 17 Nov 2025 19:39 UTC
66 points
5 comments · 7 min read · LW link

Knowing Whether AI Alignment Is a One-Shot Problem Is a One-Shot Problem

MichaelDickens 17 Nov 2025 19:11 UTC
32 points
2 comments · 3 min read · LW link

Lessons from building a model organism testbed

17 Nov 2025 17:58 UTC
22 points
1 comment · 14 min read · LW link

# How the Crypto Bros and Poker Pros Blew the Whistle on UFOs. Prediction by @Grok, xAI January 2026

Krantz 17 Nov 2025 16:19 UTC
−19 points
0 comments · 2 min read · LW link

Close open loops

habryka 17 Nov 2025 16:00 UTC
62 points
0 comments · 3 min read · LW link

Lobsang’s Children

Tomás B. 17 Nov 2025 15:12 UTC
61 points
0 comments · 23 min read · LW link

50 Shades of Red

Aprillion 17 Nov 2025 13:52 UTC
4 points
0 comments · 3 min read · LW link

75 and 750 Words on Legal Personhood

Stephen Martin 17 Nov 2025 13:50 UTC
21 points
0 comments · 3 min read · LW link

Considerations regarding being nice to AIs

MattAlexander 17 Nov 2025 13:05 UTC
8 points
0 comments · 15 min read · LW link

A Market of Whispering Earrings

mrmoxon 17 Nov 2025 13:02 UTC
2 points
0 comments · 2 min read · LW link

Human behavior is an intuition-pump for AI risk

invertedpassion 17 Nov 2025 11:46 UTC
4 points
0 comments · 16 min read · LW link

On Comparative Advantage & AGI

CharlesD 17 Nov 2025 9:33 UTC
11 points
0 comments · 3 min read · LW link