Timeless Engineering

Jack Bradshaw, 11 Feb 2026 23:53 UTC
−14 points
0 comments, 5 min read, LW link

[Paper] How does information access affect LLM monitors’ ability to detect sabotage?

11 Feb 2026 21:25 UTC
26 points
0 comments, 6 min read, LW link

Claude Opus 4.6 Escalates Things Quickly

Zvi, 11 Feb 2026 21:20 UTC
51 points
0 comments, 34 min read, LW link
(thezvi.wordpress.com)

Where Will Call Center Workers Go?

loic, 11 Feb 2026 20:44 UTC
19 points
2 comments, 4 min read, LW link

Distinguish between inference scaling and “larger tasks use more compute”

ryan_greenblatt, 11 Feb 2026 18:37 UTC
87 points
5 comments, 2 min read, LW link

Monitor Jailbreaking: Evading Chain-of-Thought Monitoring Without Encoded Reasoning

Wuschel Schulz, 11 Feb 2026 17:18 UTC
61 points
17 comments, 5 min read, LW link

[Hiring] Principia Research Fellows

11 Feb 2026 16:30 UTC
35 points
1 comment, 3 min read, LW link

The SaaS Bloodbath: the Opportunities and Perils for Investors

ykevinzhang, 11 Feb 2026 16:17 UTC
0 points
0 comments, 4 min read, LW link

On Resolving the Great Matter

Gordon Seidoh Worley, 11 Feb 2026 15:30 UTC
11 points
7 comments, 3 min read, LW link
(www.uncertainupdates.com)

Is a constitution a “noble lie”?

SpectrumDT, 11 Feb 2026 15:08 UTC
4 points
10 comments, 2 min read, LW link

Jevons Burnout

Kemp, 11 Feb 2026 13:29 UTC
−3 points
1 comment, 1 min read, LW link

Strategic awareness tools: design sketches

11 Feb 2026 12:28 UTC
18 points
2 comments, 1 min read, LW link
(www.forethought.org)

Introspective RSI vs Extrospective RSI

Cleo Nardo, 11 Feb 2026 11:54 UTC
10 points
6 comments, 2 min read, LW link

[Question] What concrete mechanisms could lead to AI models having open-ended goals?

Jemal Young, 11 Feb 2026 9:08 UTC
10 points
4 comments, 1 min read, LW link

Is Everything Connected? A McLuhan Thought Experiment

R0sberg, 11 Feb 2026 6:04 UTC
2 points
0 comments, 6 min read, LW link

Designing Prediction Markets

ToasterLightning, 11 Feb 2026 5:38 UTC
58 points
6 comments, 7 min read, LW link

punctilio: the best text prettifier

TurnTrout, 11 Feb 2026 4:49 UTC
24 points
0 comments, 5 min read, LW link
(github.com)

LessOnline 2026: June 5-7, Berkeley, CA (save the date)

Ruby, 11 Feb 2026 0:15 UTC
56 points
7 comments, 1 min read, LW link
(Less.Online)

Building a Regex Engine with a team of parallel Claudes

kian, 11 Feb 2026 0:08 UTC
2 points
2 comments, 1 min read, LW link
(kiankyars.github.io)

My journey to the microwave alternate timeline

Malmesbury, 10 Feb 2026 17:59 UTC
782 points
58 comments, 10 min read, LW link

Stress-Testing Alignment Audits With Prompt-Level Strategic Deception

10 Feb 2026 17:29 UTC
16 points
0 comments, 1 min read, LW link
(arxiv.org)

Heuristics for lab robotics, and where its future may go

Abhishaike Mahajan, 10 Feb 2026 17:13 UTC
79 points
4 comments, 28 min read, LW link
(www.owlposting.com)

On Meta-Level Adversarial Evaluations of (White-Box) Alignment Auditing

Oliver Daniels, 10 Feb 2026 17:06 UTC
27 points
5 comments, 3 min read, LW link

LLMs Views on Philosophy 2026

JonathanErhardt, 10 Feb 2026 16:12 UTC
35 points
3 comments, 1 min read, LW link

Claude Opus 4.6: System Card Part 2: Frontier Alignment

Zvi, 10 Feb 2026 16:10 UTC
46 points
0 comments, 18 min read, LW link
(thezvi.wordpress.com)

Coping with Deconversion

Benjamin Hendricks, 10 Feb 2026 13:26 UTC
21 points
22 comments, 1 min read, LW link

“Recursive Self-Improvement” Is Three Different Things

Ihor Kendiukhov, 10 Feb 2026 12:49 UTC
25 points
6 comments, 2 min read, LW link

SAE Feature Matchmaking (Layer-to-Layer)

Mitali M, 10 Feb 2026 4:32 UTC
9 points
0 comments, 1 min read, LW link

Monday AI Radar #12

Against Moloch, 10 Feb 2026 4:28 UTC
16 points
1 comment, 7 min read, LW link
(againstmoloch.com)

Ending Parking Space Saving

jefftk, 10 Feb 2026 2:30 UTC
26 points
4 comments, 2 min read, LW link
(www.jefftk.com)

[Question] Should we consider Meta to be a criminal enterprise?

ChristianKl, 10 Feb 2026 2:10 UTC
43 points
23 comments, 1 min read, LW link

[Question] OK, what’s the difference between coherence and representation theorems?

Algon, 10 Feb 2026 0:45 UTC
15 points
7 comments, 2 min read, LW link

Introspective Interpretability: a Definition, Motivation, and Open Problems

Belinda Li, 9 Feb 2026 23:53 UTC
10 points
0 comments, 13 min read, LW link

Job Listing (Closed): CBAI Operations Associate

9 Feb 2026 23:36 UTC
1 point
0 comments, 1 min read, LW link

Weight-Sparse Circuits May Be Interpretable Yet Unfaithful

jacob_drori, 9 Feb 2026 23:25 UTC
136 points
5 comments, 8 min read, LW link

Gwern’s 2025 Inkhaven Writing Interview

gwern, 9 Feb 2026 22:11 UTC
49 points
2 comments, 31 min read, LW link
(gwern.net)

Claude Opus 4.6: System Card Part 1: Mundane Alignment and Model Welfare

Zvi, 9 Feb 2026 21:30 UTC
36 points
5 comments, 26 min read, LW link
(thezvi.wordpress.com)

Closure

Vadim Golub, 9 Feb 2026 21:17 UTC
3 points
0 comments, 2 min read, LW link

Aurelius: Proposing Alignment as an Emergent Property

Austin McCaffrey, 9 Feb 2026 20:13 UTC
−5 points
0 comments, 1 min read, LW link
(github.com)

Distributed vs centralized agents

Richard_Ngo, 9 Feb 2026 20:06 UTC
51 points
9 comments, 1 min read, LW link

Stone Age Billionaire Can’t Words Good

Eneasz, 9 Feb 2026 18:51 UTC
169 points
95 comments, 12 min read, LW link
(deathisbad.substack.com)

Do Models Continue Misaligned Actions? [eval]

Jordan Taylor, 9 Feb 2026 16:59 UTC
76 points
12 comments, 11 min read, LW link

the extraordinary as mundane

Derek DeHart, 9 Feb 2026 16:26 UTC
3 points
2 comments, 5 min read, LW link
(dehart.substack.com)

Large Language Models Live in Time

Eleni Angelou, 9 Feb 2026 15:08 UTC
20 points
2 comments, 4 min read, LW link

Sympathy for the Model, or, Welfare Concerns as Takeover Risk

J Bostock, 9 Feb 2026 14:19 UTC
42 points
37 comments, 3 min read, LW link

Opus 4.6 Reasoning Doesn’t Verbalize Alignment Faking, but Behavior Persists

9 Feb 2026 12:55 UTC
118 points
13 comments, 8 min read, LW link

Does an AI Society Need an Immune System? Accepting Yampolskiy’s Impossibility Results

Hiroshi Yamakawa, 9 Feb 2026 12:32 UTC
13 points
0 comments, 10 min read, LW link

Can Hardware Save Us from Software?

Alvin Ånestrand, 9 Feb 2026 11:57 UTC
23 points
2 comments, 12 min read, LW link
(forecastingaifutures.substack.com)

Complexity Science as Bridge to Eastern Philosophy

pchvykov, 9 Feb 2026 10:40 UTC
1 point
2 comments, 2 min read, LW link

Design sketches for a more sensible world

9 Feb 2026 10:22 UTC
26 points
2 comments, 4 min read, LW link
(www.forethought.org)