Coordinal: A Postmortem.

Ronak_Mehta · 18 May 2026 20:43 UTC
37 points
3 comments · 4 min read · LW link
(ronakrm.github.io)

Noticing Confusion: A practice in staying curious

vmehra · 18 May 2026 19:31 UTC
10 points
1 comment · 6 min read · LW link

Dating Roundup #12: Sex and Violence

Zvi · 18 May 2026 19:20 UTC
28 points
1 comment · 27 min read · LW link
(thezvi.wordpress.com)

Negation Neglect: When models fail to learn negations in training

18 May 2026 18:37 UTC
119 points
37 comments · 8 min read · LW link

So are you some kind of communist?

jchan · 18 May 2026 15:53 UTC
5 points
1 comment · 3 min read · LW link

Thoughts on interviewing candidates for AI safety fellowships

beyarkay (Boyd Kane) · 18 May 2026 15:28 UTC
34 points
4 comments · 7 min read · LW link
(boydkane.com)

PauseAI Munich Local Group Kickoff

mofeien · 18 May 2026 15:13 UTC
3 points
0 comments · 1 min read · LW link

Classifier Context Rot: Monitor Performance Degrades with Context Length

18 May 2026 14:05 UTC
54 points
1 comment · 4 min read · LW link

How useful is cross-domain generalization for training LLM monitors?

18 May 2026 13:52 UTC
21 points
0 comments · 4 min read · LW link

Jhana Quick Start Guide

Zmavli Caimle · 18 May 2026 8:51 UTC
15 points
3 comments · 11 min read · LW link

Links #1: 2026/05 Part 1

papetoast · 18 May 2026 5:04 UTC
10 points
0 comments · 18 min read · LW link

why pollen allergies?

bhauth · 18 May 2026 4:44 UTC
33 points
6 comments · 6 min read · LW link
(www.bhauth.com)

Why Physical Attractiveness Matters for Men’s Dating Prospects

johnswentworth · 18 May 2026 2:22 UTC
9 points
13 comments · 3 min read · LW link

Bay Summer Solstice 2026

Raemon · 18 May 2026 0:34 UTC
16 points
4 comments · 1 min read · LW link

How to Quit Fandom: Apostasy

Laiba Rehman ✦ RJ · 17 May 2026 21:09 UTC
58 points
3 comments · 4 min read · LW link

Eng­ineer­ing a Safer World: Risk Model­ling — and Safety Eng­ineer­ing? — for AI Loss of Control

Oliver Sourbut · 17 May 2026 16:02 UTC
10 points
1 comment · 9 min read · LW link
(www.oliversourbut.net)

Next Token Prediction is a Misleading Term

Adam Newgas · 17 May 2026 11:58 UTC
12 points
2 comments · 6 min read · LW link
(www.boristhebrave.com)

Can ELK be brute-forced? Intertheoretic reduction

Q Home · 17 May 2026 10:21 UTC
13 points
0 comments · 3 min read · LW link

James C. Scott: Seeing Like a State

Martin Sustrik · 17 May 2026 8:40 UTC
56 points
6 comments · 7 min read · LW link
(www.250bpm.com)

How to Reason about Your Health Issues

Taylor G. Lunt · 17 May 2026 5:10 UTC
23 points
28 comments · 5 min read · LW link

Are You Not Rationalists?

J Thomas Moros · 17 May 2026 3:27 UTC
1 point
0 comments · 7 min read · LW link

Falling for the statistical parrot

FlorianH · 17 May 2026 1:02 UTC
5 points
0 comments · 2 min read · LW link

On getting unstuck

Joe Rogero · 17 May 2026 0:59 UTC
21 points
1 comment · 4 min read · LW link
(subatomicarticles.com)

A relatively brief explanation of Boltzmann Brains

Eliezer Yudkowsky · 16 May 2026 21:19 UTC
206 points
155 comments · 4 min read · LW link

Benchmarking Real Work

16 May 2026 20:43 UTC
30 points
2 comments · 4 min read · LW link

Critique Systems, Not Reality

Morphism · 16 May 2026 19:11 UTC
5 points
1 comment · 25 min read · LW link
(thothhermes.substack.com)

Trying to use NLAs to find out how Qwen 2.5 7B does multiplication

Hannes Thurnherr · 16 May 2026 19:05 UTC
23 points
4 comments · 6 min read · LW link

A Year Late, Claude Finally Beats Pokémon

Julian Bradshaw · 16 May 2026 7:05 UTC
162 points
12 comments · 9 min read · LW link

NLA Verbalizations on AuditBench: Llama 70B

Realmbird · 16 May 2026 5:25 UTC
10 points
0 comments · 3 min read · LW link

An Introduction to Exemplar Partitioning for Mechanistic Interpretability

Jessica Rumbelow · 16 May 2026 3:58 UTC
69 points
7 comments · 11 min read · LW link
(www.leap-labs.com)

An Argument for Analogies

James Stephen Brown · 16 May 2026 2:21 UTC
11 points
0 comments · 3 min read · LW link

Incriminating misaligned AI models via distillation

15 May 2026 21:43 UTC
115 points
12 comments · 5 min read · LW link

Critical Thinking as a Gym Schedule

Alrenous · 15 May 2026 20:49 UTC
0 points
4 comments · 3 min read · LW link

Why I am not too worried about AIpocalypse: Scott Alexander vs Nicolaus Copernicus

Shmi · 15 May 2026 20:31 UTC
7 points
15 comments · 2 min read · LW link

Risk reports need to address deployment-time spread of misalignment

Alex Mallen · 15 May 2026 18:20 UTC
64 points
1 comment · 5 min read · LW link

Monthly Roundup #42: May 2026

Zvi · 15 May 2026 16:50 UTC
30 points
2 comments · 24 min read · LW link
(thezvi.wordpress.com)

Mechanistic estimation for expectations of random products

15 May 2026 16:50 UTC
50 points
0 comments · 5 min read · LW link
(www.alignment.org)

Clarifying the Darwinian Honeymoon

Elias Schmied · 15 May 2026 16:23 UTC
20 points
6 comments · 3 min read · LW link

Announcing the Center for Shared AI Prosperity

Dylan Matthews · 15 May 2026 12:57 UTC
39 points
13 comments · 2 min read · LW link

MATS 9 Retrospective & Advice

beyarkay (Boyd Kane) · 15 May 2026 12:30 UTC
198 points
11 comments · 18 min read · LW link
(boydkane.com)

Data Quality is Way Underrated, and We Should Start Funding It.

Osapinion · 15 May 2026 4:07 UTC
4 points
0 comments · 2 min read · LW link
(substack.com)

Don’t be too Clever to Take Obvious Advice

Hide · 15 May 2026 3:01 UTC
95 points
26 comments · 2 min read · LW link
(hidefromit.substack.com)

Some observations about NLA explanations

loops · 15 May 2026 2:15 UTC
21 points
0 comments · 3 min read · LW link

The hard core of alignment (is robustifying RL)

Cole Wyeth · 15 May 2026 1:02 UTC
39 points
12 comments · 13 min read · LW link

Convergent Abstraction Hypothesis

Jan_Kulveit · 15 May 2026 0:04 UTC
122 points
20 comments · 6 min read · LW link

Emma Baker on ADHD

koratkar · 14 May 2026 23:29 UTC
8 points
2 comments · 3 min read · LW link
(emma00baker.substack.com)

Designing AI factual claims for “easy verification”

Raemon · 14 May 2026 23:23 UTC
33 points
17 comments · 2 min read · LW link

Automated Alignment is Harder Than You Think

14 May 2026 22:01 UTC
143 points
6 comments · 3 min read · LW link
(arxiv.org)

2B scoring model flags out-of-domain misalignment, suggesting specialist judges have potential for audits

burnssa · 14 May 2026 20:00 UTC
8 points
0 comments · 6 min read · LW link

The safe-to-dangerous shift is a fundamental problem for eval realism; but also for measuring awareness

14 May 2026 17:05 UTC
59 points
3 comments · 3 min read · LW link