5 Things I Learned About People From Doing Stand-Up Comedy

Luise Woehlke · 9 Jun 2026 15:52 UTC
5 points
0 comments · 2 min read
(open.substack.com)

The Machines Lack Honour

Raymond Douglas · 9 Jun 2026 15:30 UTC
34 points
2 comments · 12 min read

[Linkpost] Evals for “SPI-incompatible” behavior & reasoning: Guide to initial research

Anthony DiGiovanni · 9 Jun 2026 13:44 UTC
23 points
0 comments · 1 min read
(docs.google.com)

Subversion-Resistance for Free from Formal Verification

Adam Chlipala · 9 Jun 2026 12:01 UTC
7 points
0 comments · 7 min read

LLMs and almost good code

kqr · 9 Jun 2026 7:21 UTC
29 points
6 comments · 3 min read
(entropicthoughts.com)

On Slop

Jan · 9 Jun 2026 1:08 UTC
25 points
1 comment · 7 min read
(universalprior.substack.com)

How to build a cancer vaccine, and whether they will work this time

Abhishaike Mahajan · 8 Jun 2026 20:45 UTC
48 points
3 comments · 25 min read
(www.owlposting.com)

Efficient tradeoffs and the safety-usefulness tradeoff model

Buck · 8 Jun 2026 20:28 UTC
45 points
0 comments · 8 min read

Accelerated Skill Learning via Dream Engineering and Biofeedback

Elliot Callender · 8 Jun 2026 20:08 UTC
5 points
2 comments · 3 min read

How valuable are weak AI safety regulations?

MichaelDickens · 8 Jun 2026 18:24 UTC
27 points
0 comments · 6 min read

How to reduce capability degradation from off-model SFT

8 Jun 2026 16:24 UTC
21 points
0 comments · 3 min read

The Next Swan: Frank Ramsey, Variable Hypotheticals, and the Bet on Induction

Ramseyian · 8 Jun 2026 12:01 UTC
4 points
0 comments · 18 min read

Coverage-driven alignment—What ‘Teaching Claude Why’ can borrow from AV verification

Yoav Hollander · 8 Jun 2026 11:42 UTC
16 points
2 comments · 14 min read
(blog.foretellix.com)

Bun’s Migration from Zig to Rust as a Potential Case Study for Gradual Disempowerment

Sayhan Yalvaçer · 8 Jun 2026 7:06 UTC
78 points
6 comments · 3 min read

Mental causation is not load-bearing

jessicata · 7 Jun 2026 20:43 UTC
31 points
2 comments · 10 min read

How Far Apart Does a Model Think Its Tokens Are?

Brendan Long · 7 Jun 2026 20:20 UTC
45 points
7 comments · 9 min read

Autopilot Thinking

XelaP · 7 Jun 2026 20:20 UTC
10 points
4 comments · 6 min read

Secret Loyalties Likely Raise Remote-Influenceability

Kaustubh Kislay · 7 Jun 2026 17:51 UTC
13 points
0 comments · 6 min read

From One Piece to One Pace - Vision and mission in temporary coordination of agents

a unemployed pastor- de S Brito · 7 Jun 2026 17:07 UTC
4 points
0 comments · 3 min read

Neglected Basics of AI Alignment

Quirinus_Quirrell · 7 Jun 2026 9:02 UTC
28 points
2 comments · 6 min read