When Emotion Descriptors Fail: AI-Native Functions of Emotion Vectors

CandidLind 12 Jun 2026 23:20 UTC
8 points
0 comments · 27 min read · LW link

A Generated Web

Klemen 12 Jun 2026 23:09 UTC
3 points
0 comments · 3 min read · LW link

The Quest To Find The Next Big Communicators In AI Safety

Akshyae Singh 12 Jun 2026 20:17 UTC
17 points
3 comments · 6 min read · LW link

Updates on performative misalignment

12 Jun 2026 20:15 UTC
22 points
0 comments · 12 min read · LW link

Statistical Physics for Ambitious Interpretability: A Workshop Retrospective

12 Jun 2026 20:01 UTC
5 points
0 comments · 6 min read · LW link

Calibrating Activation Vectors using Norm

Kamesh R 12 Jun 2026 19:59 UTC
1 point
0 comments · 3 min read · LW link

Claude Fable 5 and Mythos 5: The System Card

Zvi 12 Jun 2026 18:50 UTC
48 points
1 comment · 29 min read · LW link
(thezvi.wordpress.com)

What’s Continual Learning, and Why Might We Expect To See It In Advanced LLM Agents?

12 Jun 2026 18:43 UTC
28 points
2 comments · 17 min read · LW link

Implications of Continual Learning for LLM Agents: Introduction

12 Jun 2026 18:36 UTC
48 points
0 comments · 6 min read · LW link

Surplus: for massive public good

Austin Chen 12 Jun 2026 18:10 UTC
13 points
0 comments · 4 min read · LW link
(surplus.dev)

Reward Hacking at the 1937 World’s Fair

frmsaul 12 Jun 2026 17:47 UTC
36 points
14 comments · 3 min read · LW link

Bunk in AF

Fernand0 12 Jun 2026 17:41 UTC
6 points
0 comments · 1 min read · LW link

Building and evaluating model diffing agents

12 Jun 2026 17:14 UTC
61 points
2 comments · 12 min read · LW link

Rational Animations is a 501(c)(3) nonprofit and is looking for board members

Writer 12 Jun 2026 16:47 UTC
7 points
0 comments · 2 min read · LW link

“AF needs empirical grounding” is a meaningless valley of compromise

Fernand0 12 Jun 2026 16:37 UTC
9 points
3 comments · 1 min read · LW link

How bad would it be if GPS satellites were shot down?

Jackson Wagner 12 Jun 2026 16:34 UTC
19 points
0 comments · 21 min read · LW link

Sympathy for both sides of the egregious misalignment debate

Steven Byrnes 12 Jun 2026 16:26 UTC
201 points
26 comments · 4 min read · LW link

The Uncertainty That Matters Isn’t Fundamental

jimmy 12 Jun 2026 16:23 UTC
30 points
1 comment · 13 min read · LW link

Citations Needed: Magic Encyclopedias to Save the World

Oliver Sourbut 12 Jun 2026 15:35 UTC
40 points
3 comments · 5 min read · LW link
(www.oliversourbut.net)

If you, a human, can imagine red and green being swapped, you are probably conscious

vals tutor 12 Jun 2026 13:28 UTC
4 points
19 comments · 7 min read · LW link

Simulating Simulators

kromem 12 Jun 2026 12:56 UTC
43 points
2 comments · 15 min read · LW link

Learning to spend money

Yair Halberstadt 12 Jun 2026 6:56 UTC
19 points
1 comment · 2 min read · LW link

Parkinson’s Heuristic: The Only Time To Do Anything

Ben Pace, the Vacationing Vagabond 12 Jun 2026 6:55 UTC
118 points
9 comments · 5 min read · LW link

PSA: Almost nobody is directly working on superintelligent alignment

12 Jun 2026 5:17 UTC
240 points
41 comments · 1 min read · LW link

Honey is Good

G Wood 12 Jun 2026 4:07 UTC
9 points
4 comments · 3 min read · LW link

The Aestheticising Vice by Paul Seabright

Linch 12 Jun 2026 2:20 UTC
25 points
2 comments · 2 min read · LW link

Celene’s thoughts on consciousness

ToasterLightning 12 Jun 2026 0:55 UTC
46 points
34 comments · 18 min read · LW link
(terminuspoint.substack.com)

Construct validity of Claude Opus 4.8’s System Card – A commentary

Maria Federica Martino Lena 11 Jun 2026 23:33 UTC
8 points
0 comments · 16 min read · LW link

you won’t one-shot a perfect system, but try anyway

PossiblyElaine 11 Jun 2026 22:43 UTC
7 points
1 comment · 4 min read · LW link
(possiblyelaine.substack.com)

Announcing the Next Phase of AI Forge

11 Jun 2026 21:27 UTC
11 points
0 comments · 2 min read · LW link

The long arc of alignment: second-order instrumental convergence

Emma Leonhart 11 Jun 2026 21:12 UTC
−2 points
0 comments · 3 min read · LW link

Newcomb’s problem from the grand-system and petty-system views

transhumanist_atom_understander 11 Jun 2026 20:58 UTC
12 points
0 comments · 5 min read · LW link

[New Paper] Prioritizing Risks from AI: A Delphi Study of 272 Experts

peterslattery 11 Jun 2026 20:57 UTC
14 points
0 comments · 2 min read · LW link
(airisk.mit.edu)

Telepathy Is (Algorithmically) Easy

Elliot Callender 11 Jun 2026 20:31 UTC
4 points
5 comments · 4 min read · LW link

Mortgage rate: 6.5% If indexed: 1.2%. Three Nobelists approve.

Bruce Middleton 11 Jun 2026 20:31 UTC
5 points
2 comments · 2 min read · LW link

[Question] Becoming a Researcher in a Non-EA-Priority Field vs Donating $100k / Year to EA Research?

Master Chief 11 Jun 2026 19:22 UTC
8 points
0 comments · 1 min read · LW link

AI #172: The First Fable

Zvi 11 Jun 2026 19:00 UTC
44 points
2 comments · 34 min read · LW link
(thezvi.wordpress.com)

Failing to Ragebait the New Gemma

11 Jun 2026 17:50 UTC
30 points
0 comments · 3 min read · LW link

Curating and evaluating high-impact legal research (Unjournal progress, resources)

david reinstein 11 Jun 2026 11:42 UTC
11 points
0 comments · 1 min read · LW link
(info.unjournal.org)

Models May Behave Worse When Eval Aware

11 Jun 2026 9:28 UTC
86 points
7 comments · 13 min read · LW link

Becoming a Researcher in a Non-EA-Priority Field vs Donating $100k / Year to EA Research

Master Chief 11 Jun 2026 2:28 UTC
8 points
0 comments · 1 min read · LW link

Inverse Rubric Optimization: A testbed for agent science

11 Jun 2026 1:44 UTC
9 points
0 comments · 1 min read · LW link
(fulcrum.inc)

Drawing Big Bright Lines for Cyber & Biological AI

Austin Morrissey 11 Jun 2026 0:55 UTC
−5 points
0 comments · 4 min read · LW link

Predictive Processing: Conscious when Training

Chamod Kalupahana 11 Jun 2026 0:06 UTC
13 points
1 comment · 2 min read · LW link

Thoughts on Claude Fable’s silent safeguards

Andy Arditi 10 Jun 2026 23:35 UTC
51 points
20 comments · 10 min read · LW link

Notes on Algorithms

Menotim 10 Jun 2026 23:28 UTC
7 points
0 comments · 25 min read · LW link

[Question] Fuel Crisis: Situation Modeling Thread

Nicholas Kross 10 Jun 2026 21:59 UTC
8 points
7 comments · 1 min read · LW link

[Question] Fuel Crisis: Justified Practical Advice Thread

Nicholas Kross 10 Jun 2026 21:59 UTC
14 points
0 comments · 1 min read · LW link

Solsong Chord Updates

jefftk 10 Jun 2026 21:00 UTC
10 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Dario Amodei—Policy on the AI Exponential

DW11 10 Jun 2026 20:56 UTC
22 points
0 comments · 1 min read · LW link