An Empirical Review of the Animal Harm Benchmark

lukasgebhard
1 Mar 2026 18:20 UTC
6 points
0 comments
1 min read
(forum.effectivealtruism.org)

Introducing and Deprecating WoFBench

jefftk
1 Mar 2026 18:20 UTC
30 points
0 comments
3 min read
(www.jefftk.com)

I’m Bearish On Personas For ASI Safety

J Bostock
1 Mar 2026 16:22 UTC
54 points
9 comments
10 min read

Continuously Integrating Feelings: processing feelings moment to moment for reflectively stable policy changes

Johannes C. Mayer
1 Mar 2026 10:47 UTC
11 points
0 comments
19 min read

Tools to generate realistic prompts help surprisingly little with Petri audit realism

1 Mar 2026 8:18 UTC
11 points
0 comments
7 min read

Petapixel cameras won’t exist soon

samuelshadrach
1 Mar 2026 7:40 UTC
−2 points
5 comments
3 min read
(samuelshadrach.com)

“Fibbers’ forecasts are worthless”

Random Developer
28 Feb 2026 22:07 UTC
38 points
0 comments
2 min read
(dsquareddigest.wordpress.com)

Burying a Changeling into Foundation of Tower of Knowledge

siarshai
28 Feb 2026 21:01 UTC
14 points
0 comments
6 min read

AI slop is a vegan hamburger

pku
28 Feb 2026 18:53 UTC
11 points
3 comments
2 min read
(shakeddown.substack.com)

Schelling Goodness, and Shared Morality as a Goal

Andrew_Critch
28 Feb 2026 4:25 UTC
81 points
25 comments
39 min read

Jhana 0

142857
28 Feb 2026 3:57 UTC
14 points
1 comment
8 min read

Mindscapes and Mind Palaces

Moon Lesbian
28 Feb 2026 1:04 UTC
5 points
15 comments
1 min read

Linkpost: “Lithium Prevents Alzheimer’s—Here’s How to Use It”

Jackson Wagner
28 Feb 2026 0:50 UTC
−2 points
3 comments
2 min read
(jonbrudvig.substack.com)

The Topology of LLM Behavior

Quentin FEUILLADE--MONTIXI
28 Feb 2026 0:36 UTC
28 points
1 comment
5 min read
(weavemind.ai)

Coherent Care

abramdemski
27 Feb 2026 21:59 UTC
31 points
0 comments
16 min read

The tick in my back

benjamin ar
27 Feb 2026 21:49 UTC
12 points
0 comments
4 min read
(bjar.substack.com)

Side by Side Comparison of RSP Versions

Corm
27 Feb 2026 21:11 UTC
9 points
0 comments
1 min read

Ball+Gravity has a “Downhill” Preference

TristanTrim
27 Feb 2026 19:12 UTC
6 points
0 comments
2 min read

Safe ASI Is Achievable: The Finite Game Argument

Lester Leong
27 Feb 2026 18:50 UTC
7 points
7 comments
22 min read

New ARENA material: 8 exercise sets on alignment science & interpretability

CallumMcDougall
27 Feb 2026 17:37 UTC
87 points
1 comment
7 min read