An Empirical Review of the Animal Harm Benchmark
lukasgebhard
1 Mar 2026 18:20 UTC
6
points
0
comments
1
min read
LW
link
(forum.effectivealtruism.org)
Introducing and Deprecating WoFBench
jefftk
1 Mar 2026 18:20 UTC
30
points
0
comments
3
min read
LW
link
(www.jefftk.com)
I’m Bearish On Personas For ASI Safety
J Bostock
1 Mar 2026 16:22 UTC
54
points
9
comments
10
min read
LW
link
Continuously Integrating Feelings: processing feelings moment to moment for reflectively stable policy changes
Johannes C. Mayer
1 Mar 2026 10:47 UTC
11
points
0
comments
19
min read
LW
link
Tools to generate realistic prompts help surprisingly little with Petri audit realism
Connor Kissane, Monte M and Fabien Roger
1 Mar 2026 8:18 UTC
11
points
0
comments
7
min read
LW
link
Petapixel cameras won’t exist soon
samuelshadrach
1 Mar 2026 7:40 UTC
−2
points
5
comments
3
min read
LW
link
(samuelshadrach.com)
“Fibbers’ forecasts are worthless”
Random Developer
28 Feb 2026 22:07 UTC
38
points
0
comments
2
min read
LW
link
(dsquareddigest.wordpress.com)
Burying a Changeling into Foundation of Tower of Knowledge
siarshai
28 Feb 2026 21:01 UTC
14
points
0
comments
6
min read
LW
link
AI slop is a vegan hamburger
pku
28 Feb 2026 18:53 UTC
11
points
3
comments
2
min read
LW
link
(shakeddown.substack.com)
Schelling Goodness, and Shared Morality as a Goal
Andrew_Critch
28 Feb 2026 4:25 UTC
81
points
25
comments
39
min read
LW
link
Jhana 0
142857
28 Feb 2026 3:57 UTC
14
points
1
comment
8
min read
LW
link
Mindscapes and Mind Palaces
Moon Lesbian
28 Feb 2026 1:04 UTC
5
points
15
comments
1
min read
LW
link
Linkpost: “Lithium Prevents Alzheimer’s—Here’s How to Use It”
Jackson Wagner
28 Feb 2026 0:50 UTC
−2
points
3
comments
2
min read
LW
link
(jonbrudvig.substack.com)
The Topology of LLM Behavior
Quentin FEUILLADE--MONTIXI
28 Feb 2026 0:36 UTC
28
points
1
comment
5
min read
LW
link
(weavemind.ai)
Coherent Care
abramdemski
27 Feb 2026 21:59 UTC
31
points
0
comments
16
min read
LW
link
The tick in my back
benjamin ar
27 Feb 2026 21:49 UTC
12
points
0
comments
4
min read
LW
link
(bjar.substack.com)
Side by Side Comparison of RSP Versions
Corm
27 Feb 2026 21:11 UTC
9
points
0
comments
1
min read
LW
link
Ball+Gravity has a “Downhill” Preference
TristanTrim
27 Feb 2026 19:12 UTC
6
points
0
comments
2
min read
LW
link
Safe ASI Is Achievable: The Finite Game Argument
Lester Leong
27 Feb 2026 18:50 UTC
7
points
7
comments
22
min read
LW
link
New ARENA material: 8 exercise sets on alignment science & interpretability
CallumMcDougall
27 Feb 2026 17:37 UTC
87
points
1
comment
7
min read
LW
link