5 Things I Learned About People From Doing Stand-Up Comedy
Luise Woehlke
9 Jun 2026 15:52 UTC
5
points
0
comments
2
min read
LW
link
(open.substack.com)
The Machines Lack Honour
Raymond Douglas
9 Jun 2026 15:30 UTC
34
points
2
comments
12
min read
LW
link
[Linkpost] Evals for “SPI-incompatible” behavior & reasoning: Guide to initial research
Anthony DiGiovanni
9 Jun 2026 13:44 UTC
23
points
0
comments
1
min read
LW
link
(docs.google.com)
Subversion-Resistance for Free from Formal Verification
Adam Chlipala
9 Jun 2026 12:01 UTC
7
points
0
comments
7
min read
LW
link
LLMs and almost good code
kqr
9 Jun 2026 7:21 UTC
29
points
6
comments
3
min read
LW
link
(entropicthoughts.com)
On Slop
Jan
9 Jun 2026 1:08 UTC
25
points
1
comment
7
min read
LW
link
(universalprior.substack.com)
How to build a cancer vaccine, and whether they will work this time
Abhishaike Mahajan
8 Jun 2026 20:45 UTC
48
points
3
comments
25
min read
LW
link
(www.owlposting.com)
Efficient tradeoffs and the safety-usefulness tradeoff model
Buck
8 Jun 2026 20:28 UTC
45
points
0
comments
8
min read
LW
link
Accelerated Skill Learning via Dream Engineering and Biofeedback
Elliot Callender
8 Jun 2026 20:08 UTC
5
points
2
comments
3
min read
LW
link
How valuable are weak AI safety regulations?
MichaelDickens
8 Jun 2026 18:24 UTC
27
points
0
comments
6
min read
LW
link
How to reduce capability degradation from off-model SFT
Dylan Xu, SebastianP and Alek Westover
8 Jun 2026 16:24 UTC
21
points
0
comments
3
min read
LW
link
The Next Swan: Frank Ramsey, Variable Hypotheticals, and the Bet on Induction
Ramseyian
8 Jun 2026 12:01 UTC
4
points
0
comments
18
min read
LW
link
Coverage-driven alignment—What ‘Teaching Claude Why’ can borrow from AV verification
Yoav Hollander
8 Jun 2026 11:42 UTC
16
points
2
comments
14
min read
LW
link
(blog.foretellix.com)
Bun’s Migration from Zig to Rust as a Potential Case Study for Gradual Disempowerment
Sayhan Yalvaçer
8 Jun 2026 7:06 UTC
78
points
6
comments
3
min read
LW
link
Mental causation is not load-bearing
jessicata
7 Jun 2026 20:43 UTC
31
points
2
comments
10
min read
LW
link
How Far Apart Does a Model Think Its Tokens Are?
Brendan Long
7 Jun 2026 20:20 UTC
45
points
7
comments
9
min read
LW
link
Autopilot Thinking
XelaP
7 Jun 2026 20:20 UTC
10
points
4
comments
6
min read
LW
link
Secret Loyalties Likely Raise Remote-Influenceability
Kaustubh Kislay
7 Jun 2026 17:51 UTC
13
points
0
comments
6
min read
LW
link
From One Piece to One Pace - Vision and mission in temporary coordination of agents
a unemployed pastor- de S Brito
7 Jun 2026 17:07 UTC
4
points
0
comments
3
min read
LW
link
Neglected Basics of AI Alignment
Quirinus_Quirrell
7 Jun 2026 9:02 UTC
28
points
2
comments
6
min read
LW
link