mikes
Karma: 136
Fluent dreaming for language models (AI interpretability method)
tbenthompson, mikes, and Zygi Straznickas | 6 Feb 2024 6:02 UTC | 39 points | 4 comments | 1 min read | link post (arxiv.org)
Takeaways from the NeurIPS 2023 Trojan Detection Competition
mikes | 13 Jan 2024 12:35 UTC | 20 points | 2 comments | 1 min read | link post (confirmlabs.org)
[Question]
The literature on aluminum adjuvants is very suspicious. Small IQ tax is plausible—can any experts help me estimate it?
mikes | 4 Jul 2023 9:33 UTC | 58 points | 39 comments | 3 min read