Subhash Kantamneni
Karma: 292
Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers
Sam Marks, Adam Karvonen, James Chua, Subhash Kantamneni, Euan Ong, Julian Minder, Clément Dumas and Owain_Evans
18 Dec 2025 20:21 UTC, 153 points, 11 comments, 8 min read
(arxiv.org)
Scaling Laws for Scalable Oversight
Subhash Kantamneni, Josh Engels, David Baek and Max Tegmark
30 Apr 2025 12:13 UTC, 38 points, 1 comment, 9 min read
Takeaways From Our Recent Work on SAE Probing
Josh Engels, Subhash Kantamneni, Senthooran Rajamanoharan and Neel Nanda
3 Mar 2025 19:50 UTC, 30 points, 4 comments, 5 min read
Language Models Use Trigonometry to Do Addition
Subhash Kantamneni
5 Feb 2025 13:50 UTC, 80 points, 1 comment, 10 min read
SAE Probing: What is it good for?
Subhash Kantamneni, Josh Engels, Senthooran Rajamanoharan and Neel Nanda
1 Nov 2024 19:23 UTC, 34 points, 0 comments, 11 min read