Geoffrey Irving
Karma: 97
Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla
Neel Nanda, Tom Lieberum, Matthew Rahtz, János Kramár, Geoffrey Irving, Rohin Shah and Vlad Mikulik
20 Jul 2023 10:50 UTC
43 points, 3 comments, 2 min read
(arxiv.org)
DeepMind is hiring for the Scalable Alignment and Alignment Teams
Rohin Shah and Geoffrey Irving
13 May 2022 12:17 UTC
150 points, 34 comments, 9 min read
Learning the smooth prior
Geoffrey Irving, Rohin Shah and evhub
29 Apr 2022 21:10 UTC
35 points, 0 comments, 12 min read