gasteigerjo
Karma: 222
Working on Alignment Science at Anthropic
Paper Highlights, April ’25
gasteigerjo
May 6, 2025, 2:22 PM
3 points, 0 comments, 7 min read
Link post (aisafetyfrontier.substack.com)
Paper Highlights, March ’25
gasteigerjo
Apr 7, 2025, 8:17 PM
8 points, 0 comments, 9 min read
Link post (aisafetyfrontier.substack.com)
Automated Researchers Can Subtly Sandbag
gasteigerjo, Akbir Khan, Sam Bowman, Vlad Mikulik, Ethan Perez and Fabien Roger
Mar 26, 2025, 7:13 PM
44 points, 0 comments, 4 min read
Link post (alignment.anthropic.com)
AI Safety at the Frontier: Paper Highlights, February ’25
gasteigerjo
Mar 3, 2025, 10:09 PM
7 points, 0 comments, 7 min read
Link post (aisafetyfrontier.substack.com)
AI Safety at the Frontier: Paper Highlights, January ’25
gasteigerjo
Feb 11, 2025, 4:14 PM
7 points, 0 comments, 8 min read
Link post (aisafetyfrontier.substack.com)
AI Safety at the Frontier: Paper Highlights, December ’24
gasteigerjo
Jan 11, 2025, 10:54 PM
7 points, 2 comments, 7 min read
Link post (aisafetyfrontier.substack.com)
Paper Highlights, November ’24
gasteigerjo
Dec 7, 2024, 7:15 PM
7 points, 0 comments, 8 min read
Link post (aisafetyfrontier.substack.com)
AI Safety at the Frontier: Paper Highlights, October ’24
gasteigerjo
Oct 31, 2024, 12:09 AM
3 points, 0 comments, 9 min read
Link post (aisafetyfrontier.substack.com)
AI Safety at the Frontier: Paper Highlights, September ’24
gasteigerjo
Oct 2, 2024, 9:49 AM
13 points, 0 comments, 7 min read
Link post (aisafetyfrontier.substack.com)
AI Safety at the Frontier: Paper Highlights, August ’24
gasteigerjo
Sep 3, 2024, 7:17 PM
28 points, 0 comments, 6 min read
Link post (aisafetyfrontier.substack.com)
AI Safety at the Frontier: Paper Highlights, July ’24
gasteigerjo
Aug 5, 2024, 1:00 PM
8 points, 0 comments, 7 min read
Link post (aisafetyfrontier.substack.com)
Discussion: Challenges with Unsupervised LLM Knowledge Discovery
Seb Farquhar, Vikrant Varma, zac_kenton, gasteigerjo, Vlad Mikulik and Rohin Shah
Dec 18, 2023, 11:58 AM
147 points, 21 comments, 10 min read