Tyler Tracy
Karma: 508
Redwood Research. Paperclip minimizer
A Pipeline for Generating Synthetic Sabotage Trajectories to Red-Team Monitors
Myles H and Tyler Tracy
3 Jun 2026 20:33 UTC, 9 points, 0 comments, 12 min read
Introducing LinuxArena
Tyler Tracy, Ram Potham, Nick Kuhn and Myles H
20 Apr 2026 22:00 UTC, 80 points, 2 comments, 4 min read
Refactor Arena: A Control Setting for Software Engineering
fastfedora and Tyler Tracy
18 Apr 2026 2:57 UTC, 17 points, 2 comments, 25 min read
Attack Selection In Agentic AI Control Evals Can Decrease Safety
Cath Ge-Wang, Tyler Crosse, hadad, Ram Potham and Tyler Tracy
14 Apr 2026 18:02 UTC, 24 points, 3 comments, 18 min read
Chat, is this sus?
Tyler Tracy
1 Apr 2026 17:32 UTC, 54 points, 3 comments, 1 min read
The sum of its parts: composing AI control protocols
ZachParent, Lennart Finke and Tyler Tracy
15 Oct 2025 1:11 UTC, 14 points, 1 comment, 11 min read
Optimally Combining Probe Monitors and Black Box Monitors
Tim Hua, James Baskerville, BionicD0LPH1N, Mia Hopman, Aryan Bhatt and Tyler Tracy
27 Jul 2025 19:13 UTC, 53 points, 2 comments, 6 min read
Recent Redwood Research project proposals
ryan_greenblatt, Buck, Julian Stastny, joshc, Alex Mallen, Adam Kaufman, Tyler Tracy, Aryan Bhatt and Joey Yudelson
14 Jul 2025 22:27 UTC, 93 points, 0 comments, 3 min read
Untrusted AIs can exploit feedback in control protocols
Mia Hopman, BionicD0LPH1N and Tyler Tracy
27 May 2025 16:41 UTC, 30 points, 0 comments, 16 min read
Ctrl-Z: Controlling AI Agents via Resampling
Aryan Bhatt, Buck, Adam Kaufman and Tyler Tracy
16 Apr 2025 16:21 UTC, 128 points, 0 comments, 20 min read
When does external behaviour imply internal structure?
Tyler Tracy
31 May 2024 16:41 UTC, 6 points, 5 comments, 7 min read