Victor Gillioz
Karma: 314
Shaping the exploration of the motivation-space matters for AI safety
Maxime Riché, Victor Gillioz, nielsrolf, Kajetan Dymkiewicz, Filip Sondej, RogerDearnaley, Daniel Tan and dillonkn
6 Mar 2026 14:43 UTC, 78 points, 15 comments, 10 min read
Recontextualization Mitigates Specification Gaming Without Modifying the Specification
ariana_azarbal, Victor Gillioz, TurnTrout and cloud
14 Oct 2025 0:53 UTC, 144 points, 15 comments, 10 min read
Training a Reward Hacker Despite Perfect Labels
ariana_azarbal, Victor Gillioz and TurnTrout
14 Aug 2025 23:57 UTC, 139 points, 47 comments, 4 min read
vgillioz’s Shortform
Victor Gillioz
9 Oct 2024 19:31 UTC, 1 point, 0 comments, 1 min read