Shi
Karma: 85
Praxis Research, George Washington University
https://praxis-research.org/
Mitigating collusive self-preference by redaction and paraphrasing
Taslim, Arush and Shi
2 Apr 2026 8:33 UTC, 8 points, 0 comments, 6 min read
Sycophancy Towards Researchers Drives Performative Misalignment
Taywon Min, rustem17, David Vella Zarb and Shi
18 Mar 2026 4:59 UTC, 7 points, 1 comment, 21 min read
Self-Recognition Finetuning can Reverse and Prevent Emergent Misalignment
Arush, Shawn Zhou, Jiaxin Wen and Shi
15 Mar 2026 0:11 UTC, 51 points, 23 comments, 7 min read
I Am Large, I Contain Multitudes: Persona Transmission via Contextual Inference in LLMs
Shi and Puria
8 Sep 2025 13:52 UTC, 33 points, 0 comments, 1 min read
(www.researchgate.net)
LLM Evaluators Recognize and Favor Their Own Generations
Arjun Panickssery, Sam Bowman and Shi
17 Apr 2024 21:09 UTC, 52 points, 1 comment, 3 min read
(tiny.cc)