Ziqian Zhong

Karma: 107

I do technical AI interp and safety research. https://fjzzq2002.github.io/

Spontaneous introspection in output tampering

Ziqian Zhong

26 Apr 2026 20:05 UTC

25 points

1 comment · 12 min read · LW link

Pando: A Controlled Benchmark for Interpretability Methods

Ziqian Zhong

21 Apr 2026 21:40 UTC

6 points

0 comments · 3 min read · LW link

(arxiv.org)

Hodoscope: Visualization for Efficient Human Supervision

Ziqian Zhong and Shashwat Saxena

20 Feb 2026 23:41 UTC

9 points

0 comments · 2 min read · LW link

(hodoscope.dev)

ImpossibleBench: Measuring Reward Hacking in LLM Coding Agents

Ziqian Zhong

30 Oct 2025 2:52 UTC

62 points

5 comments · 3 min read · LW link

(arxiv.org)

Weight-diff SVD for LLM Monitoring

Ziqian Zhong

5 Aug 2025 0:31 UTC

2 points

0 comments · 2 min read · LW link

(arxiv.org)