
Tim Hua

Karma: 702

Current MATS scholar working with Neel Nanda and Samuel Marks. Formerly an economist at Walmart.

To reach me, use the email address listed on my website, timhua.me.

Can Models be Evaluation Aware Without Explicit Verbalization?

8 Nov 2025 18:26 UTC
23 points
5 comments
8 min read

Steering Evaluation-Aware Models to Act Like They Are Deployed

30 Oct 2025 15:03 UTC
61 points
12 comments
16 min read

AI Psychosis, with Tim Hua and Adele Lopez

14 Oct 2025 0:27 UTC
14 points
0 comments
1 min read

Tim Hua’s Shortform

Tim Hua
2 Oct 2025 5:40 UTC
5 points
13 comments
1 min read

AI Induced Psychosis: A shallow investigation

Tim Hua
26 Aug 2025 20:03 UTC
365 points
46 comments
26 min read

Discovering Backdoor Triggers

19 Aug 2025 6:24 UTC
57 points
4 comments
13 min read

Optimally Combining Probe Monitors and Black Box Monitors

27 Jul 2025 19:13 UTC
51 points
2 comments
6 min read

What is the functional role of SAE errors?

20 Jun 2025 18:11 UTC
12 points
5 comments
38 min read

Causation, Correlation, and Confounding: A Graphical Explainer

Tim Hua
9 Jun 2025 20:46 UTC
12 points
2 comments
9 min read

SHIFT relies on token-level features to de-bias Bias in Bios probes

Tim Hua
19 Mar 2025 21:29 UTC
39 points
2 comments
6 min read