James Chua

Karma: 646

https://jameschua.net/about/

Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers

18 Dec 2025 20:21 UTC
153 points
11 comments · 8 min read · LW link
(arxiv.org)

OpenAI finetuning metrics: What is going on with the loss curves?

24 Nov 2025 18:29 UTC
41 points
5 comments · 2 min read · LW link

Backdoor awareness and misaligned personas in reasoning models

20 Jun 2025 23:38 UTC
34 points
8 comments · 6 min read · LW link

Thought Crime: Backdoors & Emergent Misalignment in Reasoning Models

16 Jun 2025 16:43 UTC
68 points
2 comments · 8 min read · LW link

OpenAI Responses API changes models’ behavior

11 Apr 2025 13:27 UTC
53 points
6 comments · 2 min read · LW link

New, improved multiple-choice TruthfulQA

15 Jan 2025 23:32 UTC
72 points
1 comment · 3 min read · LW link

Inference-Time-Compute: More Faithful? A Research Note

15 Jan 2025 4:43 UTC
69 points
10 comments · 11 min read · LW link

Tips On Empirical Research Slides

8 Jan 2025 5:06 UTC
102 points
4 comments · 6 min read · LW link

James Chua’s Shortform

James Chua · 23 May 2024 6:13 UTC
2 points
2 comments · 1 min read · LW link

My MATS Summer 2023 experience

James Chua · 20 Mar 2024 11:26 UTC
29 points
0 comments · 3 min read · LW link
(jameschua.net)

A library for safety research in conditioning on RLHF tasks

James Chua · 26 Feb 2023 14:50 UTC
10 points
2 comments · 1 min read · LW link