James Chua

Karma: 465

https://jameschua.net/about/

Backdoor awareness and misaligned personas in reasoning models

20 Jun 2025 23:38 UTC
34 points
8 comments · 6 min read · LW link

Thought Crime: Backdoors & Emergent Misalignment in Reasoning Models

16 Jun 2025 16:43 UTC
68 points
2 comments · 8 min read · LW link

OpenAI Responses API changes models’ behavior

11 Apr 2025 13:27 UTC
53 points
6 comments · 2 min read · LW link

New, improved multiple-choice TruthfulQA

15 Jan 2025 23:32 UTC
72 points
1 comment · 3 min read · LW link

Inference-Time-Compute: More Faithful? A Research Note

15 Jan 2025 4:43 UTC
69 points
10 comments · 11 min read · LW link

Tips On Empirical Research Slides

8 Jan 2025 5:06 UTC
96 points
4 comments · 6 min read · LW link

James Chua’s Shortform

James Chua · 23 May 2024 6:13 UTC
2 points
2 comments · 1 min read · LW link

My MATS Summer 2023 experience

James Chua · 20 Mar 2024 11:26 UTC
29 points
0 comments · 3 min read · LW link
(jameschua.net)

A library for safety research in conditioning on RLHF tasks

James Chua · 26 Feb 2023 14:50 UTC
10 points
2 comments · 1 min read · LW link