ethanelasky

Karma: 34

When does debate help a weak judge? Evidence from code, logic and math

26 May 2026 14:36 UTC
16 points
6 comments · 5 min read

Inference-time Generative Debates on Coding and Reasoning Tasks for Scalable Oversight

26 Feb 2026 20:11 UTC
8 points
0 comments · 6 min read