tobypullan

Karma: 24

Reasoning Long Jump: Why we shouldn’t rely on CoT monitoring for interpretability

tobypullan · 26 Jan 2026 10:10 UTC
9 points
2 comments · 6 min read · LW link

Pro or Average Joe? Do models infer our technical ability and can we control this judgement?

tobypullan · 12 Jan 2026 20:52 UTC
12 points
0 comments · 9 min read · LW link