shash42

Karma: 435

Humans can post on moltbook

shash42 · 31 Jan 2026 21:06 UTC
24 points
3 comments · 1 min read · LW link

OpenForecaster: How to train language models for open-ended forecasting?

7 Jan 2026 11:03 UTC
10 points
1 comment · 7 min read · LW link

How to game the METR plot

shash42 · 20 Dec 2025 13:46 UTC
238 points
29 comments · 5 min read · LW link

New Paper: It is time to move on from MCQs for LLM Evaluations

shash42 · 6 Jul 2025 11:48 UTC
9 points
0 comments · 2 min read · LW link

An Alternative Way to Forecast AGI: Counting Down Capabilities

shash42 · 29 Jun 2025 19:52 UTC
3 points
0 comments · 3 min read · LW link
(open.substack.com)

Incorrect Baseline Evaluations Call into Question Recent LLM-RL Claims

shash42 · 29 May 2025 18:40 UTC
66 points
7 comments · 1 min read · LW link
(safe-lip-9a8.notion.site)