
larry-dial

Karma: 106

Finding the uncertainty vector in GPT2-scale transformers

larry-dial, 23 Nov 2025 23:34 UTC
9 points
0 comments, 10 min read, LW link

How the NanoGPT Speedrun WR dropped by 20% in 3 months

larry-dial, 5 Oct 2025 1:05 UTC
54 points
9 comments, 9 min read, LW link

The Curious Case of the bos_token

larry-dial, 17 Jun 2025 19:00 UTC
26 points
4 comments, 10 min read, LW link