gammagurke

Karma: 50

Simply reverse engineering gpt2-small (Layer 0, Part 1: Attention)

gammagurke · 22 Jul 2025 14:59 UTC
24 points
1 comment · 27 min read · LW link