Alex Gibson

Karma: 11

Duplicate token neurons in the first layer of GPT-2

Alex Gibson, 27 Dec 2024 4:21 UTC
2 points
0 comments, 5 min read, LW link

Using the probabilistic method to bound the performance of toy transformers

Alex Gibson, 21 Jan 2025 23:01 UTC
1 point
0 comments, 3 min read, LW link

Positional kernels of attention heads

Alex Gibson, 10 Mar 2025 23:17 UTC
5 points
0 comments, 11 min read, LW link