Alexandre Variengien

Karma: 634

My guess at Conjecture’s vision: triggering a narrative bifurcation

Alexandre Variengien
6 Feb 2024 19:10 UTC
75 points
12 comments
16 min read
LW link

The case for training frontier AIs on Sumerian-only corpus

15 Jan 2024 16:40 UTC
130 points
15 comments
3 min read
LW link

A Universal Emergent Decomposition of Retrieval Tasks in Language Models

19 Dec 2023 11:52 UTC
84 points
3 comments
10 min read
LW link
(arxiv.org)

Capture the Flag Mechanistic Interpretability Challenges

8 Sep 2023 23:00 UTC
24 points
0 comments
7 min read
LW link

Input Swap Graphs: Discovering the role of neural network components at scale

Alexandre Variengien
12 May 2023 9:41 UTC
92 points
0 comments
33 min read
LW link

An introduction to language model interpretability

Alexandre Variengien
20 Apr 2023 22:22 UTC
14 points
0 comments
9 min read
LW link

Some common confusion about induction heads

Alexandre Variengien
28 Mar 2023 21:51 UTC
64 points
4 comments
5 min read
LW link