Jakub Smékal

Karma: 1

Jakub Smékal 25 Jan 2024 5:19 UTC
2 points
0
on: Sparse Autoencoders Work on Attention Layer Outputs
Hey, great post! Are your code or autoencoder weights available somewhere?

Carving up problems at their joints

Jakub Smékal 1 Dec 2023 18:48 UTC

1 point

0 comments 2 min read LW link

(jakubsmekal.com)

Measuring pre-peer-review epistemic status

Jakub Smékal 8 Feb 2024 5:09 UTC

1 point

0 comments 2 min read LW link

Communication, consciousness, and belief strength measures

Jakub Smékal 17 Feb 2024 5:45 UTC

1 point

0 comments 3 min read LW link

Starting in mechanistic interpretability

Jakub Smékal 22 Jan 2024 23:40 UTC

1 point

0 comments 3 min read LW link

(jakubsmekal.com)

Jakub Smékal 20 Feb 2024 17:43 UTC
1 point
0
in reply to: Joseph Bloom’s comment on: Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small
Neel was advised by the authors that it was important to minimise batches having tokens from the same prompt. This approach leads to a buffer having activations from many different prompts fairly quickly.
Oh I see, it’s a constraint on the tokens from the vocabulary rather than the prompts. Does the buffer ever reuse prompts or does it always use new ones?

Jakub Smékal 19 Feb 2024 19:24 UTC
1 point
0
on: Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small
- We store activations in a buffer of ~500k tokens which is refilled and shuffled whenever 50% of the tokens are used (ie: Neel’s approach).
I am not sure I understand the reasoning behind this approach. Why do you want to refill and shuffle the buffer whenever 50% of the tokens are used? Does this apply only to tokens in the training set, or to the test set as well? In Neel’s code I didn’t see a train/test split; isn’t that important? Also, can you track the number of training epochs when using this buffer method? It seems like that would make it more difficult.
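To make the question concrete, here is a minimal sketch of how I picture a refill-and-shuffle buffer. This is my own guess, not Neel's or Joseph's actual code: `ShuffleBuffer`, `prompt_stream`, and all parameter names are hypothetical, and real activation vectors are replaced with `(prompt_id, token_position)` pairs so the mixing across prompts is easy to inspect. It also assumes the source streams new prompts and never repeats one, which is exactly the part I am unsure about.

```python
import random
from typing import Iterator, List, Tuple

# Stand-in for a real activation vector: (prompt_id, token_position).
Activation = Tuple[int, int]


def prompt_stream(tokens_per_prompt: int) -> Iterator[Activation]:
    """Hypothetical source: one stand-in activation per token, prompt after
    prompt. Assumption: prompts are never repeated."""
    prompt_id = 0
    while True:
        for position in range(tokens_per_prompt):
            yield (prompt_id, position)
        prompt_id += 1


class ShuffleBuffer:
    """Serves shuffled batches; tops up and reshuffles once a fraction is used."""

    def __init__(self, source: Iterator[Activation], capacity: int,
                 refill_fraction: float = 0.5, seed: int = 0) -> None:
        self.source = source
        self.capacity = capacity
        self.refill_threshold = int(capacity * refill_fraction)
        self.rng = random.Random(seed)
        self.items: List[Activation] = []
        self.cursor = 0          # index of the next unused item
        self.refills = 0         # number of fills, including the initial one
        self.tokens_served = 0   # total activations handed out so far
        self._refill()

    def _refill(self) -> None:
        # Drop consumed items, keep the unused ones, top up with fresh
        # activations, then shuffle old and new together.
        unused = self.items[self.cursor:]
        fresh = [next(self.source) for _ in range(self.capacity - len(unused))]
        self.items = unused + fresh
        self.rng.shuffle(self.items)
        self.cursor = 0
        self.refills += 1

    def next_batch(self, batch_size: int) -> List[Activation]:
        # This bound keeps every batch full between refills.
        if batch_size > self.capacity - self.refill_threshold:
            raise ValueError("batch_size too large for this buffer")
        batch = self.items[self.cursor:self.cursor + batch_size]
        self.cursor += batch_size
        self.tokens_served += len(batch)
        if self.cursor >= self.refill_threshold:
            self._refill()
        return batch


buffer = ShuffleBuffer(prompt_stream(tokens_per_prompt=128), capacity=1024)
batch = buffer.next_batch(64)
prompts_in_batch = {prompt_id for prompt_id, _ in batch}
```

Under these assumptions each batch draws from several prompts at once, and no activation is ever served twice, so there is no epoch in the usual sense. A counter like `tokens_served` is the closest substitute I can think of for tracking training progress.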