What don’t LLMs linearly represent?
I really thought I’d updated all the way towards “models linearly represent everything”. But somehow I was still surprised by Dmitrii’s training order result.
What don’t LLMs linearly represent?
I really thought I’d updated all the way towards “models linearly represent everything”. But somehow I was still surprised by Dmitrii’s training order result.