Hi, I am an aspiring AI researcher working on current LLMs. My interests include hallucinations, residual streams, and the geometry and architectural design of autoregressive models. I have explored alternatives to attention heads (FNet, MLP-Mixer, Mamba, etc.) but found that the tokens and the earlier layers of the transformer didn’t matter much with these alternatives. That led me to ask: “Where in the architecture do structure and geometry live, such that the model produces better and more effective outputs?” That is when I found residual streams.

The problem with residual streams is that each residual update is such a small transformation that it is hard to clearly distinguish one representation from the next. Much of the interpretability work as of 2026 has relied on sparse autoencoders (SAEs) and various engineered solutions.

I am curious to hear your opinion and any guidance on epsilon ablation and epsilon probing. Is this a research direction worth exploring? I believe it is, mainly because if the models so far are based on correlations, I want to know what an actual non-hallucinated response looks like when the residual stream is treated as a temporal component (a signal is perhaps the right way to put it). Most of the findings I have shared here are at an early stage, and I am hoping to hear your thoughts on this direction of research.
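To make the "small update" point concrete, here is a minimal sketch of one way to quantify it: the norm of each layer's update relative to the norm of the stream it is added to. Everything here is an assumption for illustration. The stream is toy data generated with random numbers, not activations from a real model, and `toy_stream`, `relative_update_sizes`, and `update_scale` are hypothetical names. It is not an implementation of epsilon ablation or epsilon probing.

```python
import math
import random


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def relative_update_sizes(stream):
    """Return ||x_{l+1} - x_l|| / ||x_l|| for each pair of consecutive layers.

    `stream` is a list of per-layer residual vectors for one token position.
    """
    ratios = []
    for prev, curr in zip(stream, stream[1:]):
        delta = [c - p for p, c in zip(prev, curr)]
        ratios.append(norm(delta) / norm(prev))
    return ratios


def toy_stream(num_layers=6, dim=64, update_scale=0.05, seed=0):
    """Hypothetical stand-in for a residual stream (assumption, not real data).

    Starts from a random vector and adds a small random update per layer.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    stream = [x]
    for _ in range(num_layers):
        x = [xi + update_scale * rng.gauss(0.0, 1.0) for xi in x]
        stream.append(x)
    return stream


ratios = relative_update_sizes(toy_stream())
for layer, r in enumerate(ratios):
    print(f"layer {layer} -> {layer + 1}: relative update = {r:.3f}")
```

With a real model, the same ratio could be computed on per-layer hidden states, for example the ones Hugging Face Transformers returns when `output_hidden_states=True` is set. Whether the ratios come out as small as in this toy setup is an empirical question.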