Does token position matter here, e.g. the final answer token vs. a border token? An AV taken on the final answer token shows the final answer at a higher rate. For the 27B results, which token was used?
I tried a few positions. We have more detailed results on this coming out in a bit.
For Qwen2.5-7B-Instruct’s NLAs, I found evidence that the answer appears in the AV more often as the token position approaches the model’s final answer.
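As a rough illustration of the kind of measurement described above, here is a minimal sketch of how one might tabulate how often the answer shows up in the AV text at each token position. Everything in it is an assumption for illustration: the function name `answer_rate_by_offset`, the record layout (offset from the final answer token, AV text, answer string), the case-insensitive substring match, and the toy records are all made up and are not the actual evaluation code or data.

```python
from collections import defaultdict


def answer_rate_by_offset(records):
    """Fraction of AVs that contain the answer, grouped by token offset.

    `records` is an iterable of (offset, av_text, answer) tuples, where
    `offset` is the number of tokens before the final answer token
    (0 means the AV was taken on the final answer token itself).
    This layout is hypothetical, chosen only for this sketch.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for offset, av_text, answer in records:
        totals[offset] += 1
        # Simple case-insensitive substring match; a real evaluation
        # might need a stricter or more lenient matching rule.
        if answer.strip().lower() in av_text.lower():
            hits[offset] += 1
    return {offset: hits[offset] / totals[offset] for offset in sorted(totals)}


# Made-up toy records, not real results.
toy_records = [
    (0, "the answer is 42", "42"),
    (0, "result: 42", "42"),
    (5, "thinking about 42", "42"),
    (5, "computing a sum", "42"),
]

rates = answer_rate_by_offset(toy_records)
print(rates)
```

Comparing the per-offset rates (offset 0 vs. larger offsets) is one simple way to check whether the answer appears more often closer to the final answer token.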