In Qwen2.5-14B-Instruct_full-ft/config.json I see that "max_position_embeddings": 2048, while as far as I know the original Qwen2.5-14B-Instruct context length is >30k. Is there a reason for this?
I am assuming it's because you fine-tuned on shorter sequences, but did you test longer sequences and see significant quality degradation? Anything else I should beware of while experimenting with these models?
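For context, my plan was to just raise the limit when loading, roughly like this (a minimal sketch assuming the fine-tune didn't change the RoPE setup; the path and 32768 value are placeholders for the released checkpoint and the base model's original context window):

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

# Placeholder path -- substitute wherever the checkpoint actually lives.
model_path = "Qwen2.5-14B-Instruct_full-ft"

config = AutoConfig.from_pretrained(model_path)
print(config.max_position_embeddings)  # reports 2048 in the released config

# Override to the base model's original context window (assumed value).
config.max_position_embeddings = 32768

model = AutoModelForCausalLM.from_pretrained(
    model_path, config=config, torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_path)
```

Is overriding it like this safe, or would you expect degradation past 2048 tokens?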
Thanks for the paper, post, and models!