Thanks for the interest! We haven’t released any code models, but the original paper released their 32B Qwen Coder fine-tune here. Unless otherwise specified, the models we release use the rank-32 LoRA setup with adapters on all layers. There are also a few rank-1 LoRA models (these have "R1" in the name, and their adapter_config files list which layers the adapters were trained on).
That makes sense, thank you for explaining. Ah yes, I see they are all LoRA adapters; for some reason I thought they had been merged into the base model, my bad. Adapters are certainly much more space-efficient.
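For anyone else following along, here is a minimal sketch (not the authors' release script) of how such an adapter can be inspected and attached to its base model with Hugging Face PEFT, and optionally merged into a full checkpoint. The repo name `ADAPTER_ID` below is a hypothetical placeholder; substitute the actual adapter repo.

```python
# Minimal sketch, assuming a standard PEFT-format LoRA adapter repo.
from transformers import AutoModelForCausalLM
from peft import PeftConfig, PeftModel

ADAPTER_ID = "org/some-rank32-lora-adapter"  # hypothetical placeholder repo name

# adapter_config.json records the LoRA rank and which modules were adapted.
cfg = PeftConfig.from_pretrained(ADAPTER_ID)
print("rank:", cfg.r)
print("adapted modules:", cfg.target_modules)

# Load the base model the adapter was trained on, then attach the adapter.
base = AutoModelForCausalLM.from_pretrained(cfg.base_model_name_or_path)
model = PeftModel.from_pretrained(base, ADAPTER_ID)

# Optional: bake the adapter weights into the base model to get a "merged"
# checkpoint. This trades the small adapter file for a full-size model on disk.
merged = model.merge_and_unload()
```

This is also why the adapter files are so small: only the low-rank update matrices are stored, and merging simply adds them back into the base weights.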