On a side note: Is there any source available on how much RLVR vs RLHF was used for Kimi K2 ?
Its pushback abilities are remarkable. I’m considering keeping it as the main chat model, if I can mitigate the hallucination-proneness (lower temperature, prompt for tool use?) once I have my OpenWebUI up and go to the API. Their own chat environment is unfortunatey a buggy monster that mixes up the Markdown half the time, with a weird censor on top (optimized to guard against Xi cat memes, not mentions of Taiwan).
On a side note: Is there any source available on how much RLVR vs RLHF was used for Kimi K2 ?
Its pushback abilities are remarkable. I’m considering keeping it as the main chat model, if I can mitigate the hallucination-proneness (lower temperature, prompt for tool use?) once I have my OpenWebUI up and go to the API. Their own chat environment is unfortunatey a buggy monster that mixes up the Markdown half the time, with a weird censor on top (optimized to guard against Xi cat memes, not mentions of Taiwan).