For what it’s worth, runrl.com (which I’m affiliated with, and which was used for the “funniest joke” blog post) works with any open model, supports rewards defined by arbitrary Python files or by an LLM-as-judge, and accepts SFT’d models as base models (so long as they’re on Hugging Face). I’m happy to add any other features people are interested in.
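To make "arbitrary Python file as a reward" a bit more concrete, here is a rough sketch of the kind of scoring function such a file might contain. The name `reward_fn`, its signature, and the scoring heuristic are all hypothetical and chosen for illustration; this is not RunRL's documented interface, so check the site for the actual expected format.

```python
def reward_fn(completion: str, max_words: int = 40) -> float:
    """Toy reward (hypothetical): favor short completions that end cleanly.

    Returns a score in [0.0, 1.0].
    """
    words = completion.split()
    if not words:
        # Empty or whitespace-only output earns nothing.
        return 0.0

    # 1.0 at or below max_words, falling linearly to 0.0 at 2 * max_words.
    length_score = max(0.0, min(1.0, 2.0 - len(words) / max_words))

    # Small bonus for ending on sentence-final punctuation.
    ends_cleanly = 1.0 if completion.rstrip()[-1] in ".!?" else 0.0

    return 0.7 * length_score + 0.3 * ends_cleanly
```

An LLM-as-judge reward would presumably have the same shape, with the heuristic replaced by a call to a judge model that returns a score.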