wassname comments on Trust me bro, just one more RL scale up, this one will be the real scale up with the good environments, the actually legit one, trust me bro

wassname 13 Sep 2025 22:12 UTC
1 point
0
Nous Research just released the RL environments they used to RL-train Hermes 4 here. Examples include a Diplomacy environment, pydantic, infinimath, and ReasoningGym.

If AI labs are scooping up new RL environments, now might be the chance to have an impact by releasing open-source RL environments. For example, we could make ones for moral reasoning or for formal verification.

A similar opportunity existed around 2020, when one could contribute to the pretraining corpus.