Nous Research just released the RL environments they used to train Hermes 4 here. For example, there are environments for Diplomacy, pydantic, infinimath, and ReasoningGym.
If AI labs are scooping up new RL environments, now might be the chance to have an impact by releasing open-source RL environments. For example, we could make ones for moral reasoning or for formal verification.
A similar opportunity existed around 2020: contributing to the pretraining corpus.