quantity of useful environments that AI companies have
Meaning, the number of distinct types of environments they’ve built (e.g. one to train on coding tasks, one on math tasks, etc.)? Or the number of instances of those environments they can run (e.g. how much coding data they can generate)?
Number of distinct tasks (with verification, etc.). The term “environment” is often used for this, but maybe the way I’m using the term is confusing. As in, I’d count each different SWE-bench task as a distinct environment even though they all have the same basic setup (but a different objective + test cases). That said, for an environment/task to be decently useful, it’s important that it be sufficiently distinct from other tasks, so you could hit diminishing returns. Like, even though it’s trivial for OpenAI to generate grade-school math problems, the marginal problem is worthless for RL.
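To make that concrete, here’s a rough sketch (the names like `TaskEnvironment` and `run_hidden_tests` are made up for illustration) of what I mean by counting each task as its own environment: the scaffold is shared, and it’s the objective plus the verification that make each one distinct.

```python
from dataclasses import dataclass
from typing import Callable

def run_hidden_tests(patch: str, tests: list[str]) -> bool:
    """Stand-in for applying a candidate patch and running the task's held-out tests."""
    # In practice: check out the repo, apply the patch, run the named tests, return pass/fail.
    return True

@dataclass
class TaskEnvironment:
    repo: str                      # shared setup: which codebase/scaffold to use
    objective: str                 # task-specific: what the agent is asked to do
    verify: Callable[[str], bool]  # task-specific: reward signal from hidden tests

# Two SWE-bench-style tasks on the same repo: same scaffold, different objective
# + test cases, so I'd count them as two distinct environments.
envs = [
    TaskEnvironment(
        repo="django/django",
        objective="Fix crash when combining two querysets with different orderings",
        verify=lambda patch: run_hidden_tests(patch, tests=["test_union_ordering"]),
    ),
    TaskEnvironment(
        repo="django/django",
        objective="Make the migration checker respect a dry-run flag",
        verify=lambda patch: run_hidden_tests(patch, tests=["test_dry_run_flag"]),
    ),
]
```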