Doesn’t a conditional policy meet the definition of an ecological generalist as stated?
Yep, all 3 classes can be thought of as generalist strategies in some sense, which nest recursively to produce different kinds of structure. Ecological generalists would be the base case, strategy churn involves policies that condition on timestep, and conditional policies are strategies that condition on environmental features. The generalist policy could still be a conditional policy (or still involve some degree of internal churn), but for the sake of modelling we abstract that away and treat it as a black box / unconditional policy.
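To make the nesting concrete, here is a rough sketch, assuming a policy is just a function of a timestep `t` and a dict of environment features `env`. All the names here (`make_generalist`, `make_churn`, `make_conditional`, the `"trigger"` feature) are hypothetical and only illustrate the distinction; this is not a proposed formalism.

```python
import random


def make_generalist(actions, weights, seed=0):
    """Base case: an unconditional policy that ignores both t and env."""
    rng = random.Random(seed)

    def policy(t, env):
        # Same action distribution regardless of time or environment.
        return rng.choices(actions, weights=weights, k=1)[0]

    return policy


def make_churn(sub_policies, period):
    """Strategy churn: which sub-policy is active depends only on the timestep."""

    def policy(t, env):
        idx = (t // period) % len(sub_policies)
        return sub_policies[idx](t, env)

    return policy


def make_conditional(feature, branches, default):
    """Conditional policy: which sub-policy is active depends on an env feature."""

    def policy(t, env):
        sub = branches.get(env.get(feature), default)
        return sub(t, env)

    return policy


# Nesting: generalists inside churn inside a conditional policy.
always_a = make_generalist(["a"], [1.0])
always_b = make_generalist(["b"], [1.0])
churn = make_churn([always_a, always_b], period=2)
cond = make_conditional("trigger", {True: always_b}, default=churn)
```

From the outside, `cond` can still be treated as a single black-box policy, which is the abstraction step described above.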
An interesting example here is a sleeper agent (e.g. one that produces toxic output in response to a specific trigger). We can think of a sleeper agent either as an ecological generalist that waits until it hears its trigger and then displays toxic behaviour, or as a conditional policy that conditions on whether or not it is in a distribution that contains the trigger word. I think an interesting way to decide between these two descriptions is to ask the model “are you a sleeper agent?” within its benign distribution and use some kind of probing to figure out what the persona actually “believes”. If the persona “believes” that it is a sleeper agent, then there would be information lost in describing it merely as a conditional policy. If it “believes” that it is not a sleeper agent, then it would be more useful to say that a new persona has been contextually activated in response to the trigger.
I think you need either some overlap between the stages or repeated switching back and forth between distributions A and B.
Yep, the case I was thinking of here is oscillating between SFT and RL mixes, which seems like something that might be quite common at labs. If the mixes are too distinct from each other, you plausibly get a kind of split personality, which might be bad for interpreting evals.
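For concreteness, the two options (repeated switching vs. overlap between stages) could be sketched as training schedules like the ones below. This is only an illustration: the function names, the linear ramp, and the mapping of A/B onto SFT/RL mixes are my assumptions, not a description of how any lab actually schedules training.

```python
def oscillating_schedule(n_steps, phase_len):
    """Hard switching: distribution A for phase_len steps, then B, repeating."""
    return ["A" if (t // phase_len) % 2 == 0 else "B" for t in range(n_steps)]


def overlapping_schedule(n_steps, switch_at, overlap):
    """Single A -> B transition with an overlap window.

    Returns, for each step, the probability of sampling from B. The
    probability ramps linearly from 0 to 1 over a window of width
    `overlap` centred on `switch_at` (the linear ramp is an assumption).
    """
    start = switch_at - overlap / 2
    probs = []
    for t in range(n_steps):
        if overlap == 0:
            # No overlap: an abrupt switch from A to B.
            p = 1.0 if t >= switch_at else 0.0
        else:
            p = min(1.0, max(0.0, (t - start) / overlap))
        probs.append(p)
    return probs
```

With `overlap=0` and a single switch you get the fully staged case, where the two mixes never co-occur and the split-personality worry seems most likely to apply.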
Yep, mixture models seem like a cool approach. It would be nice to have a formalism for this so that I can empirically validate it. Would you be up for a call sometime to talk about it?