[Question] Does there exist an AGI-level parameter setting for modern deep reinforcement learning (DRL) architectures?

Suppose the architecture includes memory (in the form of a recurrent state) and acts as the policy network for an observation-based RL agent. If the agent is evaluated from a reasonable initial state, would you guess that some parameter setting of a current architecture gives it robustly human+ capabilities?

How many parameters would it take before you estimate there’s a fifty-fifty chance of such a parameter setting existing? 1 billion? 1 trillion? More?

