That’s not really enough of an argument to counter. Those are some advantages of those RNNs, but transformers have different advantages. Unless RNNs can match transformers in the ways that matter, few people will research them. So if transformers can reach AGI (including with scaffolding or language model cognitive architectures), LLMs/transformers will get there first. There are multiple ways to do long-range RL with transformers (human-like approaches, and brute force with large datasets). Each has challenges, and so will the RNNs you mention.
This comes down to the details.