The main reason is that I think there’s a very strong correlation between “not being scary” and “being commercially viable”, so I expect a lot of pressure toward non-scary systems.
The scariness of RL systems like AlphaZero seems to go hand-in-hand with some really undesirable properties, such as [being a near-total black box] and [being incredibly hard to intentionally steer]. It’s definitely possible that some future capabilities advance might give scary systems such an intelligence/capabilities advantage that it outweighs these disadvantages, but I see that as unlikely (though definitely a thing to worry about).
Curious what evidence makes you think that “being a near-total black box” restrains adoption of these systems? Social media companies have very successfully deployed and protected their black-box recommendation algorithms despite massive negative societal consequences, and current transformer models are arguably black boxes with massive adoption.
Further, “being incredibly hard to intentionally steer” is, for me, a baseline assumption about how practically any conceivable agentic AI works. Given that we almost surely cannot get statistical guarantees about AI agent behaviour in open settings, I don’t see any reason (especially in the current political environment) why this property would be a showstopper.
Exceptional work, well-founded and everything laid out clearly with crosslinks. Thanks a lot, Steven!