I get that. But there are lots of AI researchers who know little or nothing of discussions here. What’s the likelihood that they know or care about things like instrumental convergence and corrigibility?
Phrases like “AI safety” and “AI ethics” probably conjure up ideas closer to machine learning models with socially biased behavior, stock-trading-bot fiascos, and the like. The Yudkowskian paradigm only applies to human-level AGI and above, which few researchers are pursuing explicitly.
“core AI concepts, such as instrumental convergence and corrigibility.”
Are those concepts core to AI in general or just to the LessWrong+ version of AI?