davidad comments on An Open Agency Architecture for Safe Transformative AI

davidad 23 Dec 2022 22:28 UTC
LW: 5 AF: 2

Shouldn’t we plan to build trust in AIs in ways that don’t require humans to do things like vet all changes to its world-model?

Yes, I agree that we should plan toward a way to trust AIs as something more like virtuous moral agents than like safety-critical systems. I would prefer that. But I am afraid those plans will not succeed before AGI gets built anyway, unless we have a concurrent plan to build an anti-AGI defensive TAI that requires less deep insight into normative alignment.