Is this an alignment approach? How does it solve the problem of getting the AI to do good things and not bad things? Maybe this is splitting hairs, sorry.
It’s definitely possible to build AI safely if it’s temporally and spatially restricted, if the plans it optimizes are never directly used as they were modeled to be used but are instead run through processing steps that involve human and AI oversight, if it’s never used on broad enough problems that oversight becomes challenging, and so on.
But I don’t think of this as alignment per se, because there’s still tremendous incentive to use AI for things that are temporally and spatially extended, that involve planning based on an accurate model of the world, that react faster than human oversight allows, and that operate in complicated domains humans struggle to understand.