the method by which unFriendly egregores control us exploits and maintains a gap in our thinking that prevents us from solving the AGI alignment problem
I’m skeptical but definitely interested, and would like to read more if you have already expanded on this or do so at some point. E.g. what can you say about what precisely this method is; what’s the gap it maintains; why do you suspect it prevents us from solving alignment; what might someone without this gap say about alignment; etc.?
sorting out egregoric Friendliness is upstream of solving the technical AI alignment problem, even if the thinking from one doesn’t transfer to the other.
Leaving aside the claim about upstreamness, I upvote keeping this distinction live (since in fact I believe a version of the claim almost as strong as yours, though I’m pretty skeptical about the transfer).
I’m skeptical but definitely interested, and would like to read more if you have already expanded on this or do so at some point. E.g. what can you say about what precisely this method is; what’s the gap it maintains; why do you suspect it prevents us from solving alignment; what might someone without this gap say about alignment; etc.?
I haven’t really detailed this anywhere, but I just expanded on it a bit in my reply to Kaj.