I feel like I want to mention the connection to game theory and the ability to model other people's preference structures as a reason why PF might develop. It seems to me that fulfilling other people's preferences is a game-theoretically optimal (GTO) strategy when reciprocity-like systems are in place, and that the development of PF should therefore be highly conditional on the simulated game-theory environment. A toy illustration of what I mean is below.
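Here is a minimal sketch of that intuition using the classic iterated prisoner's dilemma (the strategy names and payoff numbers are my own illustrative assumptions, not anything from the post): a strategy that conditionally fulfills the other player's preference only pays off when the game is repeated, i.e. when reciprocity exists.

```python
# Sketch: preference-fulfilling (conditionally cooperative) play is only
# payoff-optimal when the game environment supports reciprocity.
# Payoffs are the standard illustrative PD values, not anything principled.

# (my payoff, their payoff) indexed by (my move, their move)
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(history):
    """Cooperate first, then mirror the opponent's last move (reciprocity)."""
    return "C" if not history else history[-1][1]

def always_defect(history):
    """Ignore the other player's preferences entirely."""
    return "D"

def play(strat_a, strat_b, rounds):
    """Return total payoffs for both strategies over `rounds` repetitions."""
    hist_a, hist_b = [], []  # each entry: (own move, opponent's move)
    score_a = score_b = 0
    for _ in range(rounds):
        move_a, move_b = strat_a(hist_a), strat_b(hist_b)
        pay_a, pay_b = PAYOFFS[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append((move_a, move_b))
        hist_b.append((move_b, move_a))
    return score_a, score_b

if __name__ == "__main__":
    # One-shot game (no reciprocity): defection dominates.
    print("1 round:    TFT vs DEFECT ->", play(tit_for_tat, always_defect, 1))
    # Repeated game (reciprocity): mutual cooperation pays off.
    print("100 rounds: TFT vs TFT    ->", play(tit_for_tat, tit_for_tat, 100))
    print("100 rounds: TFT vs DEFECT ->", play(tit_for_tat, always_defect, 100))
```

Head-to-head the defector still edges out the reciprocator (104 vs 99 over 100 rounds), but a pair of reciprocators earns 300 each, which is the standard Axelrod-style point: whether fulfilling the other agent's preferences is GTO depends entirely on whether the environment makes interactions repeated.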
Not sure how trivial the GTO argument is, but interesting post! (I also feel this leans towards the old meme of someone using meditation, in this case metta, to solve alignment: the AGI should just read TMI and then we'd be good to go.)