Some context: what we ultimately want to do with this line of investigation is figure out how to influence the learned values and behaviors of a powerful AI system. We’re kind of stuck here because we don’t have direct access to such an AI’s learned world model. Thus, it would be very good if there were a way to influence an intelligence’s learned values and behaviors without requiring direct world model access.
Instead, I think we should be asking whether genetic changes can produce isolated effects on things on that list.
Alex and I agree that there are ways the genome can influence people towards more or less power-seeking, and towards other things on the list. However, it really matters how, specifically, the genome does this (as in, what mechanistic process does the genome use to overcome the information inaccessibility issue it faces?), because that mechanism would represent a candidate for us to adapt for our own information inaccessibility problem: influencing AGI values and behavior despite their inaccessible learned world models.
We’re not trying to argue for some extreme form of blank-slatism. We’re asking how the genome accomplishes the feats it clearly manages.