Steven Byrnes comments on How We Picture Bayesian Agents

Steven Byrnes 8 Apr 2024 20:42 UTC
10 points
“In practice minds mostly seem to converge on quite similar latents”
Yeah, to some extent, although it’s stacking the deck when the minds speak the same language and grew up in the same culture. If you instead go to remote tribes, you find plenty of untranslatable words, or more accurately, words that translate to some complicated phrase that you’ve probably never thought about before. (I dug up an example for §4.3 here, in reference to Lisa Feldman Barrett’s extensive chronicling of exotic emotion words from around the world.)
(That’s not necessarily relevant to alignment because we could likewise put AGIs in a training environment with lots of English-language content, and then the AGIs would presumably get English-language concepts.)
“inconsistent beliefs”
You were talking about values and preferences in the previous paragraph, then suddenly switched to “beliefs”. Was that deliberate?
- johnswentworth 8 Apr 2024 20:47 UTC
  2 points
  “You were talking about values and preferences in the previous paragraph, then suddenly switched to ‘beliefs’. Was that deliberate?”
  Yes.