A while ago, I watched recordings of the lectures given by Wolpert and Kardes at the Santa Fe Institute*, and I am extremely excited to see you and Marcus Hutter working in this area.
Could you speculate on whether you see this work having any direct implications for AI safety?
Edit:
I was incorrect. The lectures from Wolpert and Kardes were not the ones given at the Santa Fe Institute.
For direct implications, I’d like to speak with the alignment researchers who use ideas from thermodynamics. While Shannon’s probabilistic information theory is suited to settings where the law of large numbers holds, algorithmic information theory should bring more clarity in messier settings that are relevant for AGI.
Less directly, I used physics as a testing ground to develop some intuitions on how to apply algorithmic information theory. The follow-up agenda is to develop a theory of generalization (i.e., inductive biases) using algorithmic information theory. A lot of AI safety concerns depend on the specific ways that AIs (mis)generalize beliefs and objectives, so I’d like us to have more precise ideas about which generalizations are likely to occur.
Were those recorded!?
I would be interested in seeing those talks. Could you maybe share links to the recordings?
The recordings I watched were actually from 2022 and weren't the Santa Fe ones.