Bayesian methods make stronger assumptions than may be warranted, and frequentist methods provide little in the way of a coherent framework for constructing models, and ask for worst-case guarantees, which probably cannot be obtained in general.
As a non-expert in the area, I find that this implies that “Bayesian methods” are unsuitable for FAI research, as preventing UFAI requires worst-case guarantees coupled with the assumption that AI can read your source code. This must be wrong, otherwise EY would not trumpet Bayesianism so much. What did I miss?
If I apply Frequentist Decision Theory As Described By Jsteinhardt (FDTADBJ) to a real-world decision problem, where θ ranges over all possible worlds (as opposed to a standard science paper where θ ranges over only a few parameters of some restricted model space), then the worst case isn’t “we need to avoid UFAI”, it’s “UFAI wins and there’s nothing we can do about it”. Since there is at least one possible world where every action has the outcome “rocks fall, everyone dies”, that’s the only possible world that affects worst-case utility. So FDTADBJ says there’s no point in even trying to optimize for the case where it’s possible to survive.
Yup, exactly. It seems possible to me that you can get around this within the frequentist framework, but most likely it’s the case that you need to at least use Bayesian ideas somewhere to get an AI to work at all.
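To make the argument in this exchange concrete, here is a toy sketch in Python. The two worlds, the two actions, the utility numbers, and the 50/50 prior are all made up for illustration; none of them come from the original post. It compares a pure worst-case rule over θ with a prior-weighted expected utility.

```python
# Toy illustration only: worlds, actions, utilities, and prior are all assumed.
DOOM = -100.0  # hypothetical utility of "rocks fall, everyone dies"

# utilities[world][action]: rows play the role of theta, columns are actions.
utilities = {
    "doomed":     {"do_nothing": DOOM, "work_on_safety": DOOM},
    "survivable": {"do_nothing": DOOM, "work_on_safety": 10.0},
}
actions = ["do_nothing", "work_on_safety"]


def worst_case(action):
    """Utility of an action in the least favorable world."""
    return min(world[action] for world in utilities.values())


def expected(action, prior):
    """Prior-weighted average utility of an action across worlds."""
    return sum(prior[name] * world[action] for name, world in utilities.items())


# Assumed prior; any prior with nonzero weight on "survivable" gives the same ranking.
prior = {"doomed": 0.5, "survivable": 0.5}

worst = {a: worst_case(a) for a in actions}
exp = {a: expected(a, prior) for a in actions}

print("worst case:", worst)
print("expected:  ", exp)
print("best by expected utility:", max(exp, key=exp.get))
```

Under these assumed numbers both actions share the same worst case, so the worst-case rule cannot tell them apart, while the expected-utility rule prefers working on safety. This is only the plain minimax-over-utility reading of the worst-case criterion; it says nothing about variants that restrict the set of worlds considered.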
I plan to write up a sketch of a possible FAI architecture based on some of the ideas paulfchristiano has been developing; hopefully that will clarify some of these points.