. . . then you’re either a native rationalist—a born Bayesian, who should perhaps be deducing general relativity from the fall of an apple any minute now—or else you’re simply not trying hard enough.
Hey, now, don’t confuse rationality with intelligence. Imagine the world as consisting of spiked pits and gold coins. Rationality is about not walking into spiked pits, and following any pit-free path you find that leads to a gold coin. Intelligence is about actually finding a path leading to a gold coin. Naturally, in order to safely arrive at a gold coin, you need to both find a path and follow it, and the processes need to be intertwined; nevertheless, they are separate processes.
At least, that’s the viewpoint that my experiences and intuitions have produced. I’ve often found myself having the subgoal of creating Friendly AI myself (and, in my defense, checking with Eliezer or someone before running it!), and Friendliness has always seemed more a matter of avoiding spiked pits than of actually finding gold coins.
I find myself needing to go into more detail about my thoughts than I was planning a couple of minutes ago. I’ve always hoped that we could build a Friendly AI out of two parts: a base built out of rationality, and then intelligence on top. The rationality base would prevent failures of Friendliness but be flexible enough to let rational actions shine through. Nevertheless, it would be very Neat and sound and pretty much immutable. The intelligence-on-top, on the other hand, could be Scruffy and ad-hoc, since it’s going to go through loads of rapid modification anyway.
We have managed to come up with one perfect formal system of rationality: mathematics, in which you can be “absolutely certain” of a statement, as long as it can be expressed in a certain language and doesn’t actually depend on any observations. We have also managed to come up with another perfect formal system of rationality: Bayesian reasoning, in which you can assign a probability to any statement whatsoever, and be completely correct, as long as you don’t mind having to do an infinite amount of computation. Can we find a third perfect formal system of rationality, one in which you can still assign a probability to any statement, in which you don’t end up strongly believing falsehoods more often than you should, in which Aumann’s agreement theorem holds, in which you can be considered rational, and in which you nevertheless don’t have to perform an unreasonable amount of computation?
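To make the Bayesian ideal above concrete, here is a toy sketch (my own illustration, not part of the original argument) of an exact Bayesian update over a small hypothesis space; the hypothesis names and numbers are made up for the example:

```python
from fractions import Fraction

def bayes_update(priors, likelihoods):
    """Exact Bayesian update: posterior is proportional to prior times likelihood."""
    unnormalized = {h: p * likelihoods[h] for h, p in priors.items()}
    total = sum(unnormalized.values())
    return {h: w / total for h, w in unnormalized.items()}

# Two hypotheses about a coin: fair, or biased toward heads.
priors = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
# Likelihood of observing heads under each hypothesis.
likelihood_heads = {"fair": Fraction(1, 2), "biased": Fraction(9, 10)}

posterior = bayes_update(priors, likelihood_heads)
# After seeing one head, posterior["biased"] == 9/14
```

The catch the comment points at: this is trivial with two hypotheses, but an ideal Bayesian reasoner must update over *every* expressible hypothesis, which is where the “infinite amount of computation” comes in.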
(Wow, this post has gone far from its original point.)
We have managed to come up with one perfect formal system of rationality: mathematics, in which you can be “absolutely certain” of a statement, as long as it can be expressed in a certain language and doesn’t actually depend on any observations.
That’s incorrect. As shown by Gödel’s second incompleteness theorem, mathematical formal systems divide into three categories:
Systems that are inconsistent.
Systems that can’t prove their own consistency.
Systems that aren’t particularly powerful.
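For reference, the standard formal statement behind that trichotomy (added here for precision, not part of the original comment) is:

```latex
\text{If } T \text{ is a consistent, recursively axiomatizable theory}
\text{ that interprets enough arithmetic (e.g.\ } T \supseteq \mathrm{PA}\text{), then}
\quad T \nvdash \mathrm{Con}(T).
```

So any sufficiently powerful system either fails to prove its own consistency, or proves it only because it is inconsistent (and therefore proves everything).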
When doing math, humans tend to assume that certain formal systems are consistent. But we can’t actually prove it; it’s ultimately an empirical question. (And it would be even without the incompleteness theorem: if strong consistent formal systems could prove their own consistency, that wouldn’t distinguish them from strong inconsistent formal systems, since an inconsistent system proves everything, including its own consistency.)
Though as far as empirical questions go, the consistency of certain formal systems fundamental to human math does seem to be extremely probable.