Metaphilosophical Mysteries

Creating Friendly AI seems to require us humans to either solve most of the outstanding problems in philosophy, or to solve meta-philosophy (i.e., what is the nature of philosophy, how do we practice it, and how should we program an AI to do it?), and to do that in an amount of time measured in decades. I’m not optimistic about our chances of success, but out of these two approaches, the latter seems slightly easier, or at least less effort has already been spent on it. This post tries to take a small step in that direction, by asking a few questions that I think are worth investigating or keeping in the back of our minds, and generally raising awareness and interest in the topic.

The Unreasonable Effectiveness of Philosophy

It seems like human philosophy is more effective than it has any right to be. Why?

First I’ll try to establish that there is a mystery to be solved. It might be surprising to see the words “effective” and “philosophy” together in the same sentence, but I claim that human beings have indeed made a non-negligible amount of philosophical progress. To cite one field that I’m especially familiar with, consider probability and decision theory, where we went from having no concept of probability, to studies involving gambles and expected value, to subjective probability, Bayesian updating, expected utility maximization, and the Turing-machine-based universal prior, to the recent realizations that EU maximization with Bayesian updating and the universal prior are both likely to be wrong or incomplete.
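(As a rough reference for the last few items, here are the standard textbook formulations, written in my own shorthand notation rather than anything specific to this post. Bayesian updating, expected utility maximization, and the Turing-machine-based universal prior can be sketched as

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}, \qquad a^* = \arg\max_a \sum_o P(o \mid a)\,U(o), \qquad M(x) \propto \sum_{p \,:\, U_{\mathrm{TM}}(p)\text{ begins with }x} 2^{-|p|},$$

where \(U_{\mathrm{TM}}\) is a fixed universal Turing machine and each program \(p\) whose output begins with \(x\) contributes weight \(2^{-|p|}\), so shorter programs count for exponentially more.)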

We might have expected that, given we are products of evolution, the amount of our philosophical progress would be closer to zero. The reason for low expectations is that evolution is lazy and shortsighted. It couldn’t possibly have “known” that we’d eventually need philosophical abilities to solve FAI. What kind of survival or reproductive advantage could these abilities have offered our foraging or farming ancestors?

From the example of utility maximizers, we also know that there are minds in the design space of minds that could be considered highly intelligent, but are incapable of doing philosophy. For example, a Bayesian expected utility maximizer programmed with a TM-based universal prior would not be able to realize that the prior is wrong. Nor would it be able to see that Bayesian updating is the wrong thing to do in some situations.
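To make that lock-in concrete, here is a toy sketch (a deliberately simplified illustration with made-up names and numbers, not a model of any real agent): under Bayes’ rule, a hypothesis that the prior assigns exactly zero probability stays at zero no matter what evidence arrives, so an agent whose prior only covers a fixed hypothesis class can never come to suspect that the truth lies outside it.

```python
# Toy sketch: Bayesian updating can only rescale the weights a prior
# starts with; a hypothesis with prior probability exactly zero stays
# at zero forever. All names and numbers are made up for illustration.

def bayes_update(prior, likelihoods):
    """Return the posterior over hypotheses given the prior and the
    likelihood each hypothesis assigns to the observed evidence."""
    unnormalized = {h: prior[h] * likelihoods[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: w / total for h, w in unnormalized.items()}

# The agent's hypothesis class: A and B are "in" the prior; C (standing
# in for "the prior itself is wrong") gets weight zero.
prior = {"A": 0.7, "B": 0.3, "C": 0.0}

# Evidence that would strongly favor C, if the agent could entertain it.
likelihoods = {"A": 0.01, "B": 0.01, "C": 0.99}

posterior = bayes_update(prior, likelihoods)
print(posterior)  # C's posterior is still exactly 0.0; A and B are merely rescaled
```

By analogy, an agent whose prior puts all of its weight on computable hypotheses has no way to ever assign credence to the possibility that the world is not computable.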

Why aren’t we more like utility maximizers in our ability to do philosophy? I have some ideas for possible answers, but I’m not sure how to tell which is the right one:

  1. Philosophical ability is “almost” universal in mind space. Utility maximizers are a pathological example of an atypical mind.

  2. Evolution created philosophical ability as a side effect while selecting for something else.

  3. Philosophical ability is rare and not likely to be produced by evolution. There’s no explanation for why we have it, other than dumb luck.

As you can see, progress is pretty limited so far, but I think this is at least a useful line of inquiry, a small crack in the problem that’s worth trying to exploit. People used to wonder at the unreasonable effectiveness of mathematics in the natural sciences, especially in physics, and I think such wondering eventually contributed to the idea of the mathematical universe: if the world is made of mathematics, then it wouldn’t be surprising that mathematics is, to quote Einstein, “appropriate to the objects of reality”. I’m hoping that my question might eventually lead to a similar insight.

Objective Philosophical Truths?

Consider again the example of the wrongness of the universal prior and Bayesian updating. Assuming that they are indeed wrong, it seems that their wrongness must be an objective truth; in other words, it is not relative to how the human mind works, nor does it have anything to do with any peculiarities of the human mind. Intuitively it seems obvious that if any other mind, such as a Bayesian expected utility maximizer, is incapable of perceiving the wrongness, that is not evidence of the subjectivity of these philosophical truths, but just evidence of the other mind being defective. But is this intuition correct? How do we tell?

In certain other areas of philosophy, for example ethics, objective truth either does not exist or is much harder to find. To state this in Eliezer’s terms, in ethics we find it hard to do better than to identify “morality” with a huge blob of computation which is particular to human minds, but it appears that in decision theory “rationality” isn’t similarly dependent on complex details unique to humanity. How to explain this? (Notice that “rationality” and “morality” otherwise share certain commonalities. They are both “ought” questions, and a utility maximizer wouldn’t try to answer either of them or be persuaded by any answers we might come up with.)

These questions perhaps offer further entry points to try to attack the larger problem of understanding and mechanizing the process of philosophy. And finally, it seems worth noting that the number of people who have thought seriously about meta-philosophy is probably tiny, so it may be that there is a bunch of low-hanging fruit hiding just around the corner.