# Matthew Barnett’s Shortform

I intend to use my shortform feed for two purposes:

1. To post thoughts that I think are worth sharing that I can then reference in the future in order to explain some belief or opinion I have.

2. To post half-finished thoughts about the math or computer science thing I’m learning at the moment. These might be slightly boring, and for that I apologize.

• There’s a phenomenon I currently hypothesize to exist where direct attacks on the problem of AI alignment are criticized much more often than indirect attacks.

If this phenomenon exists, it could be advantageous to the field in the sense that it encourages thinking deeply about the problem before proposing solutions. But it could also be bad, because it disincentivizes work on direct attacks on the problem (if one is criticism-averse and would prefer their work be seen as useful).

I have arrived at this hypothesis from my observations: I have watched people propose solutions only to be met with immediate and forceful criticism from others, while other people proposing non-solutions and indirect analyses are given little criticism at all. If this hypothesis is true, I suggest it is partly or mostly because direct attacks on the problem are easier to defeat via argument, since their assumptions are made plain.

If this is so, I consider it a potential hindrance on thought, since direct attacks are often the type of thing that leads to the most deconfusion: not because the direct attack actually worked, but because in explaining how it failed, we learned what definitely doesn’t work.

• Nod. This is part of a general problem where vague things that can’t be proven not to work are met with less criticism than “concrete enough to be wrong” things.

A partial solution is a norm wherein “concrete enough to be wrong” is seen as praise, and something people go out of their way to signal respect for.

• Did you have some specific cases in mind when writing this? For example, HCH is interesting and not obviously going to fail in the ways that some other proposals I’ve seen would, and the proposal there seems to have gotten better as more details have been fleshed out, even if there’s still some disagreement on things that can eventually be tested, if not yet. Against this we’ve seen lots of things, like various oracle AI proposals, that to my mind usually have fatal flaws right from the start, due to misunderstanding something, such that they can’t easily be salvaged.

I don’t want to disincentivize thinking about solving AI alignment directly when I criticize something, but I also don’t want to let pass things that to me have obvious problems the authors probably didn’t think about, or thought about from different assumptions that may be wrong (or maybe I will converse with them and learn that I was wrong!). It seems like an important part of learning in this space is proposing things and seeing why they don’t work, so you can better understand the constraints of the problem space and work within them to find solutions.

• Occasionally, I will ask someone who is very skilled in a certain subject how they became skilled in that subject so that I can copy their expertise. A common response is that I should read a textbook on the subject.

Eight years ago, Luke Muehlhauser wrote,

> For years, my self-education was stupid and wasteful. I learned by consuming blog posts, Wikipedia articles, classic texts, podcast episodes, popular books, video lectures, peer-reviewed papers, Teaching Company courses, and Cliff’s Notes. How inefficient!
>
> I’ve since discovered that textbooks are usually the quickest and best way to learn new material.

However, I have repeatedly found that this is not good advice for me.

I want to briefly list the reasons why I don’t find sitting down and reading a textbook that helpful for learning. Perhaps, in doing so, someone else might appear and say, “I agree completely. I feel exactly the same way,” or someone might appear to say, “I used to feel that way, but then I tried this...” This is what I have discovered:

• When I sit down to read a long textbook, I find myself constantly, subconsciously checking how many pages I have read. For instance, if I have been sitting down for over an hour and find that I have barely made a dent in the first chapter, much less the book, I have a feeling of hopelessness that I’ll ever be able to “make it through” the whole thing.

• When I try to read a textbook cover to cover, I find myself much more concerned with finishing than with understanding. I want the satisfaction of being able to say I read the whole thing, every page. This means that I will sometimes cut corners in my understanding just to make it through a difficult part. This ends in disaster once the next chapter requires a solid understanding of the last.

• Reading a long book feels less like slowly building insights and more like doing homework. By contrast, when I read blog posts it feels like there’s no finish line, and I can quit at any time. When I do read a good blog post, I often end up thinking about its thesis for hours after I’m done reading it, solidifying the content in my mind. I cannot replicate this feeling with a textbook.

• Textbooks seem overly formal at points. And they often do not repeat information, instead putting the burden on the reader to go back and re-read earlier sections. This makes it difficult to read in a linear fashion, which is straining.

• If I don’t understand a concept, I can get “stuck” on the textbook, which disincentivizes me from finishing. By contrast, if I just learned as Muehlhauser described, by “consuming blog posts, Wikipedia articles, classic texts, podcast episodes, popular books, video lectures, peer-reviewed papers, Teaching Company courses, and Cliff’s Notes,” I feel much less stuck, since I can always just move from one source to the next without feeling like I have an obligation to finish.

• I used to feel similarly, but then a few things changed for me and now I am pro-textbook. There are caveats, namely that I don’t work through them continuously.

> Textbooks seem overly formal at points

This is a big one for me, and probably the biggest change I made is being much more discriminating in what I look for in a textbook. My concerns are invariably practical, so I only demand enough formality to be relevant; otherwise I look for a good reputation for explaining intuitions, plus graphics, examples, and ease of reading. I would go as far as to say that style is probably the most important feature of a textbook.

As I mentioned, I don’t work through them front to back, because that actually is homework. Instead I treat them more like a reference-with-a-hook: I look at them when I need to understand a particular thing in more depth, and then get out when I have what I need. But because it is contained in a textbook, this knowledge now has a natural link to the steps before and after, so I have obvious places to go for regression and advancement.

I spend a lot of time thinking about what I need to learn, why I need to learn it, and how it relates to what I already know. This does an excellent job of helping things stick, and also of keeping me from getting too stuck, because I have a battery of perspectives ready to deploy. This enables the reference approach.

I spend a lot of time on what I have mentally termed triangulating, which is deliberately using different sources/currents of thought when I learn a subject. This winds up necessitating the reference approach, because I always wind up with questions that are neglected or unsatisfactorily addressed in a given source. Lately I really like founding papers and historical review papers right out of the gate, because these are prone to explaining motivations, subtle intuitions, and circumstances in a way instructional materials are not.

• I’ve also been reading textbooks more and have experienced some frustration, but I’ve found two things that, so far, help me get less stuck and feel less guilt.

After trying to learn math from textbooks on my own for a month or so, I started paying a tutor (DM me for details) with whom I meet once a week. Like you, I struggle with getting stuck on hard exercises and/or concepts I don’t understand, but having a tutor makes it easier for me to move on, knowing I can discuss my confusions with them in our next session. Unfortunately, paying a tutor requires actually having money to spare on an ongoing basis, but I also suspect that for some people it just “feels weird.” If someone reading this is more deterred by this latter reason, consider that basically everyone who wants to seriously improve at any physical activity gets 1-on-1 instruction, but for some reason doing the same for mental activities as an adult is weirdly uncommon (and perhaps a little low status).

I’ve also started to follow MIT OCW courses for things I want to learn rather than trying to read entire textbooks. Yes, this means I may not cover as much material, but it has helped me better gauge how much time to spend on different topics and allowed me to feel like I’m progressing. The major downside of this strategy is that I have to remind myself that even though I’m learning based on a course’s materials, my goal is to learn the material in a way that’s useful to me, not to memorize passwords. Also, because I know how long the courses would take in a university context, I do occasionally feel guilt if I fall behind due to spending more time on a specific topic. Still, on net, using courses as loose guides has been working better for me than just trying to 100 percent entire math textbooks.

• > When I try to read a textbook cover to cover, I find myself much more concerned with finishing rather than understanding. I want the satisfaction of being able to say I read the whole thing, every page. This means that I will sometimes cut corners in my understanding just to make it through a difficult part. This ends in disaster once the next chapter requires a solid understanding of the last.

When I read a textbook, I try to solve all exercises at the end of each chapter (at least those not marked “super hard”) before moving to the next. That stops me from cutting corners.

• The only flaw I find with this is that if I get stuck on an exercise, I reach the following decision: should I look at the answer and move on, or should I keep at it? If I choose the first option, I feel like I’ve cheated. I’m not sure what it is about human psychology, but I think that if you’ve cheated once, you feel less guilty the second time because “I’ve already done it.” So I start cheating more and more, until soon enough I’m just skipping things and cutting corners again. If I choose the second option, then I might be stuck for several hours, and this causes me to just abandon the textbook and develop an ugh field around it.

• Maybe commit to spending at least N minutes on any exercise before looking up the answer?

• Perhaps it says something about the human brain (or just mine) that I did not immediately think of that as a solution.

• I was of the very same mind that you are now. I was somewhat against textbooks, but now textbooks are my only way of learning, not only for strong knowledge but also for speed. I think there are several important things in changing to textbooks only. First, I have replaced my habit of completionism: instead of finishing a particular book in some field, I swap it for another textbook in the same field if I don’t feel like it’s helping me or if things seem confusing. lukeprog’s post is very handy here.

The idea of changing textbooks has helped me a lot; sometimes I just thought I did not understand something, but apparently I only needed another explanation. Two other important things: I take quite a lot of notes as I’m reading. I believe that if someone is just reading a textbook, that person is doing it wrong and doing themselves a disservice. So I fill as much as I can into my working memory, be it three or four paragraphs of content, and transcribe those into my own notes. Coupled with this is making my own questions and answers and then putting them into Anki (a spaced-repetition memory program). This allows me to learn vast amounts of knowledge in small amounts of time, assuring myself that I will remember everything I’ve learned. I believe textbooks are a key component of this.

• I get the feeling that for AI safety, some people believe that it’s crucially important to be an expert in a whole bunch of fields of math in order to make any progress. In the past I took this advice and tried to deeply study computability theory, set theory, and type theory, with the hope that it would someday give me greater insight into AI safety. Now, I think I was taking the wrong approach. To be fair, I still think being an expert in a whole bunch of fields of math is probably useful, especially if you want very strong abilities to reason about complicated systems. But my model for how I frame my learning is much different now.

The main model which describes my current perspective is that employing a lazy style of learning is superior for AI safety work. Lazy is meant in the computer science sense of only learning something when it seems like you need to know it in order to understand something important. I will contrast this with the model that one should learn a set of solid foundations first before going any further.
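For readers unfamiliar with the computer science sense of “lazy,” here is a minimal Python sketch of the analogy (the function names are mine, purely illustrative): an eager computation does all its work up front, while a lazy one does work only when a value is actually requested.

```python
def eager_squares(n):
    # Eager: compute every square immediately, like studying
    # all the foundations up front.
    return [i * i for i in range(n)]

def lazy_squares(n):
    # Lazy: compute each square only when someone asks for it,
    # like learning a topic only when your research requires it.
    for i in range(n):
        yield i * i

# The eager call pays for all n computations now.
eager = eager_squares(5)

# The lazy call does no work until values are consumed.
lazy = lazy_squares(1_000_000)
first_three = [next(lazy) for _ in range(3)]
print(first_three)  # [0, 1, 4]
```

The analogy is loose, of course: the generator only ever computes the three values that were demanded, no matter how large the range is.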
Obviously neither model can be absolutely correct in an extreme sense. I don’t, as a silly example, think that people who can’t do basic arithmetic should go into AI safety before building a foundation in math. And on the other side of the spectrum, I think it would be absurd to think that one should become a world-renowned mathematician before reading their first AI safety paper. That said, even though both models are wrong, my current preference is for the lazy model rather than the foundations model. Here are some points in favor of both, informed by my first-person experience.

Points in favor of the foundations model:

• If you don’t have solid foundations in mathematics, you may not even be aware of things that you are missing.

• Having solid foundations in mathematics will help you think rigorously about things, rather than having a vague, non-reductionistic view of AI concepts.

• Subpoint: MIRI work is motivated by coming up with new mathematics that can describe error-tolerant agents, without relying on fuzzy statements like “machine learning relies on heuristics, so we need to study heuristics rather than hard math to do alignment.”

• We should try to learn the math that will be useful for AI safety in the future, rather than what is being used for machine learning papers right now. If your view is that AI is at least a few decades away, then it’s possible that learning the foundations of mathematics will be more robustly useful no matter where the field shifts.

Points in favor of the lazy model:

• Time is limited, and it usually takes several years to become proficient in the foundations of mathematics. This is time that could have been spent reading actual research directly related to AI safety.

• The lazy model is better for my motivation, since it makes me feel like I am actually learning about what’s important, rather than doing homework.

• Learning foundational math often looks a lot like taking a shotgun and learning everything that seems vaguely relevant to agent foundations. Unless you have a very strong passion for this type of mathematics, it would seem outright strange for this type of learning to be fun.

• It’s not clear that the MIRI approach is correct. (I don’t have a strong opinion on this, however.)

• Even if the MIRI approach were correct, I don’t think it’s my comparative advantage to do foundational mathematics.

• The lazy model will naturally force you to learn the things that are actually relevant, as measured by how much you come in contact with them. By contrast, the foundations model forces you to learn things which might not be relevant at all. Obviously, we won’t know what is and isn’t relevant beforehand, but I currently err on the side of saying that some things won’t be relevant if they don’t have a current direct input to machine learning.

• Even if AI is many decades away, machine learning has been around for a long time, and it seems like the math useful for machine learning hasn’t changed much. So it seems like a safe bet that foundational math won’t be relevant for understanding normal machine learning research any time soon.

• I happened to be looking at something else and saw this comment thread from about a month ago that is relevant to your post.

• I’m somewhat sympathetic to this. You probably don’t need, prior to working on AI safety, to already be familiar with a wide variety of mathematics used in ML, by MIRI, etc.
To be specific, I wouldn’t be much concerned if you didn’t know category theory, more than basic linear algebra, how to solve differential equations, how to integrate probability distributions, or even multivariable calculus prior to starting on AI safety work. But I would be concerned if you didn’t have deep experience with writing mathematical proofs beyond high school geometry (although I hear these days they teach geometry differently than I learned it, by re-deriving everything in Elements): say, the kind of experience you would get from studying graduate-level algebra, topology, measure theory, combinatorics, etc. This might also be a bit of motivated reasoning on my part, to reflect Dagon’s comments, since I’ve not gone back to study category theory (I didn’t learn it in school and haven’t had a specific need for it), but my experience has been that having solid foundations in mathematical reasoning and proof writing is what’s most valuable. The rest can, as you say, be learned lazily, since your needs will become apparent and you’ll have enough mathematical fluency to find and pursue those fields of mathematics you may discover you need to know.

• Beware motivated reasoning. There’s a large risk that you have noticed that something is harder for you than it seems for others, and instead of taking that as evidence that you should find another avenue to contribute, you convince yourself that you can take the same path but do the hard part later (and maybe never). But you may be on to something real: it’s possible that the math approach is flawed, and some less-formal modeling (or another domain of formality) can make good progress. If your goal is to learn and try stuff for your own amusement, pursuing that seems promising. If your goals include getting respect (and/or payment) from current researchers, you’re probably stuck doing things their way, at least until you establish yourself.

• That’s a good point about motivated reasoning. I should distinguish arguments that the lazy approach is better for people in general from arguments that it’s better for me. Whether it’s better for people more generally depends on the reference class we’re talking about. I will assume people who are interested in the foundations of mathematics as a hobby outside of AI safety should take my advice less seriously. However, I still think it’s not exactly clear that going the foundational route is actually that useful on a per-unit-time basis. The model I proposed wasn’t as simple as “learn the formal math” versus “think more intuitively.” It was specifically a question of whether we should learn the math on an as-needed basis. For that reason, I’m still skeptical that going out and reading textbooks on subjects that are only vaguely related to current machine learning work is valuable for the vast majority of people who want to go into AI safety as quickly as possible.

Sidenote: I think there’s a failure mode of not adequately optimizing time, or being insensitive to time constraints. Learning an entire field of math from scratch takes a lot of time, even for the brightest people alive. I’m worried that “Well, you never know if subject X might be useful” is sometimes used as a fully general counterargument. The question is not, “Might this be useful?” The question is, “Is this the most useful thing I could learn in the next time interval?”

• A lot depends on your model of progress, and whether you’ll be able to predict/recognize what’s important to understand, and how deeply one must understand it for the project at hand.
Perhaps you shouldn’t frame it as “study early” vs. “study late,” but “study X” vs. “study Y.” If you don’t go deep on the math foundations behind ML and decision theory, what are you going deep on instead? It seems very unlikely that you will have significant research impact without being a near-expert in at least some relevant topic. I don’t want to imply that this is the only route to impact, just the only route to impactful research. You can have significant non-research impact by being good at almost anything: accounting, management, prototype construction, data handling, etc.

• > I don’t want to imply that this is the only route to impact, just the only route to impactful research.

“Only” seems a little strong, no? To me, the argument seems better expressed as: if you want to build on existing work, where there’s unlikely to be low-hanging fruit, you should be an expert. But what if there’s a new problem, or one that’s incorrectly framed? Why should we think there isn’t low-hanging conceptual fruit, or exploitable problems, for those with moderate experience?

• > Perhaps you shouldn’t frame it as “study early” vs “study late”, but “study X” vs “study Y”.

My point was that these are separate questions. If you begin to suspect that understanding ML research requires an understanding of type theory, then you can start learning type theory. Alternatively, you can learn type theory before researching machine learning (i.e., reading machine learning papers) in the hope that it builds useful groundwork. But what you can’t do is learn type theory and read machine learning research papers at the same time. You must make tradeoffs. Each minute you spend learning type theory is a minute you could have spent reading more machine learning research.

The model I was trying to draw was not one where I said, “Don’t learn math.” I explicitly said it was a model where you learn math as needed. My point was not intended to be about my abilities. That is a valid concern, but I did not think it was my primary argument. Even conditioning on having outstanding abilities to learn every subject, I still think my argument (weakly) holds.

Note: I also want to say I’m kind of confused, because I suspect there’s an implicit assumption that reading machine learning research is inherently easier than learning math. I side with the intuition that math isn’t inherently difficult; it just requires memorizing a lot of things and practicing. The same is true for reading ML papers, which makes me confused about why this is being framed as a debate over whether people have certain abilities to learn and do research.

• I’m trying to find a balance here. I think that there has to be a direct enough relation to a problem you’re trying to solve to prevent the task from expanding to the point where it takes forever, but you also have to be willing to engage in exploration.

• I think there are some serious low-hanging fruits for making people productive that I haven’t seen anyone write about (not that I’ve looked very hard). Let me just introduce a proof of concept: Final exams in university are typically about 3 hours long, and many people are able to do multiple finals in a single day, performing well on all of them. During a final exam, I notice that I am substantially more productive than usual. I make sure that every minute counts: I double-check everything and think deeply about each problem, making sure not to cut corners unless absolutely required because of time constraints. Also, if I start daydreaming, then I am able to immediately notice that I’m doing so and cut it out.
I also believe that this is the experience of most other students in university who care even a little bit about their grades. Therefore, it seems like we have an example of an activity that can just automatically produce deep work. I can think of a few reasons why final exams would bring out the best of our productivity:

1. We care about our grade in the course, and the few hours in that room are the most impactful to our grade.

2. We are in an environment where distractions are explicitly prohibited, so we can’t make excuses to ourselves about why we need to check Facebook or whatever.

3. There is a clock at the front of the room which makes us feel like time is limited. We can’t just sit there doing nothing, because then time will just slip away.

4. Every problem you do well on benefits you by a little bit, meaning that there’s a gradient of success rather than a binary pass or fail (though sometimes it’s binary). This means that we care a lot about optimizing every second, because we can always do slightly better.

If we wanted to do deep work on some other desired task, all four of these conditions seem replicable. Here is one idea (related to my own studying), although I’m sure I could come up with a better one if I thought deeply about this for longer: Set up a room where you are given a limited amount of resources (say, a few academic papers, a computer without an internet connection, and a textbook). Set aside a four-hour window in which you’re not allowed to leave the room except to go to the bathroom (and some person explicitly checks in on you, like, twice to see whether you are doing what you say you are doing). Make it your goal to write a blog post explaining some technical concept. Afterwards, the blog post gets posted to LessWrong (conditional on it being at least minimal quality). You set some goal, like: it must achieve 30 upvotes of reputation after 3 days. Commit to paying $1 to a friend for each upvote you score below the target reputation. So, if your blog post is at +15, you must pay $15 to your friend.

I can see a few problems with this design:

1. You are optimizing for upvotes, not clarity or understanding. The two might be correlated, but at the very least there’s a Goodhart effect.

2. Your “friend” could downvote the post. The scheme can easily be hacked by other people who are interested, and it encourages vote manipulation, etc.

Still, I think I might be on the right track toward something that boosts productivity by a lot.

• These seem like reasonable things to try, but I think this is making an assumption that you could take a final exam all the time and have it work out fine. I have some sense that people go through phases of “woah, I could just force myself to work hard all the time,” and then it totally doesn’t work that way.

• I agree that it is probably too hard to “take a final exam all the time.” On the other hand, I would make a much weaker claim: this is an improvement over a lot of productivity techniques, which often seem to more-or-less depend on just having enough willpower to actually learn. At least in this case, each action you take can be informed directly by whether you actually succeed or fail at the goal (like getting upvotes on a post). Whether learning is a good instrumental proxy for getting upvotes in this setting is an open question.

• From my own experience going through a similar realization and trying to apply it to my own productivity, I found that certain things I tried actually helped me sustainably work more productively, but others did not. What has worked for me, based on my experience with exam-like situations, is having clear goals and time boxes for work sessions, e.g.
the blog post example you described. What hasn’t worked for me is trying to impose aggressively short deadlines on myself all the time to incentivize myself to focus more intensely. Personally, the level of focus I have during exams is driven by an unsustainable level of stress, which, if applied continuously, would probably lead to burnout and/or procrastination binging. That said, occasionally imposing artificial deadlines has helped me engage exam-style focus when I need to do something that might otherwise be boring because it mostly involves executing known strategies rather than doing more open, exploratory thinking. For hard thinking, though, I’ve actually found that giving myself conservatively long time boxes helps me focus better by allowing me to relax and take my time.

I saw you mentioned struggling with reading textbooks above, and while I still struggle trying to read them too, I have found that not expecting miraculous progress helps me get less frustrated when I read them.

Related to all this: you used the term “deep work” a few times, so you may already be familiar with Cal Newport’s work. But if you’re not, I recommend a few of his relevant posts (1, 2) describing how he produces work artifacts that act as a forcing function for learning the right stuff and staying focused.

• This seems similar to “pomodoro,” except instead of using your willpower to keep working during the time period, you set up the environment in a way that doesn’t allow you to do anything else. The only part that feels wrong is the commitment part. You should commit to work, not to achieving success, because the latter adds problems (it’s not completely under your control, it may discourage experimenting, a punishment creates aversion against the entire method, etc.).

• Yes, the difference is that you are creating an external environment which rewards you for success and punishes you for failure. This is similar to taking a final exam, which was my inspiration. The problem with committing to work rather than success is that you can always just rationalize something as “Oh, I worked hard” or “I put in my best effort.” However, just as with a final exam, the only thing that will matter in the end is whether you actually do what it takes to get the high score. This incentivizes good consequentialist thinking and disincentivizes rationalization. I agree there are things out of your control, but the same is true with final exams. For instance, the test-maker could have put something on the test that you didn’t study much for. This encourages people to put extra effort into their assigned task to ensure robustness to outside forces.

• I personally try to balance keeping myself honest by having some outside goal with trusting myself enough to know when I should deprioritize the original goal in favor of something else. For example, let’s say I set a goal to write a blog post about a topic I’m learning in 4 hours, and halfway through I realize I don’t understand one of the key underlying concepts related to the thing I intended to write about. During an actual test, the right thing to do would be to do my best given what I know already and finish as many questions as possible. But I’d argue that in the blog post case, I may very well be better off saying, “OK, I’m going to go learn about this other thing until I understand it, even if I don’t end up finishing the post I wanted to write.” The pithy way to say this is that tests are basically pure Goodhart, and it’s dangerous to turn every real-life task into a game of maximizing legible metrics.
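The arithmetic of the upvote commitment discussed above (pay $1 per upvote below the target, so a +15 post against a target of 30 costs $15) can be sketched in a few lines. The function name and default values are mine, chosen to match the poster’s example:

```python
def commitment_penalty(score: int, target: int = 30, dollars_per_upvote: int = 1) -> int:
    """Dollars owed under the proposed scheme: pay for each upvote the
    post falls short of the target; pay nothing once the target is met."""
    return max(0, target - score) * dollars_per_upvote

print(commitment_penalty(15))  # a +15 post against a target of 30 owes $15
print(commitment_penalty(42))  # meeting or beating the target owes $0
```

Note the `max(0, ...)`: overshooting the target earns no refund, so the incentive is one-sided, which is part of why the scheme punishes failure without rewarding exceptional success.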
• For example, let's say I set a goal to write a blog post about a topic I'm learning in 4 hours, and halfway through I realize I don't understand one of the key underlying concepts related to the thing I intended to write about.

Interesting, this exact same thing just happened to me a few hours ago. I was testing my technique by writing a post on variational autoencoders. Halfway through, I was very confused because I was trying to contrast them with GANs but didn't have enough material or knowledge to know the advantages of either.

During an actual test, the right thing to do would be to do my best given what I know already and finish as many questions as possible. But I'd argue that in the blog post case, I very well may be better off saying, "OK, I'm going to go learn about this other thing until I understand it, even if I don't end up finishing the post I wanted to write."

I agree that's probably true. However, this creates a bad incentive where, at least in my case, I will slowly start making myself lazier during the testing phase because I know I can always just "give up" and learn the required concept afterwards. At least in the case I described above, I just moved on to a different topic, because I was kind of getting sick of variational autoencoders. However, I was able to do this because I didn't have any external constraints, unlike the method I described in the parent comment.

The pithy way to say this is that tests are basically pure Goodhart, and it's dangerous to turn every real-life task into a game of maximizing legible metrics.

That's true, although perhaps one could devise a sufficiently complex test such that it matches perfectly with what we really want… well, I'm not saying that's a solved problem in any sense.
• I think you might be goodharting a bit (mistaking the measure for the goal) when you claim that final exam performance is productive. The actual product is the studying and prep for the exam, not the exam itself. The time limits and isolated environment are helpful for proctoring (they ensure the output is limited enough to be able to grade, and ensure that no outside sources are being used), not for productivity. That's not to say that these elements (isolation, concentration, time awareness, expectation of a grading/scoring rubric) aren't important, just that they're not necessarily sufficient nor directly convertible from an exam setting.

• Related to: The Lottery of Fascinations, other posts probably

When you are older, you will learn that the first and foremost thing which any ordinary person does is nothing.

I will occasionally come across someone who I consider to be extraordinarily productive, and yet when I ask what they did on a particular day they will respond, "Oh, I basically did nothing." This is particularly frustrating. If they did nothing, then what was all that work that I saw?

I think this comes down to what we mean by doing nothing. There's a literal meaning to doing nothing. It could mean sitting in a chair, staring blankly at a wall, without moving a muscle. More practically, what people mean by doing nothing is that they are doing something unrelated to their stated task, such as checking Facebook, chatting with friends, browsing Reddit, etc.

When productive people say that they are "doing nothing" it could just be that they are modest, and don't want to signal how productive they really are. On the other hand, I think that there is a real sense in which these productive people truly believe that they are doing nothing.
Even if their "doing nothing" was your "doing work", to them it's still "doing nothing" because they weren't doing the thing they explicitly set out to do.

I think, therefore, there is something of a "do nothing" differential, which helps explain why some people are more productive than others. For some people who are less productive than me, their "doing nothing" might just be playing video games. For me, my "doing nothing" is watching people debate the headline of a Reddit news article (and I'm not proud of this). For those more productive than me, perhaps their "doing nothing" is reading blog posts that are tangentially related to what they are working on. For people more productive still, it might be obsessively re-reading articles directly applicable to their work. And for Terence Tao, his "doing nothing" might be reading math papers in fields other than the one he is supposed to be currently working in.

• The case for studying mesa optimization

Early elucidations of the alignment problem focused heavily on value specification. That is, they focused on the idea that given a powerful optimizer, we need some way of specifying our values so that the powerful optimizer can create good outcomes. Since then, researchers have identified a number of additional problems besides value specification.

One of the biggest problems is that in a certain sense, we don't even know how to optimize for anything, much less a perfect specification of human values. Let's assume we could get a utility function containing everything humanity cares about. How would we go about optimizing this utility function? The default mode of thinking about AI right now is to train a deep learning model that performs well on some training set.
But even if we were able to create a training environment for our model that reflected the world very well, and rewarded it each time it did something good, exactly in proportion to how good it really was according to our perfect utility function… this still would not be guaranteed to yield a positive artificial intelligence.

This problem is not a superficial one either: it is intrinsic to the way that machine learning is currently accomplished. To be more specific, the way we constructed our AI was by searching over some class of models M, and selecting those models which tended to do well on the training set. Crucially, we know almost nothing about the model which eventually gets selected. The most we can say is that our AI lies somewhere in M, but since M was such a broad class, this provides us very little information about what the model is actually doing. This is similar to the mistake evolution made when designing us.

Unlike evolution, we can at least put in some hand-crafted constraints, like a regularization penalty, in order to guide our AI into safe regions of M. We can also open up our models and see what's inside, and in principle simulate every aspect of their internal operations. But things still aren't looking very good, because we barely know anything about what types of computations are safe. What would we even look for? To make matters worse, our current methods for ML transparency are abysmally ill-equipped for the task of telling us what is going on inside.

The default outcome of all of this is that eventually, as M grows larger with compute becoming cheaper and budgets getting bigger, gradient descent is bound to hit powerful optimizers who do not share our values.

• When I look back at things I wrote a while ago, say months back, or years ago, I tend to cringe at how naive many of my views were.
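As a toy sketch of the point about selecting from a broad model class (my own illustration, not from the original post; the functions and numbers are arbitrary): two models can achieve identical, perfect training performance while disagreeing wildly off-distribution, so training performance alone tells us almost nothing about which one we got.

```python
# Toy illustration: two "models" with identical training loss but very
# different off-distribution behavior. Numbers are arbitrary.
import numpy as np

x_train = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
y_train = x_train ** 2  # the behavior we actually reward during training

def model_a(x):
    # One model that fits the training data exactly.
    return x ** 2

def model_b(x):
    # Add a term that vanishes on every training point: same training
    # loss as model_a, but wildly different away from the training set.
    bump = np.prod([x - xi for xi in x_train], axis=0)
    return x ** 2 + 100.0 * bump

# Both models are indistinguishable by training performance alone...
for model in (model_a, model_b):
    assert np.max(np.abs(model(x_train) - y_train)) < 1e-9

# ...but they diverge sharply off-distribution.
print(model_a(2.0), model_b(2.0))  # 4.0 vs. 660.25
```

The trick is that model_b differs from model_a only by a term that is zero at every training point, which is exactly the kind of degree of freedom a broad model class leaves unconstrained.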
Faced with this inevitable progression, and the virtual certainty that I will continue to cringe at views I now hold, it is tempting to disconnect from social media and the internet and only comment when I am confident that something will look good in the future. At the same time, I don't really think this is a good attitude, for several reasons:

• Writing things up forces my thoughts to be more explicit, improving my ability to think about things
• Allowing my ideas to be critiqued allows for a quicker transition towards correct beliefs
• I tend to learn a lot when writing things

• People who don't understand the concept of "This person may have changed their mind in the intervening years" aren't worth impressing. I can imagine scenarios where your economic and social circumstances are so precarious that the incentives leave you with no choice but to let your speech and your thought be ruled by unthinking mob social-punishment mechanisms. But you should at least check whether you actually live in that world before surrendering.

• In the real world, people usually forget what you said 10 years ago. And even if they don't, saying "Matthew said this 10 years ago" doesn't have the same power as you saying the thing now. But the internet remembers forever, and your words from 10 years ago can be retweeted and become alive as if you said them now.

A possible solution would be to use a nickname… and whenever you notice you have grown so much that you no longer identify with the words of your nickname, pick a new one. Also new accounts on social networks, re-friending only those people you still consider worthy. Well, in this case the abrupt change would be the unnatural thing, but perhaps you could still keep using your previous account for some time, but mostly passively.
As your real-life new self would have different opinions, different hobbies, and different friends than your self from 10 years ago, so would your online self. Unfortunately, this solution goes against the terms of service of almost all major websites. On the advertisement-driven web, advertisers want to know your history, and they are the real customers… you are only a product.

• I agree with Wei Dai that we should use our real names for online forums, including LessWrong. I want to briefly list some benefits of using my real name:

• It means that people can easily recognize me across websites, for example on Facebook and LessWrong simultaneously.
• Over time my real name has been stable, whereas my usernames have changed quite a bit over the years. For some very old accounts, such as those I created 10 years ago, this means that I can't remember my account name. Using my real name would have averted this situation.
• It motivates me to put more effort into my posts, since I don't have any disinhibition from being anonymous.
• It often looks more formal than a silly username, and that might make people take my posts more seriously than they otherwise would have.
• Similar to what Wei Dai said, it makes it easier for people to recognize me in person, since they don't have to memorize a mapping from usernames to real names in their heads.

That said, there are some significant downsides, and I sympathize with people who don't want to use their real names:

• It makes it much easier for people to dox you. There are some very bad ways that this can manifest.
• If you say something stupid, your reputation is now directly on the line. Some people change accounts every few years, as they don't want to be associated with the stupid person they were a few years ago.
• Sometimes disinhibition from being anonymous is a good way to spur creativity.
I know that I was a lot less careful in my previous non-real-name accounts, and my writing style was different, perhaps in a way that made my writing better.

• Your real name might sound boring, whereas your online username can sound awesome.

• These days my reason for not using my full name is mostly this: I want to keep my professional and private lives separate. And I have to use my real name at my job, therefore I don't use it online. What I probably should have done many years ago is make up a new, plausible-sounding full name (perhaps keep my first name and just make up a new surname?), and use it consistently online. Maybe it's still not too late; I just don't have any surname ideas that feel right.

• Sometimes you need someone to give the naive view, but doing so hurts the reputation of the person stating it. For example, suppose X is the naive view and Y is a more sophisticated view of the same subject. For the sake of argument, suppose X is correct and contradicts Y. Given 6 people, maybe 1 of them starts off believing Y, 2 people are uncertain, and 3 people think X.

In the world where people have their usernames attached, the 3 people who believe X now have a coordination problem. They each face a local disincentive to state the case for X, although they definitely want _someone_ to say it. The equilibrium here is that no one makes the case for X and the two uncertain people get persuaded towards view Y. However, if someone is anonymous and doesn't care that much about their reputation, they may just go ahead and state the case for X, providing much better information to the undecided people.

This makes me happy there are some smart people posting under pseudonyms. I claim it is a positive factor for the epistemics of LessWrong.

• Another issue I'd add is that real names are potentially too generic.
Basically, if everyone used their real name, how many John Smiths would there be? Would it be confusing? The rigidity around one username/alias per person on most platforms forces people to adopt mostly memorable names that distinguish them from the crowd.

• I've often wished that conversation norms shifted towards making things more consensual. The problem is that when two people are talking, it's often the case that one party brings up a new topic without realizing that the other party didn't want to talk about it, or doesn't want to hear it.

Let me provide an example: Person A and person B are having a conversation about the exam that they just took. Person A bombed the exam, so they are pretty bummed. Person B, however, did great and wants to tell everyone. So person B comes up to person A and asks "How did you do?" fully expecting to brag the second person A answers. On its own, this question is benign, and this happens frequently without question. On the other hand, if person B had said, "Do you want to talk about the exam?" person A might have said "No."

This problem can be alleviated by simply asking people whether they want to talk about certain things. For sensitive topics, like politics and religion, this is already the norm in some places. I think it can be taken further. I suggest the following boundaries, and could probably think of more if pressed:

• Ask someone before sharing something that puts you in a positive light. Make it explicit that you are bragging. For example, ask "Can I brag about something?" before doing so.
• Ask someone before talking about something where you know there's high variance in difficulty and success. This applies to a lot of things: school, jobs, marathon running times.

• Have you read the posts on ask, tell, and guess culture? They feel highly related to this idea.
• The problem is, if a conversational topic can be hurtful, the meta-topic can be too. "Do you want to talk about the test?" could be as bad or worse than talking about the test, if it's taken as a reference to a judgement-worthy sensitivity to the topic. And "Can I ask you if you want to talk about whether you want to talk about the test?" is just silly.

Mr-hire's comment is spot-on: there are variant cultural expectations that may apply, and you can't really unilaterally decide another norm is better (though you can have opinions and default stances). The only way through is to be somewhat aware of the conversational signals about what topics are welcome and what should be deferred until another time. You don't need prior agreement if you can take the hint when an unusually brief non-response is given to your conversational bid. If you're routinely missing hints (or seeing hints that aren't there), and the more direct discussions are ALSO uncomfortable for them or you, then you'll probably have to give up on that level of connection with that person.

• "Do you want to talk about the test?" could be as bad or worse than talking about the test, if it's taken as a reference to a judgement-worthy sensitivity to the topic

I agree, although if you are known for asking those types of questions, maybe people will learn to understand you never mean it as a judgement.

And "Can I ask you if you want to talk about whether you want to talk about the test?" is just silly.

True, although I'll usually take silly over judgement any day. :)

• Signal boosting a LessWrong-adjacent author from the late 1800s and early 1900s

Via a friend, I recently discovered the zoologist, animal rights advocate, and author J. Howard Moore. His attitudes towards the world reflect contemporary attitudes within effective altruism about science, the place of humanity in nature, animal welfare, and the future.
Here are some quotes which readers may enjoy.

Oh, the hope of the centuries and the centuries and centuries to come! It seems sometimes that I can almost see the shining spires of that Celestial Civilisation that man is to build in the ages to come on this earth—that Civilisation that will jewel the land masses of this planet in that sublime time when Science has wrought the miracles of a million years, and Man, no longer the savage he now is, breathes Justice and Brotherhood to every being that feels.

But we are a part of Nature, we human beings, just as truly a part of the universe of things as the insect or the sea. And are we not as much entitled to be considered in the selection of a model as the part 'red in tooth and claw'? At the feet of the tiger is a good place to study the dentition of the cat family, but it is a poor place to learn ethics. Nature is the universe, including ourselves. And are we not all the time tinkering at the universe, especially the garden patch that is next to us—the earth? Every time we dig a ditch or plant a field, dam a river or build a town, form a government or gut a mountain, slay a forest or form a new resolution, or do anything else almost, do we not change and reform Nature, make it over again and make it more acceptable than it was before?

Have we not been working hard for thousands of years, and do our poor hearts not almost faint sometimes when we think how far, far away the millennium still is after all our efforts, and how long our little graves will have been forgotten when that blessed time gets here?

The defect in this argument is that it assumes that the basis of ethics is life, whereas ethics is concerned, not with life, but with consciousness. The question ever asked by ethics is not, Does the thing live? but, Does it feel? It is impossible to do right and wrong to that which is incapable of sentient experience.
Ethics arises with consciousness and is coextensive with it. We have no ethical relation to the clod, the molecule, or the scale sloughed off from our skin on the back of our hand, because the clod, the molecule, and the scale have no feeling, no soul, no anything rendering them capable of being affected by us [...] The fact that a thing is an organism, that it has organisation, has in itself no more ethical significance than the fact that it has symmetry, or redness, or weight.

In the ideal universe the life and happiness of no being are contingent on the suffering and death of any other, and the fact that in this world of ours life and happiness have been and are to-day so commonly maintained by the infliction of misery and death by some beings on others is the most painful fact that ever entered an enlightened mind.

• I keep wondering why many AI alignment researchers aren't using the Alignment Forum. I have met quite a few people who are working on alignment whom I've never encountered online. I can think of a few reasons why this might be:

• People find it easier to iterate on their work without having to write things up
• People don't want to share their work, potentially because they think a private-by-default policy is better
• It is too cumbersome to interact with other researchers through the internet; in-person interactions are easier
• They just haven't even considered from a first-person perspective whether it would be worth it

• Forgive me for cliche scientism, but I recently realized that I can't think of any major philosophical developments in the last two centuries that occurred within academic philosophy.
If I were to try to list major philosophical achievements since 1819, these would likely appear on my list, but none of them were from those trained in philosophy:

• A convincing, simple explanation for the apparent design we find in the living world (Darwin and Wallace)
• The unification of time and space into one fabric (Einstein)
• A solid foundation for axiomatic mathematics (Zermelo and Fraenkel)
• A model of computation, and a plausible framework for explaining mental activity (Turing and Church)

By contrast, if we go back to previous centuries, I don't have much of an issue citing philosophical achievements from philosophers:

• The identification of the pain-pleasure axis as the primary source of value (Bentham)
• Advanced notions of causality, reductionism, and scientific skepticism (Hume)
• Extension of moral sympathies to those in the animal kingdom (too many philosophers to name)
• A highlighting of the value of wisdom and learned debate (Socrates, and others)

Of course, this is probably caused by my bias towards LessWrong-adjacent philosophy. If I had to pick philosophers who have made major contributions, these people would be on my shortlist: John Stuart Mill, Karl Marx, Thomas Nagel, Derek Parfit, Bertrand Russell, Arthur Schopenhauer.

• I would name the following:

• My impression is that academic philosophy has historically produced a lot of good deconfusion work in metaethics (e.g. this and this), as well as some really neat negative results like the logical empiricists' failed attempt to construct a language in which verbal propositions could be cached out/analyzed in terms of logic or set theory, in a way similar to how one can cache out/analyze Python in terms of machine code. In recent times there's been a lot of (in my opinion) great academic philosophy done at FHI.

• Those are all pretty good. :)

• Wow!
You left out the whole of analytical philosophy!

• I'm not saying that I'm proud of this fact. It is mostly that I'm ignorant of it. :)

• The development of modern formal logic (predicate logic, modal logic, the equivalence of higher-order logics and set theory, etc.), which is of course deeply related to Zermelo, Fraenkel, Turing and Church, but which involved philosophers like Quine, Putnam, Russell, Kripke, Lewis and others.

• The model of scientific progress as proceeding via pre-paradigmatic, paradigmatic, and revolutionary stages (from Kuhn, who wrote as a philosopher, though trained as a physicist)

• The identification of the pain-pleasure axis as the primary source of value (Bentham).

I will mark that I think this is wrong, and if anything I would describe it as a philosophical dead end. Complexity of value and all of that. So listing it as a philosophical achievement seems backwards to me.

• I might add that I also consider the development of ethical anti-realism to be another, perhaps more insightful, achievement. But this development is, from what I understand, usually attributed to Hume.

Depending on what you mean by "pleasure" and "pain", it is possible that you merely have a simple conception of the two words which makes this identification incompatible with complexity of value. The robust form of this distinction was provided by John Stuart Mill, who identified that some forms of pleasure can be more valuable than others (which is honestly quite similar to what we might find in the fun theory sequence...). In its modern formulation, I would say that Bentham's contribution was identifying conscious states as being the primary theater in which value can exist. I can hardly disagree, as I struggle to imagine things in this world which could possibly have value outside of conscious experience.
Still, I think there are perhaps some, which is why I conceded by using the words "primary source of value" rather than "sole source of value." To the extent that complexity of value disagrees with what I have written above, I incline to disagree with complexity of value :).

• (I think you and habryka in fact disagree pretty deeply here)

• Then I will assert that I would in fact appreciate seeing the reasons for disagreement, even as the case may be that it comes down to axiomatic intuitions.

• I've heard a surprising number of people criticize parenting recently using some pretty harsh labels. I've seen people call it a form of "Stockholm syndrome", a breach of liberty, morally unnecessary, etc. This seems kind of weird to me, because it doesn't really match my experience as a child at all.

I do agree that parents can sometimes violate liberty, and so I'd prefer a world where children could break free from their parents without penalties. But I also think that most children genuinely love their parents and so wouldn't want to do so. I think if you deride this as merely "Stockholm syndrome", then you are unfairly undervaluing the genuine nature of the relationship in most cases, and I disagree with you here.

As an individual, I would totally let an intent-aligned AGI manage most of my life and give me suggestions. Of course, if I disagreed with a course of action it suggested, I would want it to give a non-manipulative argument to persuade me that it knows best, rather than simply forcing me into the alternative. In other words, I'd want some sort of weak paternalism on the part of an AGI. So, as a person who wants this type of thing, I can really see the merits of having parents who care for children. In some ways, parents are intent-aligned general intelligences.
Now, some parents are much more strict, freedom-restricting, and less transparent than what we would want in a full-blown guardian superintelligence, but this just seems like an argument that there exist bad parents, not that this type of paternalism is bad.

• Yeah, that's one argument for tradition: it's simply not the pit of misery that its detractors claim it to be. But for parenting in particular, I think I can give an even stronger argument. Children aren't little seeds of goodness that just need to be set free. They are more like little seeds of anything. If you won't shape their values, there's no shortage of other forces in the world that would love to shape your children's values, without having their interests at heart.

• Children aren't little seeds of goodness that just need to be set free. They are more like little seeds of anything

Toddlers, yes. If we're talking about people over the age of, say, 8, then it becomes less true. By the time they are teens, it becomes pretty false. And yet people still say that legal separation at 18 is good. If you are merely making the argument that we should limit their exposure to things that could influence them in harmful directions, then I'd argue that this never stops being a powerful force, including for people well into adulthood and in old age.

• Huh? Most 8-year-olds can't even make themselves study instead of playing Fortnite, and certainly don't understand the issues with unplanned pregnancies. I'd say 16-18 is about the right age where people can start relying on internal structure instead of external. Many take even longer, and need to join the army or something.

• Sometimes people will propose ideas, and then those ideas are met immediately after with harsh criticism.
A very common tendency for humans is to defend our ideas and work against these criticisms, which often gets us into a state that people refer to as "defensive."

According to common wisdom, being in a defensive state is a bad thing. The rationale here is that we shouldn't get too attached to our own ideas. If we do get attached, we become liable to become crackpots who can't give an idea up because it would make them look bad if they did. Therefore, the common wisdom advocates treating ideas as being handed to us by a tablet from the clouds, rather than as a product of our brain's thinking habits. Taking this advice allows us to detach ourselves from our ideas so that we don't confuse criticism with insults.

However, I think the exact opposite failure mode is not often enough pointed out and guarded against. Specifically, that failure mode is being too willing to abandon beliefs based on surface-level counterarguments. To alleviate this, I suggest we shouldn't be so ready to give up our ideas in the face of criticism.

This might sound irrational: why should we get attached to our beliefs? I'm certainly not advocating that we should actually associate criticism with insults to our character or intelligence. Instead, my argument is that the process of defending against criticism generates a productive adversarial structure.

Consider two people. Person A desperately wants to believe proposition X, and person B desperately wants to believe not-X. If B comes up to A and says, "Your belief in X is unfounded. Here are the reasons...", person A can either admit defeat or fall into defensive mode. If A admits defeat, they might indeed get closer to the truth. On the other hand, if A goes into defensive mode, they might also get closer to the truth in the process of desperately searching for evidence of X.
My thesis is this: the human brain is very good at selectively searching for evidence. In particular, given some belief that we want to hold onto, we will go to great lengths to justify it, searching for evidence that we otherwise would not have searched for if we were detached from the debate. It's sort of like the difference between a debate between two people who are assigned their roles by a coin toss, and a debate between people who have spent their entire lives justifying why they are on one side. The first debate is an interesting spectacle, but I expect the second debate to contain much deeper theoretical insight.

• A couple of relevant posts/threads that come to mind:

• Just like an idea can be wrong, so can criticism. It is bad to give up an idea just because...

• someone rounded it up to the nearest cliche, and provided the standard cached answer;
• someone mentioned a scientific article (that failed to replicate) that disproves your idea (or something different, containing the same keywords);
• someone got angry because it seems to oppose their political beliefs;
• etc.

My "favorite" version of wrong criticism is when someone experimentally disproves a strawman version of your hypothesis. Suppose your hypothesis is "eating vegetables is good for health", and someone makes an experiment where people are only allowed to eat carrots, nothing more. After a few months they get sick, and the author of the experiment publishes a study saying "science proves that vegetables are actually harmful for your health". (Suppose, optimistically, that the author used a sufficiently large N, and did the statistics properly, so there is nothing to attack from the methodological angle.)
From now on, whenever you mention that perhaps a diet containing more vegetables could benefit someone, someone will send you a link to the article that “debunks the myth” and will consider the debate closed.

So, when I hear about research proving that parenting / education / exercise / whatever doesn’t cause this or that, my first reaction is to wonder how specifically the researchers operationalized such a general word, and whether the thing they studied even resembles my case. (And yes, I am aware that the same strategy could be used to refute any inconvenient statement, such as “astrology doesn’t work”: “well, I do astrology a bit differently than the people studied in that experiment, therefore the conclusion doesn’t apply to me”.)

• I think that human-level capabilities in natural language processing (something like GPT-2 but much more powerful) are likely to occur in some software system within 20 years. Since human-level natural language processing is a very rich real-world task, I would consider a system with those capabilities to be adequately described as a general intelligence, though it would likely not be very dangerous due to its lack of world-optimization capabilities.

This belief of mine is based on a few heuristics. Below I have collected a few claims which I consider to be relatively conservative, and which collectively combine to weakly imply my thesis. Since this is a short-form post I will not provide very specific lines of evidence. Still, I think that each of my claims could be substantially expanded upon and/or steelmanned by adding detail from historical trends and evidence from current ML research.

Claim 1: Current techniques, given enough compute, are sufficient to perform par-human at natural language processing tasks.
This is in some sense trivially true, since sufficiently complicated RNNs are Turing complete. In a more practical sense, I think there is enough evidence that current techniques are sufficient to perform rudimentary

• Summarization of text

• Auto-completion of paragraphs

• Q&A

• Natural conversation

Given more compute and more data, I don’t see why there would be a fundamental stumbling block preventing current ML models from scaling to human level on the above tasks. Therefore, I think that human-level natural language processing systems could be created today with enough funding.

Claim 2: Given historical data and assumptions about future progress, it is quite likely that the cost of training ML systems will continue to go down in the next decades by a significant amount (more specifically: an order of magnitude).

I don’t have much more to add to this other than the fact that I have personally followed hardware trends on websites like videocardbenchmark.net, and my guess is that creating neural-network-specific hardware will continue this trend in ML.

Claim 3: Creating a system with human-level capabilities in natural language processing will require a modest amount of funding, relative to the amount of money large corporations and governments have at their disposal. To be more specific, I estimate that it would cost less than five billion dollars in hardware costs in 2019 inflation-adjusted dollars, and perhaps even less than one billion dollars. Here’s a rough sketch of an argument for this proposition:

• The cost of replicating GPT-2 was $50k. This is likely to be a large overestimate, given that the post noted that intrinsic costs are much lower.

• Given claim 2, this cost can be predicted to go down to about $5k within 20 years.

• While the cost for ML systems does not scale linearly in the number of parameters, the parallelizability of architectures like the Transformer allows for near-linear scaling. This is my impression from reading posts like this one.

• Given the above three statements, running a Transformer with the same number of parameters as the high estimate for the number of synapses in a human brain would naively cost about one billion dollars.
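The naive arithmetic behind the last bullet can be made explicit. This is just a sketch using the thread’s own figures: the $50k replication estimate, claim 2’s order-of-magnitude cost decline, GPT-2’s 1.5 billion parameters, and 300 trillion as a high-end synapse count (the figure the thread uses later); none of these inputs are authoritative.

```python
# Back-of-envelope version of the cost argument above.
gpt2_params = 1.5e9       # parameters in GPT-2
gpt2_cost_2019 = 50_000   # dollars to replicate GPT-2 (likely an overestimate)
cost_drop = 10            # claim 2: ~one order of magnitude over 20 years
brain_synapses = 3e14     # a high estimate for synapses in a human brain

future_gpt2_cost = gpt2_cost_2019 / cost_drop    # about $5k
scale_factor = brain_synapses / gpt2_params      # about 200,000x more parameters
naive_cost = future_gpt2_cost * scale_factor     # assumes near-linear scaling

print(f"${naive_cost:,.0f}")  # ≈ $1,000,000,000
```

The estimate is only as good as the near-linear-scaling assumption in the previous bullet; sublinear parallelization would push the figure up.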

Claim 4: There is sufficient economic incentive such that producing a human-level system in the domain of natural language is worth a multi-billion dollar investment. To me this seems quite plausible, given just how many jobs require writing papers, memos, or summarizing text. Compare this to a space-race type scenario in which there is enough public hype surrounding AI that governments are throwing around one hundred fifty billion dollars, which is what they did for the ISS. And relative to space, AI at least has very direct real-world benefits!

I understand there’s a lot to justify in these claims, and I haven’t done much work to justify them. But I’m not presently interested in justifying these claims to a bunch of judges intent on finding flaws. My main concern is that they all seem likely to me, and there’s also a lot of current work from companies competing to be first on the natural language benchmarks. It just adds up to me.

Am I missing something? If not, then this argument at least pushes back on claims that there is a negligible chance of general intelligence emerging within the next few decades.

• I expect that human-level language processing is enough to construct human-level programming and mathematical research ability. That is, it could complete a research diary the way a human would, by matching against patterns it has previously seen, just as human mathematicians do. That should be capability enough to go as foom as possible.

• If AI is limited by hardware rather than insight, I find it unlikely that a 300 trillion parameter Transformer trained to reproduce math/CS papers would be able to “go foom.” In other words, while I agree that the system I have described would likely be able to do human-level programming (though it would still make mistakes, just like human programmers!), I doubt that this would necessarily cause it to enter a quick transition to superintelligence of any sort.

I suspect the system that I have described above would be well suited for automating some types of jobs, but would not necessarily alter the structure of the economy by a radical degree.

• It wouldn’t nec­es­sar­ily cause such a quick tran­si­tion, but it could eas­ily be made to. A hu­man with ac­cess to this tool could iter­ate de­signs very quickly, and he could take him­self out of the loop by let­ting the tool pre­dict and ex­e­cute his ac­tions as well, or by piping its code ideas di­rectly into a com­piler, or some other way the tool thinks up.

• My skepticism is mainly about whether this would be quicker than normal human iteration, or whether it would substantially improve upon the strategy of simply buying more hardware. As we see in the recent case of e.g. RoBERTa, there are a few insights which substantially improve upon a single AI system. I just remain skeptical that a single human-level AI system would produce these insights faster than a regular human team of experts.

In other words, my opinion of recursive self-improvement in this narrow case is that it isn’t a fundamentally different strategy from human oversight and iteration. It can be used to automate some parts of the process, but I don’t think that foom is necessarily implied in any strong sense.

• The default argument that such a development would lead to a foom is that an insight-based regular doubling of speed mathematically reaches a singularity in finite time when the speed increases pay insight dividends. You can’t reach that singularity with a fleshbag in the loop (though it may be unlikely to matter if, with him in the loop, you merely double every day).
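The “finite time” claim can be made precise with a toy model (my own sketch, not from the thread; the constant k and the quadratic form are illustrative assumptions). If insight accumulates at a fixed rate (e.g. set by the human in the loop), speed grows exponentially and merely doubles on a regular schedule. But if speed increases themselves pay insight dividends, the growth rate rises faster than linearly, and the solution blows up in finite time:

```latex
% Human-limited insight: growth rate proportional to current speed s
\frac{ds}{dt} = k s
  \quad\Longrightarrow\quad
  s(t) = s_0 e^{k t}
  \quad \text{(regular doubling, no singularity)}

% Speed increases pay insight dividends: quadratic growth rate
\frac{ds}{dt} = k s^2
  \quad\Longrightarrow\quad
  s(t) = \frac{s_0}{1 - k s_0 t}
  \quad \text{(diverges at } t^{\ast} = 1/(k s_0)\text{)}
```

Differentiating the second solution confirms it: $\frac{d}{dt}\frac{s_0}{1 - k s_0 t} = \frac{k s_0^2}{(1 - k s_0 t)^2} = k s(t)^2$.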

For certain shapes of how speed increases depend on insight and oversight, there may be a perverse incentive to cut yourself out of your loop before the other guy cuts himself out.

• I generally agree with the heuristic that we should “live on the mainline”, meaning that we should mostly plan for events which capture the dominant share of our probability. This heuristic causes me to have a tendency to do some of the following things:

• Work on projects that I think have a medium-to-high chance of succeeding, and quickly abandon things that seem like they are failing.

• Plan my career trajectory based on where I think I can plausibly maximize my long-term values.

• Study subjects only if I think that I will need to understand them at some point in order to grasp an important concept. See more details here.

• Avoid doing work that leverages small probabilities of exceptionally bad outcomes. For example, I don’t focus my studying on worst-case AI safety risk (although I do think that analyzing worst-case failure modes is useful from the standpoint of a security mindset).

I see a few problems with this heuristic, however, and I’m not sure quite how to resolve them. More specifically, I tend to float freely between different projects because I am quick to abandon things if I feel like they aren’t working out (compare this to the mindset that some game developers have when they realize their latest game idea isn’t very good).

One case where this shows up is when I change my beliefs about the most effective ways to spend my time as far as long-term future scenarios are concerned. I will sometimes read an argument that some line of inquiry is promising, and for an entire day believe that it would be a good thing to work on, only for the next day to bring another argument.

And things like my AI timeline predictions vary erratically, much more than I expect most people’s do: I sometimes wake up and think that AI might be just 10 years away, and on other days I wake up and wonder if most of this stuff is more like a century away.

This general behavior makes me into someone who doesn’t stay consistent in what I try to do. My life therefore resembles a battle between two competing heuristics: on one side there’s the heuristic of planning for the mainline, and on the other there’s the heuristic of committing to things even when they aren’t panning out. I am unsure of the best way to resolve this conflict.

• Some random thoughts:

• Startups and pivots. Startups require lots of commitment even when things feel like they’re collapsing – only by persevering through those times can you possibly make it. Still, startups are willing to pivot – taking their existing infrastructure but changing key strategic approaches.

• Escalating commitment. Early on (in most domains), you should pick shorter-term projects, because the focus is on learning. Code a website in a week. Code another website in 2 months. Don’t stress too much about multi-year plans until you’re reasonably confident you sorta know what you’re doing. (Relatedly, relationships: early on it makes sense to date a lot to get some sense of who/what you’re looking for in a romantic partner. But eventually, a lot of the good stuff comes when you actually commit to long-term relationships that are capable of weathering periods of strife and doubt.)

• Alternately: Givewell (or maybe OpenPhil?) did mixtures of shallow dives, deep dives and medium dives into cause areas, because they learned different sorts of things from each kind of research.

• Commitment mindset. Sort of how Nate Soares recommends separating the feeling of conviction from the epistemic belief of high success... you can separate “I’m going to stick with this project for a year or two because it’s likely to work” from “I’m going to stick to this project for a year or two because sticking to projects for a year or two is how you learn how projects work on the 1-2 year timescale, including the part where you shift gears and learn from mistakes and become more robust about them.”

• Mathematically, it seems like you should just give your heuristic the better data you already consciously have: if your untrustworthy senses say you aren’t on the mainline, the correct move isn’t necessarily to believe them, but rather to decide to put effort into figuring it out, because it’s important.

It’s clear how your heuristic would evolve. To embrace it correctly, you should make sure that your entire life lives on the mainline. If there’s a game with negative expected value, where the worst outcome has a 10% chance, and you play it 20 times, that’s stupid. Budget the probability you are willing to throw away for the rest of your life now.
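The 20-plays example can be checked directly (a quick calculation of my own; the 10% figure and 20 plays are the comment’s):

```python
# Probability of hitting the worst outcome at least once when a game
# with a 10% worst-case chance is played 20 times.
p_bad = 0.10
plays = 20
p_at_least_one_bad = 1 - (1 - p_bad) ** plays
print(round(p_at_least_one_bad, 3))  # ≈ 0.878
```

So a risk that looks tolerable for a single play becomes the mainline outcome once it is repeated, which is the point of budgeting the probability up front.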

If you don’t think you can stay to your bud­get, if you know that always, you will to­mor­row play an­other round of that game by the same rea­son­ing as to­day, then re­al­ize that to­day’s rea­son­ing de­cides to­day and to­mor­row. Real­ize that the main­line of giv­ing in to the heuris­tic is los­ing even­tu­ally, and let the heuris­tic de­stroy it­self im­me­di­ately.

• I see a few problems with this heuristic, however, and I’m not sure quite how to resolve them. More specifically, I tend to float freely between different projects because I am quick to abandon things if I feel like they aren’t working out (compare this to the mindset that some game developers have when they realize their latest game idea isn’t very good).

There are two big issues with the “living on the mainline” strategy:

1. Most of the highest-EV activities are those that have a low chance of success but big rewards. I suspect much of your volatile behavior is bouncing between chasing opportunities you see as high value, then realizing you’re not on the mainline and correcting, then realizing there are higher-EV opportunities and correcting again.

2. Strategies that work well on the mainline often fail spectacularly in the face of black swans. So they have a high probability of working but also very negative EV in unlikely situations (which you ignore if you’re only thinking about the mainline).

Two alternatives to the “living on the mainline” heuristic:

1. The Anti-fragility heuristic:

• Use the barbell strategy: split your activities between surefire wins with low upsides and certainty, and risky moonshots with low downsides but lots of uncertainty around upsides.

• Notice the reasons that things fail, and make them robust to that class of failure in the future.

• Try lots of things, and stick with the ones that work over time.

2. The Effectuation heuristic:

• Go into areas where you have unfair advantages.

• Spread your downside risk to people or organizations who can handle it.

• In general, work to CREATE the mainline where you have an unfair advantage and high upside.

You might get some mileage out of reading the effectuation and anti-fragility sections of this post.

• Related to: Realism about rationality

I have talked to some people who say that they value ethical reflection, and would prefer that humanity reflect for a very long time before colonizing the stars. In a sense I agree, but at the same time I can’t help but think that “reflection” is a vacuous feel-good word that has no shared common meaning.

Some forms of reflection are clearly good. Epistemic reflection is good if you are a consequentialist, since it can help you get what you want. I also agree that narrow forms of reflection can be good. One example of a narrow form of reflection is philosophical reflection where we compare the details of two possible outcomes and then decide which one is better.

However, there are much broader forms of reflection which I’m more hesitant to endorse. Namely, the vague types of reflection, such as reflecting on whether we really value happiness, or whether we should really truly be worried about animal suffering.

I can perhaps sympathize with the intuition that we should really try to make sure that what we put into an AI is what we really want, rather than just what we superficially want. But fundamentally, I am skeptical that there is any canonical way of doing this type of reflection that leads to non-arbitrariness.

I have heard something along the lines of “I would want a reflective procedure that extrapolates my values, as long as the procedure wasn’t deceiving me or pursuing some ulterior motive”, but I just don’t see how this type of reflection corresponds to any natural class. At some point, we will just have to put some arbitrariness into the value system, and there won’t be any “right answer” about how the extrapolation is done.

• The vague reflections you are referring to are analogous to somebody saying “I should really exercise more” without ever doing it. I agree that the mere promise of reflection is useless.

But I do think that reflection about the vague topics is important and possible. Actively working through one’s experiences, reading relevant books, and discussing questions with intelligent people can lead to epiphanies (and eventually life choices) that wouldn’t have occurred otherwise.

However, this is not done at the push of a button, and these things don’t happen randomly: they will only emerge if you are prepared to invest a lot of time and energy.

All of this happens on a personal level. To use your example, somebody may conclude from his own life experience that living a life of purpose is more important to him than living a life of happiness. How to formalize this process so that an AI could use a canonical way to achieve it (and infer somebody’s real values simply by observing) is beyond me. It would have to know a lot more about us than is comfortable for most of us.

• After writing the post on using transparency regularization to help make neural networks more interpretable, I have become even more optimistic that this is a potentially promising line of research for alignment. This is because I have noticed that there are a few properties of transparency regularization which may allow it to avoid some pitfalls of bad alignment proposals.

To be more specific, in order for a line of research to be useful for alignment, it helps if:

• The line of research doesn’t require unnecessarily large amounts of computation to perform. This would allow the technique to stay competitive, reducing the incentive to skip safety protocols.

• It doesn’t require human models to work. This is useful because:

• Human models are black boxes and are themselves mesa-optimizers.

• Otherwise, we would be limited primarily to theoretical work in the present, since human cognition is expensive to obtain.

• Each part of the line of research is recursively legible. That is, if we use the technique on our ML model, we should expect that the technique itself can be explained without appealing to some other black box.

Transparency regularization meets these three criteria respectively, because:

• It doesn’t need to be as­tro­nom­i­cally more ex­pen­sive than more typ­i­cal forms of regularization

• It doesn’t nec­es­sar­ily re­quire hu­man-level cog­ni­tive parts to get work­ing.

• It is po­ten­tially quite sim­ple math­e­mat­i­cally, and so definitely meets the re­cur­sively leg­ible crite­rion.
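As an illustration of the “simple mathematically” point, here is a minimal sketch of the shape a transparency regularizer can take. The original post’s actual regularizer is not specified here, so an L1-style sparsity penalty is used purely as a stand-in; the model, data, and lambda value are all illustrative assumptions.

```python
# Sketch: total loss = task loss + lambda * interpretability penalty.

def task_loss(weights, data):
    # Placeholder task loss: mean squared error of a linear model.
    return sum((y - sum(w * x for w, x in zip(weights, xs))) ** 2
               for xs, y in data) / len(data)

def transparency_penalty(weights):
    # L1 penalty: pushes weights toward zero/sparsity, which tends to
    # make the learned model easier to inspect.
    return sum(abs(w) for w in weights)

def regularized_loss(weights, data, lam=0.01):
    # The penalty adds only O(number of parameters) work per step, so it
    # is not astronomically more expensive than ordinary training.
    return task_loss(weights, data) + lam * transparency_penalty(weights)

data = [([1.0, 2.0], 5.0), ([2.0, 0.0], 2.0)]
print(regularized_loss([1.0, 2.0], data))  # task loss 0 here, so ≈ 0.03
```

Because the extra term is a plain, human-readable function of the parameters, it also illustrates the “recursively legible” criterion: nothing in the regularizer itself is a black box.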

• In discussions about consciousness I find myself repeating the same basic argument against the existence of qualia constantly. I don’t do this just to be annoying; it is just my experience that:

1. People find consciousness really hard to think about, and the topic has been known to cause a lot of disagreements.

2. Personally, I think that this particular argument dissolved perhaps 50% of all my confusion about the topic, and it was one of the simplest, clearest arguments that I’ve ever seen.

I am not being original either. The argument is the same one that has been used in various forms across the Illusionist/Eliminativist literature that I can find on the internet. Eliezer Yudkowsky used a version of it many years ago. Even David Chalmers, who is quite the formidable consciousness realist, admits in The Meta-Problem of Consciousness that the argument is the best one he can find against his position.

The argument is simply this:

If we are able to explain why you believe in, and talk about, qualia without referring to qualia whatsoever in our explanation, then we should reject the existence of qualia as a hypothesis.

This is the standard debunking argument. It has a more general form which can be used to deny the existence of a lot of other non-reductive things: distinct personal identities, gods, spirits, libertarian free will, a mind-independent morality, etc. In some sense it’s just an extended version of Occam’s razor, showing us that qualia don’t do anything in our physical theories, and thus can be rejected as things that actually exist out there in any sense.

To me this argument is very clear, and yet I find myself arguing it a lot. I am not sure how else to get people to see my side of it other than sending them a bunch of articles which more-or-less make the exact same argument but from different perspectives.

I think the human brain is built to have a blind spot on a lot of things, and consciousness is perhaps one of them. I think quite a bit about how, if humanity is not able to think clearly about this thing which we have spent many research years on, then there might be some other low-hanging philosophical fruits still remaining.

Addendum: I am not saying I have consciousness figured out. However, I think it’s analogous to how atheists haven’t “got religion figured out”, yet they have at the very least taken their first steps by actually rejecting religion. It’s not a full theory of religious belief, or even a theory at all. It’s just the first thing you do if you want to understand the subject. I roughly agree with Keith Frankish’s take on the matter.

• If we are able to explain why you believe in, and talk about, qualia without referring to qualia whatsoever in our explanation, then we should reject the existence of qualia as a hypothesis.

And I assume your claim is that we can explain why I believe in qualia without referring to qualia?

Edit: to be clear, I don’t really care much why other people talk about qualia. I care why I perceive myself to experience things. If it’s an illusion, cool, but then why do I experience the illusion?

• If belief is construed as some sort of representation which stands for external reality (as in the case of some correspondence theories of truth), then we can take the claim to be a strong prediction of contemporary neuroscience. Ditto for whether we can explain why we talk about qualia.

It’s not that I could explain exactly why you in particular talk about qualia. It’s that we have an established paradigm for explaining it.

It’s similar in the respect that we have an established paradigm for explaining why people report being able to see color. We can model the eye, and the visual cortex, and we have some idea of what neurons do, even though we lack the specific information about how the whole thing fits together. And we could imagine that in the limit of perfect neuroscience, we could synthesize this information to trace back the reason why you said a particular thing.

Since we do not have perfect neuroscience, the best analogy would be analyzing the ‘beliefs’ and predictions of an artificial neural network. If you asked me, “Why does this ANN predict that this image is a 5 with 98% probability?”, it would be difficult to say exactly why, even with full access to the neural network parameters.

However, we know that unless our conception of neural networks is completely incorrect, in principle we could trace exactly why the neural network made that judgement, including the exact steps that caused the neural network to have the parameters that it has in the first place. And we know that such an explanation requires only the components which make up the ANN, and not any conscious or phenomenal properties.

• I can’t tell whether we’re ar­gu­ing about the same thing.

Like, I as­sume that I am a neu­ral net pre­dict­ing things and de­cid­ing things and if you had full ac­cess to my brain you could (in prin­ci­ple, given suffi­cient time) un­der­stand ev­ery­thing that was go­ing on in there. But, like, one way or an­other I ex­pe­rience the per­cep­tion of per­ceiv­ing things.

(I’d pre­fer to taboo ‘Qualia’ in case it has par­tic­u­lar con­no­ta­tions I don’t share. Just ‘that thing where Ray per­ceives him­self per­ceiv­ing things, and per­haps the part where some­times Ray has prefer­ences about those per­cep­tions of per­ceiv­ing be­cause the per­cep­tions have valence.’ If that’s what Qualia means, cool, and if it means some other thing I’m not sure I care)

My current working model of “how this aspect of my perception works” is described in this comment, which is easy enough to quote in full:

“Human brains contain two forms of knowledge: explicit knowledge, and weights that are used in implicit knowledge (admittedly the former is hacked on top of the latter, but that isn’t relevant here). Mary doesn’t gain any extra explicit knowledge from seeing blue, but her brain changes some of her implicit weights so that when a blue object activates in her vision, a sub-neural network can connect this to the label “blue”.”

The reason I care about any of this is that I believe that “perceptions-having-valence” is probably morally relevant. (Or, put in usual terms: suffering and pleasure seem morally relevant.)

(I think it’s quite possible that future-me will decide I was confused about this part, but it’s the part I care about anyhow.)

Are you saying that my perceiving-that-I-perceive-things-with-valence is an illusion, and that I am in fact not doing that? Or some other thing?

(To be clear, I AM open to ‘actually Ray, yes, the counterintuitive answer is that no, you’re not actually perceiving-that-you-perceive-things-and-some-of-the-perceptions-have-valence.’ The topic is clearly confusing, and behind the veil of epistemic ignorance it seems quite plausible I’m the confused one here. Just noting that so far, from the way you’re phrasing things, I can’t tell whether your claims map onto the things I care about.)

• Like, I assume that I am a neural net predicting things and deciding things, and if you had full access to my brain you could (in principle, given sufficient time) understand everything that was going on in there. But, like, one way or another I experience the perception of perceiving things.

To me this is a bit like the claim of someone who believed they had psychic powers but still wanted to believe in physics, and who said, “I assume you could perfectly well understand what was going on at a behavioral level within my brain, but there is still a datum left unexplained: the datum of me having psychic powers.”

There are a number of ways to respond to the claim:

• We could redefine psychic powers to include mere physical properties. This has the problem that psychics insist that psychic power is entirely separate from physical properties. Simple re-definition doesn’t make the intuition go away and doesn’t explain anything.

• We could alternatively posit new physics which incorporates psychic powers. This has the problem that it violates Occam’s razor, since the old physics was completely adequate. Hence the debunking argument I presented above.

• Or, we could incorporate the phenomenon within a physical model by first denying that it exists, and then explaining the mechanism which caused you to believe in it and talk about it.

In the case of consciousness, the third response amounts to Illusionism, which is the view that I am defending. It has the advantage that it conservatively promises not to contradict known physics, and it also does justice to the intuition that consciousness really exists.

I’d prefer to taboo ‘qualia’ in case it has particular connotations I don’t share. Just ‘that thing where Ray perceives himself perceiving things, and perhaps the part where sometimes Ray has preferences about those perceptions of perceiving because the perceptions have valence.’

To most philosophers who write about it, qualia are defined as the experience of what-it’s-like. Roughly speaking, I agree with thinking of them as a particular form of perception that we experience.

However, it’s not just any perception, since some perceptions can be unconscious perceptions. Qualia specifically refer to the qualitative aspects of our experience of the world: the taste of wine, the touch of fabric, the feeling of seeing blue, the suffering associated with physical pain, etc. These are said to be directly apprehensible to our ‘internal movie’ that is playing inside our head. It is this type of property which I am applying the framework of illusionism to.

The reason I care about any of this is that I believe that “perceptions-having-valence” is probably morally relevant.

I agree. That’s why I typically take the view that consciousness is a powerful illusion, and that we should take it seriously. Those who simply re-define consciousness as essentially a synonym for “perception” or “observation” or “information” are not doing justice to the fact that it’s the thing I care about in this world. I have a strong intuition that consciousness is what is valuable, even despite the fact that I hold an illusionist view. To put it another way, I would care much less if you told me a computer was receiving a pain signal (labeled in the code as some variable with suffering set to maximum), compared to the claim that a computer was actually suffering in the same way a human does.

Are you saying that my perceiving-that-I-perceive-things-with-valence is an illusion, and that I am in fact not doing that? Or some other thing?

Roughly speaking, yes. I am denying that that type of thing actually exists, including the valence claim.

• Or, we could incorporate the phenomenon within a physical model by first denying that it exists, and then explaining the mechanism which caused you to believe in it and talk about it.

It still feels very important that you haven’t actually explained this.

In the case of psychic powers, we (I think?) actually have pretty good explanations for where perceptions of psychic powers come from, which makes the perception of psychic powers non-mysterious. (I.e., we know how cold reading works, and how various kinds of confirmation bias play into divination.) But that was something that actually had to be explained.

It feels like you’re just changing the name of the confusing thing from ‘the fact that I seem conscious to myself’ to ‘the fact that I’m experiencing an illusion of consciousness.’ Cool, but, like, there’s still a mysterious thing that seems quite important to actually explain.

• Also just in gen­eral, I dis­agree that skep­ti­cism is not progress. If I said, “I don’t be­lieve in God be­cause there’s noth­ing in the uni­verse with those prop­er­ties...” I don’t think it’s fair to say, “Cool, but like, I’m still pray­ing to some­thing right, and that needs to be ex­plained” be­cause I don’t think that speaks fully to what I just de­nied.

In the case of re­li­gion, many peo­ple have a very strong in­tu­ition that God ex­ists. So, is the athe­ist po­si­tion not progress be­cause we have not ex­plained this in­tu­ition?

• I agree that skep­ti­cism gen­er­ally can be im­por­tant progress (I re­cently stum­bled upon this old com­ment mak­ing a similar ar­gu­ment about how say­ing “not X” can be use­ful)

The differ­ence be­tween God and con­scious­ness is that the in­ter­est­ing bit about con­scious­ness *is* my per­cep­tion of it, full stop. Un­like God or psy­chic pow­ers, there is no sep­a­rate thing from my per­cep­tion of it that I’m in­ter­ested in.

• The differ­ence be­tween God and con­scious­ness is that the in­ter­est­ing bit about con­scious­ness *is* my per­cep­tion of it, full stop.

If by per­cep­tion you sim­ply mean “You are an in­for­ma­tion pro­cess­ing de­vice that takes sig­nals in and out­puts things” then this is en­tirely ex­pli­ca­ble on our cur­rent phys­i­cal mod­els, and I could dis­solve the con­fu­sion fairly eas­ily.

However, I think you have something else in mind, which is that there is somehow something left out when I explain it by simply appealing to signal processing. In that sense, I think you are falling right into the trap! You would be doing something similar to the person who said, “But I am still praying to God!”

• How­ever, I think you have some­thing else in mind which is that there is some­how some­thing left out when I ex­plain it by sim­ply ap­peal­ing to sig­nal pro­cess­ing. In that sense,

I don’t have anything else in mind that I know of. “Explained via signal processing” seems basically sufficient. The interesting part is “how can you look at a given signal-processing system, and predict in advance whether that system is the sort of thing that would talk* about qualia, if it could talk?”

(I feel like this was all cov­ered in the se­quences, ba­si­cally?)

*where “talk about qualia” is shorthand for “would consider the concept of qualia important enough to have a concept for.”

• I mean, I agree that this was mostly cov­ered in the se­quences. But I also think that I dis­agree with the way that most peo­ple frame the de­bate. At least per­son­ally I have seen peo­ple who I know have read the se­quences still make ba­sic er­rors. So I’m just leav­ing this here to ex­plain my point of view.

In­tu­ition: On a first ap­prox­i­ma­tion, there is some­thing that it is like to be us. In other words, we are be­ings who have qualia.

Counterintuition: In order for qualia to exist, there would need to exist entities which are private, ineffable, intrinsic, and subjective. This can’t be, since physics is public, effable, and objective, and therefore contradicts the existence of qualia.

In­tu­ition: But even if I agree with you that qualia don’t ex­ist, there still seems to be some­thing left un­ex­plained.

Coun­ter­in­tu­ition: We can ex­plain why you think there’s some­thing un­ex­plained be­cause we can ex­plain the cause of your be­lief in qualia, and why you think they have these prop­er­ties. By ex­plain­ing why you be­lieve it we have ex­plained all there is to ex­plain.

Intuition: But you have merely said that we could explain it. You have not actually explained it.

Coun­ter­in­tu­ition: Even with­out the pre­cise ex­pla­na­tion, we now have a paradigm for ex­plain­ing con­scious­ness, so it is not mys­te­ri­ous any­more.

This is es­sen­tially the point where I leave.

• physics is pub­lic, ef­fable, and ob­jec­tive and there­fore con­tra­dicts the ex­is­tence of qualia.

Physics as map is. Note that we can’t com­pare the map di­rectly to the ter­ri­tory.

• We do not telepathically receive experiment results when they are performed. In reality you need to take in the measurement results from your first-person point of view (use your eyes to read an LED screen, or use your ears to hear stories of experiments performed). It seems to me that experiments are intersubjective, in that other observers will report having experiences that resemble my first-hand experiences. For most purposes, shorthanding this to “public” is adequate enough. But your point of view is “unpublishable”, in that even if you really tried, there is no way to provide your private experience to the public knowledge pool (“directly”). “I know how you feel” is a fiction; it doesn’t actually happen.

Skepticism about the experiences of others is easier, but being skeptical about your own experiences would seem to be ludicrous.

• I am not deny­ing that hu­mans take in sen­sory in­put and pro­cess it us­ing their in­ter­nal neu­ral net­works. I am deny­ing that pro­cess has any of the prop­er­ties as­so­ci­ated with con­scious­ness in the philo­soph­i­cal sense. And I am mak­ing an ad­di­tional claim which is that if you merely re­define con­scious­ness so that it lacks these philo­soph­i­cal prop­er­ties, you have not ac­tu­ally ex­plained any­thing or dis­solved any con­fu­sion.

The illu­sion­ist ap­proach is the best ap­proach be­cause it si­mul­ta­neously takes con­scious­ness se­ri­ously and doesn’t con­tra­dict physics. By tak­ing this ap­proach we also have an un­der­stood paradigm for solv­ing the hard prob­lem of con­scious­ness: namely, the hard prob­lem is re­duced to the meta-prob­lem (see Chalmers).

• It feels like you’re just chang­ing the name of the con­fus­ing thing from ‘the fact that I seem con­scious to my­self’ to ‘the fact that I’m ex­pe­rienc­ing an illu­sion of con­scious­ness.’ Cool, but, like, there’s still a mys­te­ri­ous thing that seems quite im­por­tant to ac­tu­ally ex­plain.

I don’t ac­tu­ally agree. Although I have not fully ex­plained con­scious­ness, I think that I have shown a lot.

In par­tic­u­lar, I have shown us what the solu­tion to the hard prob­lem of con­scious­ness would plau­si­bly look like if we had un­limited fund­ing and time. And to me, that’s im­por­tant.

And un­der my view, it’s not go­ing to look any­thing like, “Hey we dis­cov­ered this mechanism in the brain that gives rise to con­scious­ness.” No, it’s go­ing to look more like, “Look at this mechanism in the brain that makes hu­mans talk about things even though the things they are talk­ing about have no real world refer­ent.”

You might think that this is a use­less achieve­ment. I claim the con­trary. As Chalmers points out, pretty much all the lead­ing the­o­ries of con­scious­ness fail the ba­sic test of look­ing like an ex­pla­na­tion rather than just sound­ing con­fused. Don’t be­lieve me? Read Sec­tion 3 in this pa­per.

In short, Chalmers re­views the cur­rent state of the art in con­scious­ness ex­pla­na­tions. He first goes into In­te­grated In­for­ma­tion The­ory (IIT), but then con­vinc­ingly shows that IIT fails to ex­plain why we would talk about con­scious­ness and be­lieve in con­scious­ness. He does the same for global workspace the­o­ries, first or­der rep­re­sen­ta­tional the­o­ries, higher or­der the­o­ries, con­scious­ness-causes-col­lapse the­o­ries, and panpsy­chism. Sim­ply put, none of them even ap­proach an ad­e­quate baseline of look­ing like an ex­pla­na­tion.

I also be­lieve that if you fol­low my view care­fully you might stop be­ing con­fused about a lot of things. Like, do an­i­mals feel pain? Well it de­pends on your defi­ni­tion of pain—con­scious­ness is not real in any ob­jec­tive sense so this is a defi­ni­tion dis­pute. Same with ask­ing whether per­son A is hap­pier than per­son B, or ask­ing whether com­put­ers will ever be con­scious.

Perhaps this isn’t an achievement, strictly speaking, relative to the standard LessWrong point of view. But that’s only because I think the standard LessWrong point of view is correct. Yet even so, I still see people around me making fundamentally basic mistakes about consciousness. For instance, I see people treating consciousness as intrinsic, ineffable, private, or they think there’s an objectively right answer to whether animals feel pain and argue over this as if it’s not the same as a tree falling in a forest.

• And we know that such an ex­pla­na­tion re­quires only the com­po­nents which make up the ANN, and not any con­scious or phe­nom­e­nal prop­er­ties.

That’s an argument against dualism, not an argument against qualia. If mind-brain identity is true, neural activity is causing reports, and qualia, along with the rest of consciousness, are identical to neural activity, so qualia are also causing reports.

• If you iden­tify qualia as be­hav­ioral parts of our phys­i­cal mod­els, then are you also will­ing to dis­card the prop­er­ties philoso­phers have as­so­ci­ated with qualia, such as

• Inef­fable, as they can’t be ex­plained us­ing just words or math­e­mat­i­cal sentences

• Pri­vate, as they are in­ac­cessible to out­side third-per­son observers

• In­trin­sic, as they are fun­da­men­tal to the way we ex­pe­rience the world

If you are willing to discard these properties, then I suggest we stop using the word “qualia”, since you have simply taken all the meaning away once you have identified them with things that actually exist. This is what I mean when I say that I am denying qualia.

It is analo­gous to some­one who de­nies that souls ex­ist by first con­ced­ing that we could iden­tify cer­tain phys­i­cal con­figu­ra­tions as ex­am­ples of souls, but then ex­plain­ing that this would be con­fus­ing to any­one who talks about souls in the tra­di­tional sense. Far bet­ter in my view to dis­card the idea al­to­gether.

• My ori­en­ta­tion to this con­ver­sa­tion seems more like “hmm, I’m learn­ing that it is pos­si­ble the word qualia has a bunch of con­no­ta­tions that I didn’t know it had”, as op­posed to “hmm, I was wrong to be­lieve in the-thing-I-was-call­ing-qualia.”

But I’m not yet sure that these con­no­ta­tions are ac­tu­ally uni­ver­sal – the wikipe­dia ar­ti­cle opens with:

In philosophy and certain models of psychology, qualia (/ˈkwɑːliə/ or /ˈkweɪliə/; singular form: quale) are defined as individual instances of subjective, conscious experience. The term qualia derives from the Latin neuter plural form (qualia) of the Latin adjective quālis (Latin pronunciation: [ˈkʷaːlɪs]) meaning “of what sort” or “of what kind” in a specific instance, like “what it is like to taste a specific apple, this particular apple now”.
Ex­am­ples of qualia in­clude the per­ceived sen­sa­tion of pain of a headache, the taste of wine, as well as the red­ness of an evening sky. As qual­i­ta­tive char­ac­ters of sen­sa­tion, qualia stand in con­trast to “propo­si­tional at­ti­tudes”,[1] where the fo­cus is on be­liefs about ex­pe­rience rather than what it is di­rectly like to be ex­pe­rienc­ing.
Philoso­pher and cog­ni­tive sci­en­tist Daniel Den­nett once sug­gested that qualia was “an un­fa­mil­iar term for some­thing that could not be more fa­mil­iar to each of us: the ways things seem to us”.[2]
Much of the de­bate over their im­por­tance hinges on the defi­ni­tion of the term, and var­i­ous philoso­phers em­pha­size or deny the ex­is­tence of cer­tain fea­tures of qualia. Con­se­quently, the na­ture and ex­is­tence of var­i­ous defi­ni­tions of qualia re­main con­tro­ver­sial be­cause they are not ver­ifi­able.

Later on, it notes the three char­ac­ter­is­tics (in­ef­fable/​pri­vate/​in­trin­sic) that Den­nett listed.

But this looks more like an ac­ci­dent of his­tory than some­thing in­trin­sic to the term. The open­ing para­graphs defined qualia the way I naively ex­pected it to be defined.

My impression, looking at the various definitions and discussion, is not that qualia were defined in this specific fashion, so much as that various people trying to grapple with a confusing problem generated various possible definitions and rules for it, and some of those turned out to be false once we came up with a better understanding.

I can see where you’re coming from with the soul analogy, but I’m not sure if it’s more like the soul analogy, or more like “One early philosopher defined ‘a human’ as a featherless biped, and then a later one said ‘dude, look at this featherless chicken I just made’ and they realized the definition was silly.”

I guess my ques­tion here is – do you have a sug­ges­tion for a re­place­ment word for “the par­tic­u­lar kind of ob­ser­va­tion that gets made by an en­tity that ac­tu­ally gets to ex­pe­rience the per­cep­tion”? This still seems im­por­tantly differ­ent from “just a per­cep­tion”, since very sim­ple robots and ther­mostats or what­ever can be said to have those. I don’t re­ally care whether they are in­her­ently pri­vate, in­ef­fable or in­trin­sic, and whether Daniel Den­nett was able to eff them seems more like a his­tor­i­cal cu­ri­os­ity to me.

The Wikipedia article specifically says that people argue a lot over the definitions:

There are many defi­ni­tions of qualia, which have changed over time. One of the sim­pler, broader defi­ni­tions is: “The ‘what it is like’ char­ac­ter of men­tal states. The way it feels to have men­tal states such as pain, see­ing red, smelling a rose, etc.”

That definition there is the one I’m generally using, and the one which seems important to have a word for. This seems more like a political/coordination question of “is it easier to invent a new word and gain traction for it, or to get everyone on the same page about ‘actually, they’re totally in principle effable, you just might need to be a kind of mind different from a current-generation human to properly eff them.’”

• It does seem to me some­thing like “I ex­pect the sort of mind that is ca­pa­ble of view­ing qualia of other peo­ple would be suffi­ciently differ­ent from a hu­man mind that it may still be fair to call them ‘pri­vate/​in­ef­fable among hu­mans.’”

• Thanks for en­gag­ing with me on this thing. :)

I know I’m not being as clear as I could possibly be, and at some points I sort of feel like just throwing “Quining Qualia” or Keith Frankish’s articles or a whole bunch of other blog posts at people and saying, “Please just read this and re-read it until you have a very distinct intuition about what I am saying.” But I know that that type of debate is not helpful.

I think I have a OK-to-good un­der­stand­ing of what you are say­ing. My model of your re­ply is some­thing like this,

“Your claim is that qualia don’t ex­ist be­cause noth­ing with these three prop­er­ties ex­ists (in­ef­fa­bil­ity/​pri­vate/​in­trin­sic), but it’s not clear to me that these three prop­er­ties are uni­ver­sally iden­ti­fied with qualia. When I go to Wikipe­dia or other sources, they usu­ally iden­tify qualia with ‘what it’s like’ rather than these three very spe­cific things that Daniel Den­nett hap­pened to list once. So, I still think that I am point­ing to some­thing real when I talk about ‘what it’s like’ and you are only dis­put­ing a per­haps-straw­man ver­sion of qualia.”

Please cor­rect me if this model of you is in­ac­cu­rate.

I rec­og­nize what you are say­ing, and I agree with the place you are com­ing from. I re­ally do. And fur­ther­more, I re­ally re­ally agree with the idea that we should go fur­ther than skep­ti­cism and we should always ask more ques­tions even af­ter we have con­cluded that some­thing doesn’t ex­ist.

How­ever, the place I get off the boat is where you keep talk­ing about how this ‘what it’s like’ thing is ac­tu­ally refer­ring to some­thing co­her­ent in the real world that has a crisp, nat­u­ral bound­ary around it. That’s the dis­agree­ment.

I don’t think it’s an accident of history, either, that those properties are identified with qualia. The whole reason Daniel Dennett identified them was that he showed they were the necessary conclusion of the sort of thought experiments people use for qualia. He spends the first several paragraphs of his essay on the matter justifying them using various intuition pumps.

Point be­ing, when you are asked to clar­ify what ‘what it’s like’ means, you’ll prob­a­bly start point­ing to ex­am­ples. Like, you might say, “Well, I know what it’s like to see the color green, so that’s an ex­am­ple of a quale.” And Daniel Den­nett would then press the per­son fur­ther and go, “OK could you clar­ify what you mean when you say you ‘know what it’s like to see green’?” and the per­son would say, “No, I can’t de­scribe it us­ing words. And it’s not clear to me it’s even in the same cat­e­gory of things that can be ei­ther, since I can’t pos­si­bly con­ceive of an English sen­tence that would de­scribe the color green to a blind per­son.” And then Daniel Den­nett would shout, “Aha! So you do be­lieve in in­ef­fa­bil­ity!”

The point of those three prop­er­ties (ac­tu­ally he lists 4, I think), is not that they are in­her­ently tied to the defi­ni­tion. It’s that the defi­ni­tion is vague, and ev­ery time peo­ple are pressed to be more clear on what they mean, they start spout­ing non­sense. Den­nett did valid and good de­con­fu­sion work where he showed that peo­ple go wrong in these four places, and then showed how there’s no phys­i­cal thing that could pos­si­bly al­low those four things.

Th­ese prop­er­ties also show up all over the var­i­ous thought ex­per­i­ments that peo­ple use when talk­ing about qualia. For ex­am­ple, Nagel uses the pri­vate prop­erty in his es­say “What Is it Like to Be a Bat?” Chalmers uses the in­trin­sic prop­erty when he talks about p-zom­bies be­ing phys­i­cally iden­ti­cal to hu­mans in ev­ery re­spect ex­cept for qualia. Frank Jack­son used the in­ef­fa­bil­ity prop­erty when he talked about how Mary the neu­ro­scien­tist had some­thing miss­ing when she was in the black and white room.

All of this is im­por­tant to rec­og­nize. Be­cause if you still want to say, “But I’m still point­ing to some­thing valid and real even if you want to re­ject this other straw­man-en­tity” then I’m go­ing to treat you like the per­son who wants to be­lieve in souls even af­ter they’ve been shown that noth­ing soul-like ex­ists in this uni­verse.

• Spouting nonsense is different from being wrong. If I say that there are no rectangles with 5 angles, that can be processed pretty straightforwardly, because the concept of a rectangle is unproblematic. But if you ask why that statement was made, and the person points to a pentagon, you will find 5 angles. Now, there are polygons with 5 angles. If you give a short word for “5-angled rectangle”, it’s correct to say those don’t exist. But if you give an ostensive definition of the shape, then it does exist, and it’s more to the point to say that it’s not a rectangle rather than that it doesn’t exist.

In the details, when people say “what it is like to see green”, one could fail to get what they mean or point to. If someone says “look, a unicorn” and one has proof that unicorns don’t exist, that doesn’t mean that the unicorn reference is not referencing something, or that the reference target does not exist. If you end up in a situation where you point at a horse and say “those things do not exist. Look, no horn; doesn’t exist”, you are not being helpful. If somebody is pointing to a horse and says “look, a unicorn!” and you go “where? I see only horses”, you are also not being helpful. Being “motivatedly uncooperative in receiving ostension” is not cool. Say that you made a deal to sell a gold bar in exchange for a unicorn. Then refusing to accept any object as a unicorn would let you keep your gold bar, and you might be tempted to play dumb.

When people say “what it feels like to see green” they are trying to communicate something, and defeating their assertion by sabotaging their communication doesn’t prove anything. Communication is hard, yes, but doing too much semantic substitution means you start talking past each other.

• I am not suggesting that qualia should be identified with neural activity in a way that loses any aspects of the philosophical definition… bearing in mind that the philosophical definition does not assert that qualia are non-physical.

• What are you ex­pe­rienc­ing right now? (E.g. what do you see in front of you? In what sense does it seem to be there?)

• I won’t lie—I have a very strong in­tu­ition that there’s this vi­sual field in front of me, and that I can hear sounds that have dis­tinct qual­ities, and si­mul­ta­neously I can feel thoughts rush into my head as if there is an in­ter­nal speaker and listener. And when I re­flect on some vi­sual in the dis­tance, it seems as though the col­ors are very crisp and ex­ist in some way in­de­pen­dent of sim­ple in­for­ma­tion pro­cess­ing in a com­puter-type de­vice. It all seems very real to me.

I think the main claim of the illu­sion­ist is that these in­tu­itions (at least in­so­far as the in­tu­itions are mak­ing claims about the prop­er­ties of qualia) are just rad­i­cally in­cor­rect. It’s as if our brains have an in­ter­nal er­ror in them, not al­low­ing us to un­der­stand the true na­ture of these en­tities. It’s not that we can’t see or some­thing like that. It’s just that the qual­ity of per­ceiv­ing the world has es­sen­tially an iden­ti­cal struc­ture to what one might imag­ine a com­puter with a cam­era would “see.”

Anal­ogy: Some peo­ple who claim to have ex­pe­rienced heaven aren’t just mak­ing stuff up. In some sense, their per­cep­tion is real. It just doesn’t have the prop­er­ties we would ex­pect it to have at face value. And if we ac­tu­ally tried look­ing for heaven in the phys­i­cal world we would find it to be lit­tle else than an illu­sion.

• What’s the differ­ence be­tween mak­ing claims about nearby ob­jects and mak­ing claims about qualia (if there is one)? If I say there’s a book to my left, is that say­ing some­thing about qualia? If I say I dreamt about a rab­bit last night, is that say­ing some­thing about qualia?

(Are claims of the form “there is a book to my left” rad­i­cally in­cor­rect?)

That is, is there a way to dis­t­in­guish claims about qualia from claims about lo­cal stuff/​phe­nom­ena/​etc?

• Sure. There are a number of properties usually associated with qualia, which are the things I deny. If we strip these properties away (something Keith Frankish refers to as zero qualia) then we can still say that they exist. But it’s confusing to say that something exists when its properties are so minimal. Daniel Dennett listed a number of properties that philosophers have assigned to qualia and conscious experience more generally:

(1) in­ef­fable (2) in­trin­sic (3) pri­vate (4) di­rectly or im­me­di­ately ap­pre­hen­si­ble

Ineffable because there’s something Mary the neuroscientist is missing when she is in the black-and-white room, and someone who tried explaining color to her would not be able to do so fully.

In­trin­sic be­cause it can­not be re­duced to bare phys­i­cal en­tities, like elec­trons (think: could you con­struct a quale if you had the right set of par­ti­cles?).

Private because they are accessible only to us and not globally available. In this sense, if you tried to find out the qualia that a mouse was experiencing as it fell victim to a trap, you would come up fundamentally short, because it was specific to the mouse mind and not yours. Or as Nagel put it, there’s no way that third-person science could discover what it’s like to be a bat.

Directly ap­pre­hen­si­ble be­cause they are the el­e­men­tary things that make up our ex­pe­rience of the world. Look around and qualia are just what you find. They are the build­ing blocks of our per­cep­tion of the world.

It’s not nec­es­sar­ily that none of these prop­er­ties could be steel­manned. It is just that they are so far from be­ing steel­mannable that it is bet­ter to deny their ex­is­tence en­tirely. It is the same as my anal­ogy with a per­son who claims to have vis­ited heaven. We could ei­ther talk about it as illu­sory or non-illu­sory. But for prac­ti­cal pur­poses, if we chose the non-illu­sory route we would prob­a­bly be quite con­fused. That is, if we tried find­ing heaven in­side the phys­i­cal world, with the same prop­er­ties as the claimant had pro­posed, then we would come up short. Far bet­ter then, to treat it as a mis­take in­side of our cog­ni­tive hard­ware.

• Thanks for the elab­o­ra­tion. It seems to me that ex­pe­riences are:

1. Hard-to-eff, as a good-enough the­ory of what phys­i­cal struc­tures have which ex­pe­riences has not yet been dis­cov­ered, and would take philo­soph­i­cal work to dis­cover.

2. Hard to re­duce to physics, for the same rea­son.

3. In prac­tice pri­vate due to mind-read­ing tech­nol­ogy not hav­ing been de­vel­oped, and due to band­width and mem­ory limi­ta­tions in hu­man com­mu­ni­ca­tion. (It’s also hard to imag­ine what sort of tech­nol­ogy would al­low repli­cat­ing the ex­pe­rience of be­ing a mouse)

4. Pretty di­rectly ap­pre­hen­si­ble (what else would be? If noth­ing is, what do we build the­o­ries out of?)

It seems nat­u­ral to con­clude from this that:

1. Phys­i­cal things ex­ist.

2. Ex­pe­riences ex­ist.

3. Ex­pe­riences prob­a­bly su­per­vene on phys­i­cal things, but the su­per­ve­nience re­la­tion is not yet de­ter­mined, and de­ter­min­ing it re­quires philo­soph­i­cal work.

4. Given that we don’t know the su­per­ve­nience re­la­tion yet, we need to at least pro­vi­sion­ally have ex­pe­riences in our on­tol­ogy dis­tinct from phys­i­cal en­tities. (It is, af­ter all, im­pos­si­ble to do physics with­out mak­ing ob­ser­va­tions and re­port­ing them to oth­ers)

Is there some­thing I’m miss­ing here?

• Here’s a thought ex­per­i­ment which helped me lose my ‘be­lief’ in qualia: would a robot sci­en­tist, who was only de­signed to study physics and make pre­dic­tions about the world, ever in­vent qualia as a hy­poth­e­sis?

As­sum­ing the ac­tual mouth move­ments we make when we say things like, “Qualia ex­ist” are ex­plain­able via the sci­en­tific method, the robot sci­en­tist could still pre­dict that we would talk and write about con­scious­ness. But would it posit con­scious­ness as a sep­a­rate en­tity al­to­gether? Would it treat con­scious­ness as a deep mys­tery, even af­ter peer­ing into our brains and find­ing noth­ing but elec­tri­cal im­pulses?

• Robots take in ob­ser­va­tions. They make the­o­ries that ex­plain their ob­ser­va­tions. Differ­ent robots will make differ­ent ob­ser­va­tions and com­mu­ni­cate them to each other. Thus, they will talk about ob­ser­va­tions.

After mak­ing enough ob­ser­va­tions they make the­o­ries of physics. (They had to talk about ob­ser­va­tions be­fore they made low-level physics the­o­ries, though; af­ter all, they came to the­o­rize about physics through their ob­ser­va­tions). They also make bridge laws ex­plain­ing how their ob­ser­va­tions are re­lated to physics. But, they have un­cer­tainty about these bridge laws for a sig­nifi­cant time pe­riod.

The robots the­o­rize that hu­mans are similar to them, based on the fact that they have func­tion­ally similar cog­ni­tive ar­chi­tec­ture; thus, they the­o­rize that hu­mans have ob­ser­va­tions as well. (The bridge laws they posit are sym­met­ric that way, rather than be­ing sili­con-chau­vinist)

• I think you are us­ing the word “ob­ser­va­tion” to re­fer to con­scious­ness. If this is true, then I do not deny that hu­mans take in ob­ser­va­tions and pro­cess them.

How­ever, I think the is­sue is that you have sim­ply re-defined con­scious­ness into some­thing which would be un­rec­og­niz­able to the philoso­pher. To that ex­tent, I don’t say you are wrong, but I will allege that you have not done enough to re­spond to the con­scious­ness-re­al­ist’s in­tu­ition that con­scious­ness is differ­ent from phys­i­cal prop­er­ties. Let me ex­plain:

If qualia are just ob­ser­va­tions, then it seems ob­vi­ous that Mary is not miss­ing any in­for­ma­tion in her room, since she can perfectly well un­der­stand and model the pro­cess by which peo­ple re­ceive color ob­ser­va­tions.

Like­wise, if qualia are merely ob­ser­va­tions, then the Zom­bie ar­gu­ment amounts to say­ing that p-Zom­bies are be­ings which can’t ob­serve any­thing. This seems patently ab­surd to me, and doesn’t seem like it’s what Chalmers meant at all when he came up with the thought ex­per­i­ment.

Likewise, if we were to ask, “Is a bat conscious?” then the answer would be a vacuous “yes” under your view, since they have echolocators which take in observations and process information.

In this view even my com­puter is con­scious since it has a cam­era on it. For this rea­son, I sug­gest we are talk­ing about two differ­ent things.

• Mary’s room seems un­in­ter­est­ing, in that robot-Mary can pre­dict pretty well what bit-pat­tern she’s go­ing to get upon see­ing color. (To the ex­tent that the hu­man case is differ­ent, it’s be­cause of cog­ni­tive ar­chi­tec­ture con­straints)

Re­gard­ing the zom­bie ar­gu­ment: The robots have un­cer­tainty over the bridge laws. Un­der this un­cer­tainty, they may be­lieve it is pos­si­ble that hu­mans don’t have ex­pe­riences, due to the bridge laws only iden­ti­fy­ing sili­con brains as con­scious. Then hu­mans would be zom­bies. (They may have other the­o­ries say­ing this is pretty un­likely /​ log­i­cally in­co­her­ent /​ etc)

Ba­si­cally, the robots have a prim­i­tive en­tity “my ob­ser­va­tions” that they ex­plain us­ing their the­o­ries. They have to rec­on­cile this with the even­tual con­clu­sion they reach that their ob­ser­va­tions are those of a phys­i­cally in­stan­ti­ated mind like other minds, and they have de­grees of free­dom in which things they con­sider “ob­ser­va­tions” of the same type as “my ob­ser­va­tions” (things that could have been ob­served).

• As a qualia de­nier, I some­times feel like I side more with the Chalmers side of the ar­gu­ment, which at least ad­mits that there’s a strong in­tu­ition for con­scious­ness. It’s not that I think that the re­al­ist side is right, but it’s that I see the naive phys­i­cal­ists mak­ing state­ments that seem to com­pletely mis­in­ter­pret the re­al­ist’s ar­gu­ment.

I don’t mean to sin­gle you out in par­tic­u­lar. How­ever, you state that Mary’s room seems un­in­ter­est­ing be­cause Mary is able to pre­dict the “bit pat­tern” of color qualia. This seems to me to com­pletely miss the point. When you look at the sky and see blue, is it im­me­di­ately ap­pre­hen­si­ble as a sim­ple bit pat­tern? Or does it at least seem to have qual­i­ta­tive prop­er­ties too?

I’m not sure how to im­port my ar­gu­ment onto your brain with­out you at least see­ing this in­tu­ition, which is some­thing I con­sid­ered ob­vi­ous for many years.

• There is a qual­i­ta­tive red­ness to red. I get that in­tu­ition.

I think “Mary’s room is un­in­ter­est­ing” is wrong; it’s un­in­ter­est­ing in the case of robot sci­en­tists, but in­ter­est­ing in the case of hu­mans, in part be­cause of what it re­veals about hu­man cog­ni­tive ar­chi­tec­ture.

I think in the hu­man case, I would see Mary see­ing a red ap­ple as gain­ing in ex­pres­sive vo­cab­u­lary rather than in­for­ma­tion. She can then de­scribe fu­ture things as “like what I saw when I saw that first red ap­ple”. But, in the case of first see­ing the ap­ple, the red­ness quale is es­sen­tially an ar­bi­trary gen­sym.

I sup­pose I might end up agree­ing with the illu­sion­ist view on some as­pects of color per­cep­tion, then, in that I pre­dict color quales might feel like new in­for­ma­tion when they ac­tu­ally aren’t. Thanks for ex­plain­ing.

• I pre­dict color quales might feel like new in­for­ma­tion when they ac­tu­ally aren’t.

I am cu­ri­ous if you dis­agree with the claim that (hu­man) Mary is gain­ing im­plicit in­for­ma­tion, in that (de­spite already know­ing many facts about red-ness), her (hu­man) op­tic sys­tem wouldn’t have suc­cess­fully been able to pre­dict the in­com­ing vi­sual data from the ap­ple be­fore see­ing it, but af­ter­wards can?

• That does seem right, ac­tu­ally.

Now that I think about it, due to this cog­ni­tive ar­chi­tec­ture is­sue, she ac­tu­ally does gain new in­for­ma­tion. If she sees a red ap­ple in the fu­ture, she can know that it’s red (be­cause it pro­duces the same qualia as the first red ap­ple), whereas she might be con­fused about the color if she hadn’t seen the first ap­ple.

I think I got con­fused be­cause, while she does learn some­thing upon see­ing the first red ap­ple, it isn’t the naive “red wave­lengths are red-quale”, it’s more like “the neu­rons that de­tect red wave­lengths got wired and as­so­ci­ated with the ab­stract con­cept of red wave­lengths.” Which is still, effec­tively, new in­for­ma­tion to Mary-the-cog­ni­tive-sys­tem, given limi­ta­tions in hu­man men­tal ar­chi­tec­ture.

• A physicist might discover that you can make computers out of matter. You can make such computers produce sounds. In processing sounds, “homonym” is a perfectly legitimate and useful concept. Even if two words are stored in far-away hardware locations, knowing that they will “sound-detection clash” is important information. Even if you slice it a little differently and use different kinds of computer architectures, it would still be a real phenomenon.

In technical terms, there might be the issue of whether it’s meaningful to differentiate between founded concepts and hypotheses. If hypotheses are required, then you could have a physicist who never talked about temperature.

• It seems to me that you are try­ing to re­cover the prop­er­ties of con­scious ex­pe­rience in a way that can be re­duced to physics. Ul­ti­mately, I just feel that this ap­proach is not likely to suc­ceed with­out rad­i­cal re­vi­sions to what you con­sider to be con­scious ex­pe­rience. :)

Gen­er­ally speak­ing, I agree with the du­al­ists who ar­gue that physics is in­com­pat­i­ble with the claimed prop­er­ties of qualia. Un­like the du­al­ists, I see this as a strike against qualia rather than a strike against physics. David Chalmers does a great job in his ar­ti­cles out­lin­ing why con­scious prop­er­ties don’t fit nicely in our nor­mal phys­i­cal mod­els.

It’s not sim­ply that we are await­ing more data to fill in the de­tails: it’s that there seems to be no way even in prin­ci­ple to in­cor­po­rate con­scious ex­pe­rience into physics. Physics is just a differ­ent type of beast: it has no men­tal core, it is en­tirely made up of math­e­mat­i­cal re­la­tions, and is com­pletely global. Con­scious­ness as it’s de­scribed seems en­tirely in­ex­pli­ca­ble in that re­spect, and I don’t see how it could pos­si­bly su­per­vene on the phys­i­cal.

One could imag­ine a hy­po­thet­i­cal heaven-be­liever (some­one who claimed to have gone to heaven and back) list­ing pos­si­ble ways to in­cor­po­rate their ex­pe­rience into physics. They could say,

Hard-to-eff, as it’s not clear how physics in­ter­acts with the heav­enly realm. We must do more work to find out where the en­try points of heaven and earth are.
In prac­tice pri­vate due to the fact that tech­nol­ogy hasn’t been de­vel­oped yet that can al­low me to send mes­sages back from heaven while I’m there.
Pretty di­rectly ap­pre­hen­si­ble be­cause how would it even be pos­si­ble for me to have ex­pe­rienced that with­out heaven liter­ally be­ing real!

On the other hand, a skep­tic could re­ply that:

Even if mind read­ing tech­nol­ogy isn’t good enough yet, our best mod­els say that hu­mans can be de­scribed as com­pli­cated com­put­ers with a par­tic­u­lar neu­ral net­work ar­chi­tec­ture. And we know that com­put­ers can have bugs in them caus­ing them to say things when there is no log­i­cal jus­tifi­ca­tion.

Also, we know that com­put­ers can lack perfect in­tro­spec­tion so we know that even if it is ut­terly con­vinced that heaven is real, this could just be due to the fact that the com­puter is fol­low­ing its pro­gram­ming and is ex­cep­tion­ally stub­born.

Heaven has no clear in­ter­pre­ta­tion in our phys­i­cal mod­els. Yes, we could see that a su­per­ve­nience is pos­si­ble. But why rely on that hope? Isn’t it bet­ter to say that the be­lief is caused by some sort of in­ter­nal illu­sion? The lat­ter hy­poth­e­sis is at least ex­pli­ca­ble within our mod­els and doesn’t re­quire us to make new fun­da­men­tal philo­soph­i­cal ad­vances.

• It seems that doubt­ing that we have ob­ser­va­tions would cause us to doubt physics, wouldn’t it? Since physics-the-dis­ci­pline is about mak­ing, record­ing, com­mu­ni­cat­ing, and ex­plain­ing ob­ser­va­tions.

Why think we’re in a phys­i­cal world if our ob­ser­va­tions that seem to sug­gest we are are illu­sory?

This is kind of like if the peo­ple say­ing we live in a ma­te­rial world ar­rived at these the­o­ries through their heaven-rev­e­la­tions, and can only ex­plain the epistemic jus­tifi­ca­tion for be­lief in a ma­te­rial world by posit­ing heaven. Seems odd to think heaven doesn’t ex­ist in this cir­cum­stance.

(Note, per­son­ally I lean to­wards su­per­ve­nient neu­tral monism: di­rect ob­ser­va­tion and phys­i­cal the­o­riz­ing are differ­ent modal­ities for in­ter­act­ing with the same sub­stance, and men­tal prop­er­ties su­per­vene on phys­i­cal ones in a cur­rently-un­known way. Physics doesn’t rule out ob­ser­va­tion, in fact it de­pends on it, while it­self be­ing a limited modal­ity, such that it is un­sur­pris­ing if you couldn’t get all modal­ities through the phys­i­cal-the­o­riz­ing modal­ity. This view seems non-con­tra­dic­tory, though in­com­plete.)

• You seem to have a characteristic in your beliefs similar to one I encountered on LessWrong before.

https://www.lesswrong.com/posts/TniCuWCDxQeqFSxut/arguments-for-the-existence-of-qualia-1?commentId=Zwyh8Xt5uaZ4ZBYbP

There is the phe­nomenon of qualia and then there is the on­tolog­i­cal ex­ten­sion. The word does not re­fer to the on­tolog­i­cal ex­ten­sion.

It would be like ex­plain­ing light­ning with light­ning. Sure when we dig down there are non-light­ning parts. But light­ning still zaps peo­ple.

Or it would be a category error, like saying that if you can explain physics without coordinates by only positing that energy exists, you should drop coordinates from your concepts. But coordinates are not a thing to believe in; they are a conceptual tool to specify claims, not a hypothesis in themselves. When physicists believe in a particular field theory, they are not agreeing with the Greek philosophers who think that the world is made of a type of number.

• There is the phe­nomenon of qualia and then there is the on­tolog­i­cal ex­ten­sion. The word does not re­fer to the on­tolog­i­cal ex­ten­sion.

My ba­sic claim is that the way that peo­ple use the word qualia im­plic­itly im­plies the on­tolog­i­cal ex­ten­sions. By us­ing the term, you are ei­ther smug­gling these ex­ten­sions in, or you are us­ing the term in a way that no philoso­pher uses it. Here are some in­tu­itions:

Qualia are pri­vate en­tities which oc­cur to us and can’t be in­spected via third per­son sci­ence.

Qualia are in­ef­fable; you can’t ex­plain them us­ing a suffi­ciently com­plex English or math­e­mat­i­cal sen­tence.

Qualia are intrinsic; you couldn’t construct a quale even if you had the right set of particles.

etc.

Now, that’s not to say that you can’t define qualia in such a way that these on­tolog­i­cal ex­ten­sions are avoided. But why do so? If you are sim­ply re-defin­ing the phe­nomenon, then you have not ex­plained any­thing. The in­tu­itions above still re­main, and there is some­thing still un­ex­plained: namely, why peo­ple think that there are en­tities with the above prop­er­ties.

That’s why I think that in­stead, the illu­sion­ist ap­proach is the cor­rect one. Let me quote Keith Frank­ish, who I think does a good job ex­plain­ing this point of view,

Sup­pose we en­counter some­thing that seems anoma­lous, in the sense of be­ing rad­i­cally in­ex­pli­ca­ble within our es­tab­lished sci­en­tific wor­ld­view. Psy­choki­ne­sis is an ex­am­ple. We would have, broadly speak­ing, three op­tions.
First, we could ac­cept that the phe­nomenon is real and ex­plore the im­pli­ca­tions of its ex­is­tence, propos­ing ma­jor re­vi­sions or ex­ten­sions to our sci­ence, per­haps amount­ing to a paradigm shift. In the case of psy­choki­ne­sis, we might posit pre­vi­ously un­known psy­chic forces and em­bark on a ma­jor re­vi­sion of physics to ac­com­mo­date them.
Se­cond, we could ar­gue that, al­though the phe­nomenon is real, it is not in fact anoma­lous and can be ex­plained within cur­rent sci­ence. Thus, we would ac­cept that peo­ple re­ally can move things with their un­aided minds but ar­gue that this abil­ity de­pends on known forces, such as elec­tro­mag­netism.
Third, we could ar­gue that the phe­nomenon is illu­sory and set about in­ves­ti­gat­ing how the illu­sion is pro­duced. Thus, we might ar­gue that peo­ple who seem to have psy­choki­netic pow­ers are em­ploy­ing some trick to make it seem as if they are men­tally in­fluenc­ing ob­jects.

In the case of light­ning, I think that the first ap­proach would be cor­rect, since light­ning forms a valid phys­i­cal cat­e­gory un­der which we can cast our sci­en­tific pre­dic­tions of the world. In the case of the or­bit of Uranus, the sec­ond ap­proach is cor­rect, since it was ad­e­quately ex­plained by ap­peal­ing to un­der­stood New­to­nian physics. How­ever, the third ap­proach is most apt for bizarre phe­nom­ena that seem at first glance to be en­tirely in­com­pat­i­ble with our physics. And qualia cer­tainly fit the bill in that re­spect.

• When I say “qualia” I mean in­di­vi­d­ual in­stances of sub­jec­tive, con­scious ex­pe­rience full stop. Th­ese three ex­ten­sions are not what I mean when I say “qualia”.

Qualia are pri­vate en­tities which oc­cur to us and can’t be in­spected via third per­son sci­ence.

Not con­vinced of this. There are known neu­ral cor­re­lates of con­scious­ness. That our cur­rent brain scan­ners lack the re­quired re­s­olu­tion to make them in­spectable does not prove that they are not in­spectable in prin­ci­ple.

Qualia are in­ef­fable; you can’t ex­plain them us­ing a suffi­ciently com­plex English or math­e­mat­i­cal sen­tence.

This seems to be a limitation of human language bandwidth/imagination, but not fundamental to what qualia are. Consider the case of the conjoined twins Krista and Tatiana, who share some brain structure and seem to be able to “hear” each other’s thoughts and see through each other’s eyes.

Sup­pose we set up a thought ex­per­i­ment. Sup­pose that they grow up in a room with­out color, like Mary’s room. Now knock out Krista and show Ta­ti­ana some­thing red. Re­move the red thing be­fore Krista wakes up. Wouldn’t Ta­ti­ana be able to com­mu­ni­cate the ex­pe­rience of red to her sister? That’s an ef­fable quale!

And if they can do it, then in prin­ci­ple, so could you, with a fu­ture brain-com­puter in­ter­face.

Really, com­mu­ni­cat­ing at all is a trans­fer of ex­pe­rience. We’re limited by com­mon ground, sure. We both have to be speak­ing the same lan­guage, and have to have enough ex­pe­rience to be able to imag­ine the other’s men­tal state.

Qualia are intrinsic; you couldn’t construct a quale even if you had the right set of particles.

Again, not con­vinced. Isn’t your brain made of par­ti­cles? I con­struct qualia all the time just by think­ing about it. (It’s called “imag­i­na­tion”.) I don’t see any rea­son in prin­ci­ple why this could not be done ex­ter­nally to the brain ei­ther.

• The Tatiana and Krista experiment is quite interesting, but stretches the concept of communication to its limits. I am inclined to say that having a shared part of your consciousness is not communication, in the same way that sharing a house is not traffic. It does strike me that communication involves directed construction of thoughts, and it’s easy to imagine that the scope of what this construction is capable of would be vastly smaller than what goes on in the brain in other processes. Extending the construction to new types of thoughts might be a soft border rather than a hard one. With enough verbal sentences it should in principle be possible to reconstruct an actual graphical image, but even with overtly descriptive prose this level is not really reached (I presume); it remains within the realm of sentence-like data structures.

In the example, Tatiana directs the visual cortex and Krista can just recall the representation later. But in a single-consciousness brain, nothing can be made “ready”; it must be assembled by the brain itself from sensory inputs. That is, cognitive space probably has small funnels, and significant objects can’t travel through them as themselves but must be chopped into pieces and reassembled after passing through the tube.

• Let’s ex­tend the thought ex­per­i­ment a bit. Sup­pose tech­nol­ogy is de­vel­oped to sep­a­rate the twins. They rely on their shared brain parts for vi­tal func­tions, so where we cut nerve con­nec­tions we re­place them with a ra­dio transceiver and elec­trode ar­ray in each twin.

Now they are com­mu­ni­cat­ing thoughts via a pros­the­sis. Is that not com­mu­ni­ca­tion?

Maybe you already know what it is like to be a hive mind with a shared con­scious­ness, be­cause you are one: cut­ting the cor­pus cal­lo­sum cre­ates a split-brained pa­tient that seems to have two differ­ent per­son­al­ities that don’t always agree with each other. Maybe there are some con­nec­tions left, but the band­width has been dras­ti­cally re­duced. And even within hemi­spheres, the brain seems to be com­posed of yet smaller mod­ules. Your mind is made of parts that com­mu­ni­cate with each other and share ex­pe­rience, and some of it is con­scious.

I think the line di­vid­ing in­di­vi­d­ual per­sons is a soft one. A suffi­ciently high-band­width com­mu­ni­ca­tion in­ter­face can blur that bound­ary, even to the point of fus­ing con­scious­ness like brain hemi­spheres. Shared con­scious­ness means shared qualia, even if that con­nec­tion is later sev­ered, you might still re­mem­ber what it was like to be the other per­son. And in that way, qualia could hy­po­thet­i­cally be com­mu­ni­cated be­tween in­di­vi­d­u­als, or even species.

• If you copied my brain but made it twice as large, that copy would be as “lonely” as I would be, and this would remain after arbitrary doublings. Single individuals can be extended in space without communicating with other individuals.

The “extended wire” thought experiment doesn’t specify enough how that physical communication line is used. It’s plausible that there is no “verbalization” process, the way there is a step of writing an email if one replaces sonic communication with IP-packet communication. With huge relative distance would come speed-of-light delays; if one twin were on Earth and another on the Moon, there would be a round-trip latency of seconds, which would probably distort how the combined brain works. (And I guess doubling in size would need to come with proportionate slowing to preserve the same function.)

I think there is a difference between an information system being spatially extended and having two information systems interfacing with each other. Say that you have 2 routers or 10 routers on the same length of line. It makes sense to make a distinction that each router functions “independently”, even though they have to interoperate enough that packets flow through. To the first router, the world “downline” seems very similar whether or not intermediate routers exist. I don’t count an information system’s internal processing as communicating, and thus I don’t count “thinking” as communicating. Thus the 10-router version does more communicating than the 2-router version.

I think the “verbalization” step does mean that even a high-bandwidth connection doesn’t automatically mean qualia sharing. I am thinking of plugins that allow programming languages to share code. Even if there is perfect 1-to-1 compatibility between the abstractions of the languages, I think each language still only ever manipulates its own version of that representation. Cross-use without translation would make it ill-defined what correct functioning would be, but if you do translate, then it loses the qualities of the originating programming language. A C# integer variable will never contain a Haskell integer, even if a C# integer is constructed to represent the Haskell integer. (I guess it would be possible to make a super-language that has integer variables that can contain Haskell integers and C# integers, but that language would not be C# or Haskell.) By being a specific kind of cognitive architecture, you are locked into certain representation types, which are inescapable short of turning into another kind of architecture.

• I am as­sum­ing that the twins com­mu­ni­cat­ing thoughts re­quires an act of will like speak­ing does. I do have rea­sons for this. Watch­ing their faces when they com­mu­ni­cate thoughts makes it seem vol­un­tary.

But most of what you are do­ing when speak­ing is already sub­con­scious: One can “un­der­stand” the rules of gram­mar well enough to form cor­rect sen­tences on nearly all at­tempts, and yet be un­able to ex­plain the rules to a com­puter pro­gram (or to a child or ESL stu­dent). There is an el­e­ment of will, but it’s only an el­e­ment.

It may be the case that even with a high-band­width di­rect-brain in­ter­face it would take a lot of time and prac­tice to un­der­stand an­other’s thoughts. Hu­mans have a com­mon cog­ni­tive ar­chi­tec­ture by virtue of shared genes, but most of our in­di­vi­d­ual con­nec­tomes are ran­dom­ized and shaped by in­di­vi­d­ual ex­pe­rience. Our in­ter­nal rep­re­sen­ta­tions may thus be highly idiosyn­cratic, mean­ing a di­rect in­ter­face would be ad-hoc and only work on one per­son. How true this is, I can only spec­u­late with­out more data.

In your pro­gram­ming lan­guage anal­ogy, these data types are only ab­strac­tions built on top of a more fun­da­men­tal CPU ar­chi­tec­ture where the only data types are bytes. Maybe an im­ple­men­ta­tion of C# could be made that uses ex­actly the same bit pat­tern for an int as Haskell does. Hu­man neu­rons work pretty much the same way across in­di­vi­d­u­als, and even cor­ti­cal columns seem to use the same ar­chi­tec­ture.
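To make the bytes-underneath-the-abstraction point concrete, here is a minimal Python sketch (the two “decoder” functions are hypothetical stand-ins for two language implementations, not real C# or Haskell internals): both recover the same value from the same bits because they share a byte-level convention.

```python
import struct

# Encode the integer 42 as 8 bytes, little-endian two's complement --
# the kind of low-level representation a CPU actually stores.
raw = struct.pack("<q", 42)

# "Language A" and "language B" are just two independent decoders that
# share the same byte-level convention, like two implementations
# compiled for the same architecture.
def decode_lang_a(b: bytes) -> int:
    return struct.unpack("<q", b)[0]

def decode_lang_b(b: bytes) -> int:
    return int.from_bytes(b, "little", signed=True)

# Both abstractions agree, because the representation lives in the
# shared convention, not in either language.
assert decode_lang_a(raw) == decode_lang_b(raw) == 42
```

The analogy only goes as far as the conventions are actually shared, which is the open question for two human connectomes.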

I don’t think the in­abil­ity to com­mu­ni­cate qualia is pri­mar­ily due to the limi­ta­tion of lan­guage, but due to the limi­ta­tion of imag­i­na­tion. I can ex­plain what a tesser­act is, but that doesn’t mean you can vi­su­al­ize it. I could give you analo­gies with lower di­men­sions. Maybe you could un­der­stand well enough to make a men­tal model that gives you good pre­dic­tions, but you still can’t vi­su­al­ize it. Similarly, I could ex­plain what it’s like to be a tetra­chro­mat, how sep­tarine and oc­tarine are col­ors dis­tinct from the oth­ers, and maybe you can de­velop a model good enough to make good pre­dic­tions about how it would work, but again you can’t vi­su­al­ize these col­ors. This failing is not on English.

• Sure, the difference between hearing about a tesseract and being able to visualize it is significant, but I think the difference might not be an impossibility barrier, just a skill level of imagination.

Having learned some echolocation, my qualia involved in hearing have changed, and that makes a similar transition from a trichromat visual space into a tetrachromat visual space seem possible. The weird thing about it is that my ear receives as much information as it did before; I just pay attention to it differently. Deficient understanding, in the sense of getting things wrong, is an easy line to draw. But it seems at some point the understanding becomes vivid instead of theoretical.

• Qualia are intrinsic; you couldn’t construct a quale even if you had the right set of particles.

I’m pretty sure that’s not what “intrinsic” is supposed to mean. From “The Qualities of Qualia” by David de Leon:

Within philos­o­phy there is a dis­tinc­tion, albeit a con­tentious one, be­tween in­trin­sic and ex­trin­sic prop­er­ties. Roughly speak­ing “ex­trin­sic” seems to be syn­ony­mous with “re­la­tional.” The prop­erty of be­ing an un­cle, for ex­am­ple, is a prop­erty which de­pends on (and con­sists of) a re­la­tion to some­thing else, namely a niece or a nephew. In­trin­sic prop­er­ties, then, are those which do not de­pend on this kind of re­la­tion. That qualia are in­trin­sic means that their qual­i­ta­tive char­ac­ter can be iso­lated from ev­ery­thing else go­ing on in the brain (or el­se­where) and is not de­pen­dent on re­la­tions to other men­tal states, be­havi­our or what have you. The idea of the in­de­pen­dence of qualia on any such re­la­tion may well stem from the con­ceiv­abil­ity of in­verted qualia: we can imag­ine two phys­i­cally iden­ti­cal brains hav­ing differ­ent qualia, or even that qualia are ab­sent from one but not the other.

• I find it important in philosophy to be clear about what you mean. It is one thing to explain and another to define what you mean. You might point to a yellow object and say “yellow”, and somebody who misunderstood might think that you mean “roundness” by yellow. Accuracy is most important when the views are radical and talk in very different worlds. And “disproving” yellow by not being able to pick it out via ostensive differentiation is not an argumentative victory but a communicative failure.

Even if we use some other term, I think that meaning is important to have. “Phlogiston” might sneak in claims, but that is all the more reason to have terms with as little room for smuggling as possible. And we still need good terms to talk about burning. “Oxygen” literally means “acid maker”, but we nowadays understand it as a term referring to an element, which definitionally has very little to do with acids.

I think the starting point that generated the word refers to a genuine problem. Having qualia in category three would mean claiming that I do not have experiences. And if “qualia” is a bad, loaded word for the thing to be explained, it would be good to make up a new term that refers to it. But to me, “qualia” was just that word. A term like “dark matter” might experience similar “hijack pressure” by having wild claims thrown around about it. And there, having things like “warm dark matter” and “WIMP dark matter” makes the classification finer, letting the conceptual analysis proceed. But the requirements of clear thinking are different from tradition preservation. If you say that “warm dark matter” can’t be the answer, the question of dark matter still stands. Even if you successfully argue that “qualia” can’t be an attractive concept, the issue of me not being a p-zombie still remains, and it would be expected that some theoretical bending over backwards would happen.

• If we are able to ex­plain why you be­lieve in, and talk about qualia with­out refer­ring to qualia what­so­ever in our ex­pla­na­tion, then we should re­ject the ex­is­tence of qualia as a hypothesis

That argument has an inverse: “If we are able to explain why you believe in, and talk about, an external world without referring to an external world whatsoever in our explanation, then we should reject the existence of an external world as a hypothesis”.

People want reductive explanation to be unidirectional, so that you have an A and a B, and clearly it is the B which is redundant and can be replaced with A. But not all explanations work in that convenient way... sometimes A and B are mutually redundant, in the sense that you don’t need both.

The moral of the story be­ing to look for the over­all best ex­pla­na­tion, not just elimi­nate re­dun­dancy.

• It’s a strong ar­gu­ment, but there are strong ar­gu­ments on the other side as well.

• “Im­mor­tal­ity is cool and all, but our uni­verse is go­ing to run down from en­tropy even­tu­ally”

I con­sider this ar­gu­ment wrong for two rea­sons. The first is the ob­vi­ous rea­son, which is that even if im­mor­tal­ity is im­pos­si­ble, it’s still bet­ter to live for a long time.

The sec­ond rea­son why I think this ar­gu­ment is wrong is be­cause I’m cur­rently con­vinced that literal phys­i­cal im­mor­tal­ity is pos­si­ble in our uni­verse. Usu­ally when I say this out loud I get an au­dible “what” or some­thing to that effect, but I’m not kid­ding.

It’s go­ing to be hard to ex­plain my in­tu­itions for why I think real im­mor­tal­ity is pos­si­ble, so bear with me. First, this is what I’m not say­ing:

• I’m not say­ing that we can out­last the heat death of the uni­verse somehow

• I’m not say­ing that we just need to shift our con­cep­tion of im­mor­tal­ity to be some­thing like, “We live in the hearts of our coun­try­men” or any­thing like that.

• I’m not say­ing that I have a spe­cific plan for how to be­come im­mor­tal per­son­ally, and

• I’m not say­ing that my pro­posal has no flaws what­so­ever and that this is a valid line of re­search to be con­duct­ing at the mo­ment.

So what am I say­ing?

A typ­i­cal model of our life as hu­mans is that we are some­thing like a worm in 4 di­men­sional space. On one side of the worm there’s our birth, and on the other side of the worm is our un­timely death. We ‘live through’ this worm, and that is our life. The length of our life is mea­sured by con­sid­er­ing the length of the worm in 4 di­men­sional space, mea­sured just like a yard­stick.

Now just change the per­spec­tive a lit­tle bit. If we could some­how aban­don our cur­rent way of liv­ing, then maybe we can al­ter the ge­om­e­try of this worm so that we are im­mor­tal. Con­sider: a cir­cle has no start­ing point and no end. If some­one could some­how ‘live through’ a cir­cle, then their life would con­sist of an eter­nal loop through ex­pe­riences, re­peat­ing end­lessly.

The idea is that we some­how con­struct a phys­i­cal man­i­fes­ta­tion of this im­mor­tal­ity cir­cle. I think of it like an ac­tual loop in 4 di­men­sional space be­cause it’s difficult to vi­su­al­ize with­out an anal­ogy. A su­per­in­tel­li­gence could per­haps pre­dict what type of ac­tions would be nec­es­sary to con­struct this im­mor­tal loop. And once it is con­structed, it’ll be there for­ever.

From an out­side view in our 3d mind’s eye, the con­struc­tion of this loop would look very strange. It could look like some­thing pop­ping into ex­is­tence sud­denly and get­ting larger, and then sud­denly pop­ping out of ex­is­tence. I don’t re­ally know; that’s just the in­tu­ition.

What mat­ters is that within this loop some­one will be liv­ing their life on re­peat. True Déjà vu. Each mo­ment they live is in their fu­ture, and in their past. There are no new ex­pe­riences and no nov­elty, but the su­per­in­tel­li­gence can con­struct it so that this part is not un­en­joy­able. There would be no right an­swer to the ques­tion “how old are you.” And in my view, it is perfectly valid to say that this per­son is truly, ac­tu­ally im­mor­tal.

Per­haps some­one who val­ued im­mor­tal­ity would want one of these loops to be con­structed for them­selves. Per­haps for some rea­son con­struct­ing one of these things is im­pos­si­ble in our uni­verse (though I sus­pect that it’s not). There are an­thropic rea­sons that I have con­sid­ered for why con­struct­ing it might not be worth it… but that would be too much to go into for this short­form post.

To close, I cur­rently see no knock­down rea­sons to be­lieve that this sort of scheme is im­pos­si­ble.

• I don’t know of physics rules rul­ing this out. How­ever, I sus­pect this doesn’t re­solve the prob­lems that the peo­ple I know who care most about im­mor­tal­ity are wor­ried about. (I’m not sure – I haven’t heard them ex­press clear prefer­ences about what ex­actly they pre­fer on the billions/​trillions year timescale. But they seem more con­cerned run­ning out of abil­ity to have new ex­pe­riences than not-want­ing-to-die-in-par­tic­u­lar.)

My im­pres­sion is many of the peo­ple who care about this sort of thing also tend to think that if you have mul­ti­ple in­stances of the ex­act same thing, it just counts as a sin­gle in­stance. (Or, some­thing more com­pli­cated about many wor­lds and in­creas­ing your mea­sure)

• I agree with the ob­jec­tion. :) Per­son­ally I’m not sure whether I’d want to be stuck in a loop of ex­pe­riences re­peat­ing over and over for­ever.

How­ever, even if we con­sid­ered “true” im­mor­tal­ity, re­peat ex­pe­riences are in­evitable sim­ply be­cause there’s a finite num­ber of pos­si­ble ex­pe­riences. So, we’d have to start re­peat­ing things even­tu­ally.

• In one scene in Egan’s Per­mu­ta­tion City, the Peer char­ac­ter ex­pe­rienced “in­finity” when he set him­self up in an in­finite loop such that his later ex­pe­rience matched up perfectly with the start of the loop (walk­ing down the side of an in­finitely tall build­ing, if I re­call). But he also ex­pe­rienced the loop end­ing.

• Virtual particles “pop into existence” in matter/antimatter pairs and then “pop out” as they annihilate each other all the time. In one interpretation, an electron–positron pair (for example) can be thought of as one electron that loops around and goes back in time. Due to CPT symmetry, this backward path looks like a positron. https://www.youtube.com/watch?v=9dqtW9MslFk

• It sounds like you’re talk­ing about time travel. Th­ese “worms” are called “wor­ldlines”. Space­time is not sim­ply R^4. You can ro­tate in the fourth di­men­sion—this is just ac­cel­er­a­tion. But you can’t ac­cel­er­ate enough to turn around and bite your own tail be­cause ro­ta­tions in the fourth di­men­sion are hy­per­bolic rather than cir­cu­lar. You can’t ex­ceed or even reach light speed. There are solu­tions to Gen­eral Rel­a­tivity that con­tain closed timelike curves, but it’s not clear if they cor­re­spond to any­thing phys­i­cally re­al­iz­able.
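To spell out why hyperbolic rotations behave so differently from circular ones, here is the standard boost written as a hyperbolic rotation by rapidity $\varphi$ (a textbook identity, stated here for reference):

```latex
\begin{pmatrix} ct' \\ x' \end{pmatrix}
=
\begin{pmatrix}
\cosh\varphi & -\sinh\varphi \\
-\sinh\varphi & \cosh\varphi
\end{pmatrix}
\begin{pmatrix} ct \\ x \end{pmatrix},
\qquad
\frac{v}{c} = \tanh\varphi,
\qquad
\varphi_{\text{total}} = \varphi_{1} + \varphi_{2}.
```

Rapidities add under composition, but $|\tanh\varphi| < 1$ for every finite $\varphi$, so no finite sequence of boosts ever reaches $v = c$, let alone “turns around”: a circular angle wraps at $2\pi$, while $\varphi$ ranges over all of $\mathbb{R}$ with the time axis never leaving the future light cone.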

• I have previous high-implication uncertainty about this (that would be a crux?). “You can’t accelerate enough to turn around” seems false to me. The mathematical rotation seems like it ought to exist, and the previous reasons I thought such a mathematical rotation would be impossible, I now have significantly less faith in. If I draw a unit-sphere analog in spacetime, a visual observation from the spacetime diagram drawn on Euclidean paper is not sufficient to conclude that the future cone is far from the past cone. And thinking of a sphere as “all points within distance r”, it would seem it should be continuous and simply connected in most instances. I think there also should exist a transformation that, when repeated enough times, returns to the original configuration. And I find it surprising that a boost-like transformation would fail to be like that if it is a rotation analog.

I have started to believe that the standard reasoning for why you can’t go faster than light relies on a kind of faulty logic. In normal Euclidean geometry it would go like this: there is a maximum angle you can reach by increasing the y-coordinate, and slope is just the ratio of y to x, so at that maximum y the maximum slope is reached, so the maximum angle you can have is 90 degrees. So if you try to go at 100 degrees you have a lesser y and are actually going slower. In a way, 90 degrees is the maximum amount you can point in another direction. But normally degrees go up to 180 or 360.
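The disanalogy between the two geometries can be checked directly by iterating the corresponding matrices. A circular rotation is periodic; the hyperbolic analog is not, because repeating a boost only accumulates rapidity. A small Python illustration (my own toy functions, working in 2D with c = 1):

```python
import math

# Euclidean rotation: 8 steps of 45 degrees return to the start.
def rotate(point, theta):
    x, y = point
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

p = (1.0, 0.0)
for _ in range(8):
    p = rotate(p, math.pi / 4)
# p is back at (1, 0) up to rounding error.

# Lorentz boost: the hyperbolic analog. Repeating it slides the event
# ever further along the hyperbola t^2 - x^2 = const, never cycling back.
def boost(event, rapidity):
    t, x = event
    return (t * math.cosh(rapidity) + x * math.sinh(rapidity),
            t * math.sinh(rapidity) + x * math.cosh(rapidity))

e = (1.0, 0.0)
for _ in range(8):
    e = boost(e, 0.5)
# e is now (cosh(4), sinh(4)): far from the start, and no number
# of further repeats ever returns it to (1, 0).
```

So hyperbolic “angles” (rapidities) genuinely have no analog of wrapping past 90 or 360 degrees.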

On the relativity side, c is the maximum ratio, but that is for coordinate time. If somebody’s proper time started pointing in a direction that projected negatively onto the coordinate-time axis, the comparison between x per coordinate time and x per proper time would become significant.

There is also a trajectory which seems to be timelike in all segments: A=(0,0,0,0), (2,1,0,0), B=(4,2,0,0), (2,3,0,0), C=(0,4,0,0), (2,5,0,0), D=(4,6,0,0). It would seem an awful lot like the “corner” A B C would be of equal magnitude but opposite sign to B C D. Now, I get why such a trajectory would be physically challenging, but from a mathematical point of view it is hard to understand why it would be ill-defined. It would also be very strange if there were no boost you could make at B to go from direction AB to direction BC. I do get why you can’t rotate from AB to BD (you can’t rotate a timelike distance into a spacelike distance if rotation preserves length).
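For what it’s worth, the segment intervals in that example are easy to check numerically. Reading the four-tuples as (t, x, y, z) with c = 1 and signature (+, −, −, −) (my reading; the comment doesn’t say which slot is time), every segment is indeed timelike, but the middle two run backward in coordinate time, which is exactly what a boost cannot produce:

```python
# The zig-zag trajectory from the comment, reduced to (t, x) pairs (c = 1).
points = [(0, 0), (2, 1), (4, 2), (2, 3), (0, 4), (2, 5), (4, 6)]

for (t0, x0), (t1, x1) in zip(points, points[1:]):
    dt, dx = t1 - t0, x1 - x0
    interval_sq = dt**2 - dx**2  # signature (+, -)
    # Every segment gives interval_sq = 3 > 0 (timelike), but two of the
    # segments have dt < 0: they point into the past light cone, and no
    # boost maps a future-directed timelike vector onto a past-directed one.
    print(dt, dx, interval_sq)
```

So “timelike in every segment” is true, and the mathematical obstruction is only that boosts preserve the time orientation, not the interval lengths.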

I also kind of get why you would need infinite energy to make such “impossibly sharp” turns. But since energy is the conserved charge of time translation, the definition of energy might depend on which time you choose to derive it from. If you were to gain energy from an external source, that source would have to be a tachyon or something going backwards in time (which are either impossible or hard to produce). But if you carried a thruster with fuel, the “proper-time energy” might behave differently. That is, if you are going at a significant fraction of c and the whole universe is frozen and whizzing by, you should still be able to fire your rockets according to your own time (1 second of your engines might take the entire age of the universe for external observers, but does that prevent things from happening from your perspective?). If acceleration “turns your time direction” rather than “increases displacement per spent second”, then at some finite amount of experienced acceleration you would come full circle, or at least far enough around that you are now going in the negative of the direction you started in.
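On the rocket point specifically: for constant proper acceleration a, special relativity gives the coordinate velocity after proper time τ as v = tanh(a·τ), with c = 1. You can fire the engines for unbounded proper time, and the rapidity a·τ grows without bound, but v only saturates toward 1; it never wraps around into a past-directed time direction. A quick numerical check (function name mine):

```python
import math

def velocity_after_proper_time(a, tau):
    # Coordinate velocity of a rocket under constant proper acceleration a
    # after proper time tau, in units where c = 1.
    return math.tanh(a * tau)

# The burn can last as long as you like in the traveller's own time:
# v increases monotonically but never exceeds 1, and never turns negative.
for tau in (1, 10, 100, 1000):
    print(tau, velocity_after_proper_time(1.0, tau))
```

So “turning the time direction” by accelerating harder or longer behaves like sliding along a hyperbola, not like sweeping out more of a circle.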

• I agree I would not be able to actually accomplish time travel. The point is whether we could construct some object in Minkowski space (or whatever General Relativity uses; I’m not a physicist) that we considered to be loop-like. I don’t think it’s worth my time to figure out whether this is really possible, but I suspect that something like it may be.

Edit: I want to say that I do not have an intuition for physics or spacetime at all. My main reason for thinking this is possible is that my idea is fairly minimal: I think you might be able to do this even in R^3.

• I have discovered recently that while I am generally tired and groggy in the morning, I am well rested and happy after a nap. I am unsure if this matches other people’s experiences, and I haven’t explored the research much. Still, I think this is interesting to think about fully.

What is the best way to apply this knowledge? I am considering purposely sabotaging my sleep so that I am tired enough to take a nap by noon, which would refresh me for the entire day. But this plan may have some significant drawbacks, including being excessively tired for a few hours in the morning.

• I’m assuming from context you’re universally groggy in the morning no matter how much sleep you get? (I.e., you’ve tried the obvious thing of just ‘sleep more’?)

• Pretty much, yes. Even with 10+ hours of sleep I am not as refreshed as after a nap. It’s weird, but I think it’s a real effect.

• Two easy things you can try to feel less groggy in the morning are:

• Drinking a full glass of water as soon as you wake up.

• Listening to music or a podcast (bluetooth earphones work great here!). Music does the trick for me, although when I’m not in the mood for it I prefer a podcast.

About taking naps: while they seem to work for some people, I’m generally against them, since napping greatly impairs my circadian clock (I cannot keep consistent times, and it meddles with my schedule too much).

At night, I take melatonin, and it seems to have been a great help in keeping the times at which I go to sleep consistent (taking it with L-Theanine seems to work even better for me somehow). Besides that, I pay a lot of attention to other zeitgebers such as exercise, eating behavior, light exposure, and coffee. This is to say: regulating your circadian clock may be what you’re looking for.

Links of interest are gwern’s post about his vitamin D experiment, as well as his other posts about sleep.