# riceissa (Issa Rice)

Karma: 1,056
• I’m not sure exactly what you’re trying to learn here, or what debate you’re trying to resolve. (Do you have a reference?)

I’m not entirely sure what I’m trying to learn here (which is part of what I was trying to express with the final paragraph of my question); this just seemed like a natural question to ask as I started thinking more about AI takeoff.

In “I Heart CYC”, Robin Hanson writes: “So we need to explicitly code knowledge by hand until we have enough to build systems effective at asking questions, reading, and learning for themselves. Prior AI researchers were too comfortable starting every project over from scratch; they needed to join to create larger integrated knowledge bases.”

It sounds like he expects early AGI systems to have lots of hand-coded knowledge, i.e. the minimum number of bits needed to specify a seed AI is large compared to what Eliezer Yudkowsky expects. (I wish people gave numbers for this so it’s clear whether there really is a disagreement.) It also sounds like Robin Hanson expects progress in AI capabilities to come from piling on more hand-coded content.

If ML source code is small and isn’t growing in size, that seems like evidence against Hanson’s view.

If ML source code is much smaller than the human genome, I can do a better job of visualizing the kind of AI development trajectory that Robin Hanson expects, where we stick in a bunch of content and share content among AI systems. If ML source code is already quite large, then it’s harder for me to visualize this (in this case, it seems like we don’t know what we’re doing, and progress will come from better understanding).

If the human genome is small, I think that makes a discontinuity in capabilities more likely. When I try to visualize where progress comes from in this case, it seems like it would come from a small number of insights. We can take some extreme cases: if we knew that the code for a seed AGI could fit in a 500-line Python program (I don’t know if anybody expects this), a FOOM seems more likely (there’s just less surface area for making lots of small improvements). Whereas if I knew that the smallest program for a seed AGI required gigabytes of source code, I feel like progress would come in smaller pieces.
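For a rough sense of scale in the comparison above, here is a back-of-envelope sketch. The genome figure is the standard public estimate (~3.2 billion base pairs at 2 bits each); the 500-line program is the hypothetical from the question, with an assumed average line length:

```python
# Back-of-envelope sizes for the comparison discussed above.
# All figures are rough estimates, not measurements.

GENOME_BASE_PAIRS = 3.2e9   # approximate human genome length
BITS_PER_BASE_PAIR = 2      # 4 possible bases -> 2 bits each

genome_bytes = GENOME_BASE_PAIRS * BITS_PER_BASE_PAIR / 8
print(f"Raw genome size: ~{genome_bytes / 1e6:.0f} MB")  # ~800 MB

# The hypothetical 500-line Python seed AGI, assuming ~60 bytes/line:
tiny_program_bytes = 500 * 60
print(f"500-line program: ~{tiny_program_bytes / 1e3:.0f} KB")  # ~30 KB

# Gap between the two extremes the question contrasts:
print(f"Ratio: ~{genome_bytes / tiny_program_bytes:,.0f}x")  # ~26,667x
```

Even the raw genome (most of which is presumably not "seed AI" content) sits four to five orders of magnitude above the 500-line extreme, which is the gap the question is asking people to place their estimates within.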

If an algorithm uses data structures that are specifically suited to doing Task X, and a different set of data structures that are suited to Task Y, would you call that two units of content or two units of architecture?

I’m not sure. The content/architecture split doesn’t seem clean to me, and I haven’t seen anyone give a clear definition. Specialized data structures seem like a good example of something that’s in between.

# [Question] Source code size vs learned model size in ML and in humans?

20 May 2020 8:47 UTC
11 points
• I’m confused about the tradeoff you’re describing. Why is the first bullet point “Generating better ground truth data”? It would make more sense to me if it said instead something like “Generating large amounts of non-ground-truth data”. In other words, the thing that amplification seems to be providing is access to more data (even if that data isn’t the ground truth that is provided by the original human).

Also in the second bullet point, by “increasing the amount of data that you train on” I think you mean increasing the amount of data from the original human (rather than data coming from the amplified system), but I want to confirm.

Aside from that, I think my main confusion now is pedagogical (rather than technical). I don’t understand why the IDA post and paper don’t emphasize the efficiency of training. The post even says “Resource and time cost during training is a more open question; I haven’t explored the assumptions that would have to hold for the IDA training process to be practically feasible or resource-competitive with other AI projects”, which makes it sound like the efficiency of training isn’t important.

• And I’ve seen Eliezer make the claim a few times. But I can’t find an article describing the idea. Does anyone have a link?

Eliezer talks about this in Do Earths with slower economic growth have a better chance at FAI? e.g.

Relative to UFAI, FAI work seems like it would be mathier and more insight-based, where UFAI can more easily cobble together lots of pieces. This means that UFAI parallelizes better than FAI.

• The addition of the distillation step is an extra confounder, but we hope that it doesn’t distort anything too much—its purpose is to improve speed without affecting anything else (though in practice it will reduce capabilities somewhat).

I think this is the crux of my confusion, so I would appreciate it if you could elaborate on this. (Everything else in your answer makes sense to me.) In Evans et al., during the distillation step, the model learns to solve the difficult tasks directly by using example solutions from the amplification step. But if it can do that, then why can’t it also learn directly from examples provided by the human?

To use your analogy, I have no doubt that a team of Rohins or a single Rohin thinking for days can answer any question that I can (given a single day). But with distillation you’re saying there’s a robot that can learn to answer any question I can (given a single day) by first observing the team of Rohins for long enough. If the robot can do that, why can’t the robot also learn to do the same thing by observing me for long enough?

# [Question] How does iterated amplification exceed human abilities?

2 May 2020 23:44 UTC
21 points
• I want to highlight a potential ambiguity, which is that “Newton’s approximation” is sometimes used to mean Newton’s method for finding roots, but the “Newton’s approximation” I had in mind is the one given in Tao’s Analysis I, Proposition 10.1.7, which is a way of restating the definition of the derivative. (Here is the statement in Tao’s notes in case you don’t have access to the book.)
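For readers without the book, the proposition in question can be paraphrased as follows (this is a paraphrase from memory, not Tao’s exact wording):

```latex
% Newton's approximation (paraphrasing Tao, Analysis I, Prop. 10.1.7):
% f is differentiable at x_0 with derivative L iff f is locally well
% approximated by the affine map x \mapsto f(x_0) + L(x - x_0).
f'(x_0) = L
\iff
\forall \varepsilon > 0 \;\exists \delta > 0 :\;
|x - x_0| \le \delta \implies
\bigl|\, f(x) - \bigl(f(x_0) + L\,(x - x_0)\bigr) \bigr| \le \varepsilon\,|x - x_0|
```

That is, differentiability at a point is equivalent to the error of the best linear approximation vanishing faster than |x − x₀|, which is why it “restates the definition of the derivative” rather than giving a root-finding procedure.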

• What is the plan going forward for interviews? Are you planning to interview people who are more pessimistic?

• In the first categorization scheme, I’m also not exactly sure what nihilism is referring to. Do you know? Is it just referring to Error Theory (and maybe incoherentism)?

Yes, Huemer writes: “Nihilism (a.k.a. ‘the error theory’) holds that evaluative statements are generally false.”

Usually non-cognitivism would fall within nihilism, no?

I’m not sure how the term “nihilism” is typically used in philosophical writing, but if we take nihilism=error theory then it looks like non-cognitivism wouldn’t fall within nihilism (just like non-cognitivism doesn’t fall within error theory in your flowchart).

I actually don’t think either of these diagrams places Nihilism correctly.

For the first diagram, Huemer writes “if we say ‘good’ purports to refer to a property, some things have that property, and the property does not depend on observers, then we have moral realism.” So for Huemer, nihilism fails the middle condition, and is therefore classified as anti-realist. For the second diagram, see the quote below about dualism vs monism.

I’m not super well acquainted with the monism/dualism distinction, but in the common conception don’t they both generally assume that morality is real, at least in some semi-robust sense?

Huemer writes:

Here, dualism is the idea that there are two fundamentally different kinds of facts (or properties) in the world: evaluative facts (properties) and non-evaluative facts (properties). Only the intuitionists embrace this.

Everyone else is a monist: they say there is only one fundamental kind of fact in the world, and it is the non-evaluative kind; there aren’t any value facts over and above the other facts. This implies that either there are no value facts at all (eliminativism), or value facts are entirely explicable in terms of non-evaluative facts (reductionism).

• Michael Huemer gives two taxonomies of metaethical views in section 1.4 of his book Ethical Intuitionism:

As the preceding section suggests, metaethical theories are traditionally divided first into realist and anti-realist views, and then into two forms of realism and three forms of anti-realism:

- Realism
    - Naturalism
    - Intuitionism
- Anti-Realism
    - Subjectivism
    - Non-Cognitivism
    - Nihilism

This is not the most illuminating way of classifying positions. It implies that the most fundamental division in metaethics is between realists and anti-realists over the question of objectivity. The dispute between naturalism and intuitionism is then seen as relatively minor, with the naturalists being much closer to the intuitionists than they are, say, to the subjectivists. That isn’t how I see things. As I see it, the most fundamental division in metaethics is between the intuitionists, on the one hand, and everyone else, on the other. I would classify the positions as follows:

- Dualism
    - Intuitionism
- Monism
    - Reductionism
        - Subjectivism
        - Naturalism
    - Eliminativism
        - Non-Cognitivism
        - Nihilism

• Do you have prior positions on relationships that you don’t want to get corrupted through the dating process, or something else?

I think that’s one way of putting it. I’m fine with my prior positions on relationships changing because of better introspection (aided by dating), but not fine with my prior positions changing because they are getting corrupted.

Intelligence beyond your cone of tolerance is usually a trait that people pursue because they think it’s “ethical”

I’m not sure I understand what you mean. Could you try re-stating this in different words?

• A question about romantic relationships: Let’s say currently I think that a girl needs to have a certain level of smartness in order for me to date her long-term/marry her. Suppose I then start dating a girl and decide that actually, being smart isn’t as important as I thought because the girl makes up for it in other ways (e.g. being very pretty/pleasant/submissive). I think this kind of change of mind is legitimate in some cases (e.g. because I got better at figuring out what I value in a woman) and illegitimate in other cases (e.g. because the girl I’m dating managed to seduce me and mess up my introspection). My question is: is this distinction real, and if so, is there any way for me to tell which situation I am in (legitimate vs illegitimate change of mind) once I’ve already begun dating the girl?

This problem arises because I think dating is important for introspecting about what I want, i.e. there is a point after which I can no longer obtain new information about my preferences via thinking alone. The problem is that dating is also potentially a values-corrupting process, i.e. dating someone who doesn’t meet certain criteria I think I might have means that I can get trapped in a relationship.

I’m also curious to hear if people think this isn’t a big problem (and if so, why).

• I have only a very vague idea of what you mean. Could you give an example of how one would do this?

# [Question] What are some exercises for building/generating intuitions about key disagreements in AI alignment?

16 Mar 2020 7:41 UTC
17 points