Thoughts on Ben Garfinkel’s “How sure are we about this AI stuff?”

I liked this talk by Ben.

I think it raises some very important points. OTTMH, I think the most important one is: We have no good critics. There is nobody I'm aware of who is seriously invested in knocking down AI-Xrisk arguments and qualified to do so. For many critics in machine learning (like Andrew Ng and Yann LeCun), the arguments seem obviously wrong or misguided, so they don't think it's worth their time to engage beyond stating that.

A related point, also important: We need to clarify and strengthen the case for AI-Xrisk. Personally, I think I have a very good internal map of the paths that arguments about AI-Xrisk can take and the types of objections one encounters. It would be good to have this as some form of flow chart. Let me know if you're interested in helping make one.

Regarding machine learning, I think he made some very good points about how the way ML works doesn't fit with the paperclip story. I think it's worth exploring the disanalogies more and seeing how they affect various Xrisk arguments.

As I reflect on what's missing from the conversation, I always feel the need to make sure it hasn't been covered in Superintelligence. When I read it several years ago, I found Superintelligence to be remarkably thorough. For example, I'd like to point out that FOOM isn't necessary for a unilateral AI takeover, since an AI could be progressing gradually in a box, and then break out of the box already superintelligent; I don't remember if Bostrom discussed that.

The point about justification drift is quite apt. For instance, I think the case for MIRI's viewpoint increasingly relies on:

1) optimization daemons (aka "inner optimizers")

2) adversarial examples (i.e., current ML systems seem to learn superficially similar but deeply flawed versions of our concepts)

TBC, I think these are quite good arguments, and I've personally come to appreciate them much more over the last several years. But I consider them far from conclusive, due to our current lack of knowledge/understanding.

One thing I didn't quite agree with in the talk: I think he makes a fairly general case against trying to impact the far future. I think the magnitude of the impact and our uncertainty about its direction mostly cancel each other out, so even if we are highly uncertain about what effects our actions will have, it's often still worth making guesses and using them to inform our decisions. He basically acknowledges this.