AI Alignment Writing Day 2019

On August 22nd, 2019, all attendees of the MIRI Summer Fellows Program were given an entire day to write blog posts for the AI Alignment Forum about ideas they’d been thinking about. These are the resulting posts, in chronological order.

Announcement: Writing Day Today (Thursday)

Markets are Universal for Logical Induction

Computational Model: Causal Diagrams with Symmetry

Intentional Bucket Errors

Embedded Naive Bayes

Time Travel, AI and Transparent Newcomb

Logical Counterfactuals and Proposition graphs, Part 1

[Question] Why so much variance in human intelligence?

Towards a mechanistic understanding of corrigibility

Logical Optimizers

Deconfuse Yourself about Agency

Thoughts from a Two Boxer

Analysis of a Secret Hitler Scenario

The “Commitment Races” problem

[Question] Does Agent-like Behavior Imply Agent-like Architecture?

Redefining Fast Takeoff

Vaniver’s View on Factored Cognition

Embedded Reflection

Tabooing ‘Agent’ for Prosaic Alignment

Creating Environments to Design and Test Embedded Agents

Formalising decision theory is hard

Vague Thoughts and Questions about Agent Structures

Towards an Intentional Research Agenda

Metalignment: Deconfusing metaethics for AI alignment.

Torture and Dust Specks and Joy—Oh my! or: Non-Archimedean Utility Functions as Pseudograded Vector Spaces

Soft takeoff can still lead to decisive strategic advantage

Algorithmic Similarity

Thoughts on Retrieving Knowledge from Neural Networks

Parables of Constraint and Actualization

When do utility functions constrain?

Actually updating

Understanding understanding

Troll Bridge

Optimization Provenance