Question about this part:
> I do think MIRI “at least temporarily gave up” on personally executing on technical research agendas, or something like that, but, that’s not the only type of output.
So, I’m sure various people have thought about this a lot, but just to ask the obvious dumb question: are we sure this is even a good idea?
Let’s say the hope is that at some time in the future, we’ll stumble across an Amazing Insight that unblocks progress on AI alignment. At that point, it’s probably good to be able to execute quickly on turning that insight into actual mathematics (and then later actual corrigible AI designs, and then later actual code). It’s very easy for knowledge of “how to do things” to be lost, particularly technical knowledge. [1] Humanity loses this knowledge on a generational timescale, as people die, but it’s possible for institutions to lose knowledge much more quickly due to turnover. All that just to say: Maybe MIRI should keep doing some amount of technical research, just to “stay in practice”.
My general impression here is that there’s plenty of unfinished work in agent foundations and decision theory, things like:

- How do we actually write a bounded program that implements something like UDT?
- How do we actually do calculations with logical decision theories, such that we can get answers out for basic game-theory scenarios? (Even something as simple as the ultimatum game is unsolved, IIRC; see the toy sketch below for the kind of calculation I mean.)
- What are some common-sense constraints the program-value-functions should obey (e.g. how should we value a program that simulates multiple other programs)?

These all seem likely to be relevant to alignment, and also intrinsically worth doing.
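To gesture at the kind of ultimatum-game calculation I have in mind, here’s a toy sketch (my own illustration, not anything from MIRI’s work). It assumes the proposer perfectly predicts the responder’s policy and best-responds to it, and restricts the responder to policies of the form “accept iff offer ≥ threshold”. It’s just brute-force policy enumeration, not an implementation of UDT or any logical decision theory, and it ignores everything that makes the real problem hard (logical counterfactuals, uncertainty about the other player, mixed strategies).

```python
# Toy policy-selection calculation for the ultimatum game.
# Assumptions (mine, for illustration only): the proposer perfectly predicts the
# responder's policy and best-responds to it (ties broken in favor of making an
# acceptable offer), and the responder's policies are limited to
# "accept iff offer >= threshold". Plain enumeration, not UDT.

PIE = 10  # total amount to be split

def proposer_best_offer(threshold: int) -> int:
    """Offer chosen by a proposer who knows the responder accepts iff offer >= threshold."""
    # Any offer below the threshold is rejected (proposer gets 0), so the proposer
    # offers exactly the threshold and keeps PIE - threshold.
    return threshold

def responder_payoff(threshold: int) -> int:
    """Responder's payoff from committing to the policy 'accept iff offer >= threshold'."""
    offer = proposer_best_offer(threshold)
    return offer if offer >= threshold else 0

if __name__ == "__main__":
    # A responder who picks a policy *before* seeing the offer (updateless-style,
    # facing a predictor) does best with a demanding threshold, because the
    # predicting proposer caves to it.
    for t in range(PIE + 1):
        print(f"threshold={t:2d}  responder gets {responder_payoff(t)}")
    # Contrast: a responder who decides *after* seeing the offer accepts any
    # positive amount, so a predicting proposer offers the minimum (the classical
    # subgame-perfect answer).
```

The point is just that even this stripped-down setup forces you to say what the proposer’s “prediction” counterfactually depends on, which (as far as I understand) is exactly where the logical-decision-theory formalisms still get stuck.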
[1] This talk is relevant: https://www.youtube.com/watch?v=ZSRHeXYDLko
Ah got it, thanks for the reply!