For example, I believe @abramdemski really wants to implement a version of UDT and @Vanessa Kosoy really wants to implement an IBP agent. They are both working on a normative theory which they recognize is currently slightly idealized or incomplete, but I believe that their plan routes through developing that theory to the point that it can be translated into code. Another example is the program synthesis community in computational cognitive science (e.g. Josh Tenenbaum, Zenna Tavares). They are writing functional programs to compete with deep learning right now.
For a criticism of this mindset, see my discussion earlier in this sequence of why glass-box learners are not necessarily safer. Relatedly, I suspect it will be rather hard to invent a nice paradigm that takes the lead from deep learning. However, I am glad people are working on it and I hope they succeed, and I don’t mean that in an empty way. I dabble in this quest myself; I even have a computational cognitive science paper.
For what it’s worth, IBP avoids the issue of glass-box learners not necessarily being safe by focusing on desiderata rather than specifically focusing on algorithms.
In particular, you could in principle prove things about black boxes, so long as the black box satisfies some desiderata, rather than trying to white-box the algorithm and prove things about that.
@Steven Byrnes has talked about this before:
https://www.lesswrong.com/posts/SzrmsbkqydpZyPuEh/my-take-on-vanessa-kosoy-s-take-on-agi-safety
I read the post and I see what you’re pointing at, but naively I am still left with the same impression. Perhaps more serious study of IB will change my mind.