What do you think have been the most important applications of UDT or other decision theories to alignment?
I see their application as indirect at this point: mainly showing that decision theory is hard and that we're unlikely to get it right without an AI pause/stop. See these two posts for a better sense of what I mean:
https://www.lesswrong.com/posts/JSjagTDGdz2y6nNE3/on-the-purposes-of-decision-theory-research
https://www.lesswrong.com/posts/wXbSAKu2AcohaK2Gt/udt-shows-that-decision-theory-is-more-puzzling-than-ever