Huh! I’ll have to listen to the podcast, but my first reaction is that this is really interesting and seems likely to give me a sense of where he’s at regarding asymptotic alignment, and of how to communicate my model differences to people in his cluster. I can totally see how an easy coarse-graining of what MIRI is saying, “it’s impossible to make safe”, would feel annoying to someone who feels they have a research framework for how to succeed. I do think he’s missing something, roughly: the need for alignment to be strongly robust to competitive pressures that tend to redirect a mind toward ruthless, strategic power-seeking behavior (see also “can good compete?”). But it’s interesting to see the ways he misrepresents what he’s hearing, because they seem to me to point at possible misunderstandings that, if made very clear, could communicate more of what MIRI is worried about in a way he might be more able to use to direct research at Anthropic usefully. Excited to watch this!