I understand how one could come away from this piece with that impression, but I don’t think we are making the particular cognitive mistake you are pointing to.
Does Stockfish “want” to win at chess, in the sense that I want a piece of pizza? Of course not! But it still pursues that end doggedly, just as a human would if they desperately wanted to win the game (and were superhuman at chess). We really aren’t dragging in consciousness or human-like traits here; we just, as a species, don’t have elegant language for describing ‘pursuit of an end’ that isn’t tied up with a bunch of messy conceptual stuff about people.
This is an example of the kind of misunderstanding that’s really hard to safeguard readers against. The book does a much better job at this than this piece, and better than I have in this comment, but it also has 200+ pages in which to do it.
Thanks for getting back to me. Your pizza example perfectly captures what I’ve been grappling with: I’m still trying to fully wrap my head around why an AI would “want” to deceive us or plot our extinction. I also appreciate (and agree) that there’s no need to invoke human-like traits, agency, or consciousness here, since we’re talking about something entirely different from the way humans pursue goals. That said, I think, as you point out, that the fact that we lack precise language for describing this kind of “goal pursuit” can lead to misunderstandings (for me and perhaps others) and, more importantly, as you mention in the article, could make it easier for some to dismiss x-risk concerns. I’m looking forward to reading the book to see how you navigate this!