Epistemic status: minimal. Mostly feelings.
Sometimes I read people saying that Yudkowsky-2008 didn’t say this, the post wasn’t about that, and so forth, including Yudkowsky himself. Not with evidence, not with a better reading, just a denial. Perhaps people are overestimating how accurately their brains have maintained a model of what Yudkowsky wrote 10+ years ago. If Alice read the sequences in 2008 and Bob read the sequences in 2024, then Bob has a better model of what the sequences said. Evidence and arguments screen off authority.
But more importantly to me (here come the feelings), these defensive anti-interpretations of the sequences are boring and narrow and ugh. By positing that multiple apparently literate people misread the sequences both at the time (read the old comments) and today (read the new comments), they paint a picture of young Yudkowsky as a bad writer who attracted bad readers.
As I read it, the Hidden Complexity of Wishes is a glorious parable about AI and genies that paints graphic images of failure cases and invites both thought and imagination from the reader. As Yudkowsky-2024 tells me to read it, it is just making a point about the algorithmic complexity of human values. Yeah, I deny that, Death of the Author and all.
Likewise, as I read it, Magical Categories is about, well, categories. Categories that matter for capabilities and for alignment and for humans. It’s part of a network of rationalist thought that has ripples today in discussions about gender, adversarial examples, natural abstractions, and more. As others read it, Magical Categories is always in every instance talking about getting a shape into the AI’s preferences, never some other thing.
No thanks. Where recursive justification hits bottom is this: I read LessWrong with my brain; it’s the only one I have.