Maybe there is something deeper you are trying to say
But really, since we’re making the implicit explicit, what I mean is that the book is bad, with the humor being that it’s sufficiently bad to require this.
I’m actually genuinely quite disappointed; I was hoping it would be the definitive contemporary edition of the MIRI argument in the vein of Bostrom 2014. Instead I will still have to base anything I write on Bostrom 2014, the Arbital corpus, miscellaneous Facebook posts, sporadic LessWrong updates, and podcast appearances.
This isn’t just me thinking disagreement means it’s bad, either. In the vast majority of places where I would say it’s bad, I agree with the argument it’s trying to make and find myself flabbergasted that it would be made this way. The prime-number-of-stones example for “demonstrating” the fragility of value is insane; it actually comes off as green ink or GPT base model output. It seems to take it for granted that the reader can obviously think of a prime-number-of-stones type of intrinsic value in humans, and since I can’t think of one offhand (sexual features?), I have to imagine most readers can’t either. It also doesn’t seem to consider that the more arbitrary and incompressible a value is, the less obviously important it is to conserve. A human is a monkey with a sapient active learning system, and more and more of our expressed preferences are sapience-maximizing over time. I understand the point it’s trying to make: yes, obviously, if you have a paperclipper it will not suddenly decide to be something other than a paperclipper. But if I didn’t already believe that, I would find this argument absurd and off-putting.
So far as I can tell from jumping around in it, the entire book is like this.
This is a valid line of critique but seems moderately undercut by its prepublication endorsements, which suggest that the arguments landed pretty ok. Maybe they will land less well on the rest of the book’s target audience?
(re: Said & MIRI housecleaning: Lightcone and MIRI are separate organizations and MIRI does not moderate LessWrong. You might try to theorize that Habryka, the person who made the call to ban Said back in July, was attempting to do some 4d-chess PR optimization on MIRI’s behalf months ahead of time, but no, he was really nearly banned multiple times over the years and he was finally banned this time because Habryka changed his mind after the most recent dust-up. Said practically never commented on AI-related subjects, so it’s not even clear what the “upside” would’ve been. From my perspective this type of thinking resembles the constant noise on e.g. HackerNews about how [tech company x] is obviously doing [horrible thing y] behind-the-scenes, which often aren’t even in the company’s interests, and generally rely on assumptions that turn out to be false.)
My honest impression, though I could be wrong and didn’t analyze the prepublication reviews in detail, is that there is very much demand for this book, in the sense that there are a lot of people who are worried about AI for agent-foundations-shaped reasons and want an introduction they can give to friends and family who don’t care that much.
https://x.com/mattyglesias/status/1967765768948306275?s=46
For example, I think this review from Matt Yglesias makes the point fairly explicit? He obviously has a preexisting interest in this subject and is endorsing the book because he wants the subject to get more attention; that doesn’t necessarily mean the book is good. I in fact agree with a lot of the book’s basic arguments, but I think I would not be remotely persuaded by this presentation if I weren’t already inclined to agree.
Obviously just one example, but Schneier has generally been quite skeptical, and he blurbed the book.
This is true, but many of the surprising prepublication reviews are from people who I don’t think were already up-to-date on these AI x-risk arguments (or at least hadn’t given any prior public indication of their awareness, unlike Matt Y).
I am dismayed but not surprised, given the authors. I’d love to see the version edited by JDP’s mind(s) and their tools. I’m almost certain it would be out of anyone’s price range, but what would it cost to buy JDP+AI hours sufficient to produce an edited version?
I have also been trying to communicate it better, from the perspective of someone who actually put in the hours watching the arXiv feed. I suspect you’d do it better than I would. But some ingredients I’d hope to see you ingest[ed already] for use:
https://www.lesswrong.com/posts/9kNxhKWvixtKW5anS/you-are-not-measuring-what-you-think-you-are-measuring
https://www.lesswrong.com/posts/gebzzEwn2TaA6rGkc/deep-learning-systems-are-not-less-interpretable-than-logic
https://www.lesswrong.com/posts/Rrt7uPJ8r3sYuLrXo/selection-has-a-quality-ceiling
probably some other Wentworth stuff
I thought I had more to link, but it’s not quite coming to mind. Oh right, this one! https://www.lesswrong.com/posts/evYne4Xx7L9J96BHW/video-and-transcript-of-talk-on-can-goodness-compete
I have now written a review of the book, which touches on some of what you’re asking about. https://www.lesswrong.com/posts/mztwygscvCKDLYGk8/jdp-reviews-iabied