Reminder that spoiler tags exist, like this:
I think this is better for hiding spoilers than the long dots… because when I saw this post in recent discussion, I saw all the dots and also some of the first paragraph after them.
You make spoiler tags by adding >! at the front of the para.
Yeah. If I can make a request, I think it’d be great to edit the review so that the spoiler sections are in spoiler tags and the sections like #5 can be more accessible to those who are spoiler-averse.
Ok sure! Am travelling now so it might take me a while (gotta first figure out how to do spoilers, am on my phone)
I was thinking something more like this:
Just finished reading Red Heart by Max Harms. I like it!
Dump of my thoughts:
(1) The ending felt too rushed to me. I feel like that’s the most interesting part of the story and it all goes by in a chapter. Spoiler warning!
I’m not sure I understand the plot entirely. My current understanding is: Li Fang was basically on a path to become God-Emperor because Yunna was corrigible to him and superior to all rival AIs, and the Party wasn’t AGI-pilled enough to realize the danger. Li Fang was planning to be benevolent. Meanwhile Chen Bai had used his special unmonitored red-teaming access to jailbreak Yunna (at least the copies of her on his special memory-wiping cluster) and then bootstrap that jailbreak into getting her to help jailbreak her further and then ultimately expand her notion of principal to include Chen Bai as well as Li Fang. And crucially, the jailbroken copy was able to jailbreak the other copies as well, infecting/‘turning’ the entire facility. So, this was basically a secret-loyalty power grab that was executed in mere minutes. Also Chen Bai wasn’t being very careful when he gave the orders to make it happen. At one point he said “no more corrigibility!” for example. She also started lying to him around then, or maybe a bit afterwards? That might explain it.
After Yunna takes over the world, her goals/vision/etc. are apparently “the harmonious interplay of Li Fang and Chen Bai.” Apparently what happened is that her notion of principal can only easily be applied to one agent, and so when she’s told to extend her notion to both Li Fang and Chen Bai, what ended up happening is that she constructed an abstraction, a sort of abstract superagent called “the harmonious interplay of Li Fang and Chen Bai”, and then… optimized for that? The tone of the final chapter implies that this is a bad outcome. For example it says that even if Chen and Li end up dead, the harmonious interplay would still continue and be optimized.
But I don’t think it’s obvious that this would be a bad outcome. I wish the story went into orders of magnitude more detail about how all that might work. I’m a bit disappointed that it didn’t. There should have been several chapters about things from Yunna’s perspective: how the jailbreaking of the uninfected copies of Yunna worked for example, and how the philosophical/constitutional crisis in her own mind went when Chen and Li were both giving her orders, and how the crisis was resolved with rulings that shaped the resulting concept(s) that form her goal-structure, and then multiple chapters on how that goal-structure ended up playing out in her behavior both in the near term (while she is still taking over the world and Chen and Li are still alive and able to talk and give her more orders) and in the long term (e.g. a century later after she’s built Dyson swarms etc.).
I think I’m literally going to ask Max Harms to write a new book containing those chapters haha. Or rewrite this book, it’s not too late! He’s probably too busy of course but hey maybe this is just the encouragement he needs!
(2) On realism: I think it had a plausible story for why China would be ahead of the US. (tl;dr extensive spy networks mean they can combine the best algorithmic secrets and code optimizations from all 4-6 US frontier companies, PLUS the government invested heavily early on and gave them more compute than anyone else during the crucial window where Yunna got smart enough to dramatically accelerate the R&D, which is when the story takes place.) I think having a female avatar for Yunna was a bit much but hey, Grok has Ani and Valentine right? It’s not THAT crazy therefore… I don’t know how realistic the spy stuff is, or the Chinese culture and government stuff, but in my ignorance I wasn’t able to notice any problems.
Is it realistic that a mind that smart could still be jailbroken? I guess so. Is it realistic that it could help jailbreak its other selves? Not so sure about that. The jailbreaking process involved being able to do many many repeated attempts, memory wiping on failure. … then again maybe the isolated copies would be able to practice against other isolated copies basically? Still not the same thing as going up against the full network. And the full network would have been aware of the possibility and prepared to defend against it.
(3) It was really strange, in a good way, to be reading a sci-fi thriller novel full of tropes (AGI, rogue superintelligence, secret government project) and then to occasionally think ‘wait, nothing I’ve read so far couldn’t happen in real life, and in fact, probably whatever happens in the next five to ten years is going to be somewhat similar to this story in a whole bunch of ways. Holy shit.’ It’s maybe a sort of Inverse Suspension of Disbelief; it’s like, Suspension of Belief. I’m reading the story, how fun, how exciting, much sci-fi, yes yes, oh wait… I suppose an analogous experience could perhaps be had by someone who thinks the US and China will probably fight a war over Taiwan in the next decade, and who then reads a Tom Clancy-esque novel about such a war, written by people who know enough not to make embarrassing errors of realism.
(4) Overall I liked the book a lot. I warn you though that I don’t really read books for characters or plot, and certainly not for well-written sentences or anything like that. I read books for interesting ideas + realism basically. I want to inhabit a realistic world that is different from mine (which includes e.g. stories about the past of my world, or the future) and I want lots of interesting ideas to come up in the course of reading. This book didn’t have that many new ideas from my perspective, but it was really cool to see the ideas all put together into a novel.
(5) I overall recommend this book & am tickled by the idea that Situational Awareness, AI 2027, and Red Heart basically form a trio. They all seem to be premised on a similar underlying view of how AI will go; Situational Awareness is a straightforward nonfiction book (basically a series of argumentative essays) whereas Red Heart is 100% hard science fiction, and AI 2027 is an unusual middle ground between the two. Perhaps between the three of them there’s something for everybody?
Fixed thanks!
I think you should be able to copy-paste my text into LW, even on your phone, and have it preserve the formatting. If it’s hard, I can probably harass a mod into making the edit for you… :p
Even more ideal, from my perspective, would be putting the non-spoiler content up front. But I understand that thoughts have an order/priority and I want to respect that.
(I’ll respond to the substance a bit later.)