I watched the first episode of Pluto (about 1 hour long), and the second part of it is entirely about a blind old pianist and his robot butler, North N.2. I liked that part a lot and wanted to share a few interesting things that are in it (free of important spoilers):
1. The pianist kinda hates the robot, he’s rude to it, and he’s convinced the robot can’t “truly” sing or play piano. Everything music-wise that comes out of the robot must be soulless.
2. The robot doesn’t mind the rudeness, but it’s also slightly adversarial to the pianist. It has its own goal of wanting to learn the piano. Despite the pianist’s request that the robot not touch the piano, it does so anyway and repeatedly asks the pianist to teach it to play. But the robot clearly has the pianist’s interest at heart too. It goes out of its way to help him, both in straightforward robot-butler ways and in more nuanced, unexpected ways that require more independent agency.
3. The robot’s adversarialness ends up helping the old pianist, too.
Part of why I liked the episode is the robot’s alignment. It has its own thing going on, but it also has the pianist’s interest at heart, and it can disobey the pianist and do its own thing while both of them still end up better off.
Opinions about the plausibility of achieving this type of alignment may vary, but as a thing to aim for, it seems quite decent? You get AI that cares about humans, but it’s also endowed with its own independence. I don’t think it’s that far from what Anthropic has been trying to do lately, either. They seem to care about what their AIs desire.
The whole part is just quite beautiful for other reasons that are less relevant here, and I would definitely recommend it. I didn’t like the first part of the episode that much, though; it is almost completely unrelated and has different characters.