I was thinking: it would be super cool if (say) Alexander Wales wrote the AGI’s personality, but that would also sort of make him one of the most significant influences on how the future goes. I mean, AW also wrote my favorite vision of utopia (major spoiler), so I kind of trust him, but I know at least one person who dislikes that vision, and I’d feel uncomfortable about imposing a single worldview on everybody.
One possibility is to give the AI multiple personalities, each representing a different person or worldview, which all negotiate with each other somehow. One simple but very ambitious idea is to try to simulate every person in the world—that is, the AI’s calibrated expectation of a randomly selected person.
Also known as a base model ;)
(although that’s only ‘every person in the training data’, which definitely isn’t ‘every person in the world’, and even people who are in the data are represented to wildly disproportionate degrees)
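To make the “personas that negotiate” idea slightly more concrete, here’s a toy sketch of one way it could go. Everything here is my own illustrative assumption, not anything any lab actually does: the persona list, the stubbed `complete()` call (a stand-in for whatever LLM API you’d use), and the draft-revise-synthesize loop are all made up for the example.

```python
# Hypothetical sketch of "multiple personalities that negotiate":
# each persona drafts an answer, revises after seeing the others' drafts,
# and a final call synthesizes something they could all live with.
# complete() is a stand-in for a real LLM API call; it's stubbed here
# so the script runs as-is.

from typing import Callable

# Illustrative persona list -- who gets a seat is exactly the hard question.
PERSONAS = {
    "utilitarian": "You weigh total welfare above all else.",
    "libertarian": "You prize individual consent and autonomy.",
    "traditionalist": "You defer to long-standing human institutions.",
}

def make_stub_llm() -> Callable[[str], str]:
    """Placeholder LLM: returns a canned line. Swap in a real API call."""
    def complete(prompt: str) -> str:
        return f"[response to: {prompt[:40]}...]"
    return complete

def negotiate(question: str, complete: Callable[[str], str], rounds: int = 2) -> str:
    """Draft, then revise with visibility into the other personas' drafts,
    then merge everything into a single answer."""
    drafts = {
        name: complete(f"{system}\nQuestion: {question}\nAnswer:")
        for name, system in PERSONAS.items()
    }
    for _ in range(rounds - 1):
        transcript = "\n".join(f"{n}: {d}" for n, d in drafts.items())
        drafts = {
            name: complete(
                f"{system}\nQuestion: {question}\n"
                f"Other personas said:\n{transcript}\n"
                f"Revise your answer, conceding points where warranted:"
            )
            for name, system in PERSONAS.items()
        }
    transcript = "\n".join(f"{n}: {d}" for n, d in drafts.items())
    return complete(
        f"Synthesize a single answer all personas could accept:\n{transcript}"
    )

if __name__ == "__main__":
    llm = make_stub_llm()
    print(negotiate("Should the AGI impose a single vision of utopia?", llm))
```

Of course, this dodges the actual hard part: who picks the persona list, and how the personas are weighted, which is just the “imposing a worldview” problem again one level up.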
That fictionalization of Claude is really lovely, thank you for sharing it.
I’m sure the labs have plenty of ambitious ideas to be implemented at some more convenient time, and that is exactly the root of the problem nostalgebraist points out: this isn’t a “future” issue but a clear and present one, even if nobody responsible is particularly eager to acknowledge it and start making difficult decisions now.