One assumption that stands out to me as a little questionable is the idea that Cindy will, with infinite simulated time to think, eventually manage to come up with a solution to the alignment problem. (This is compounded by the fact that she’s regularly brain-wiped and can only preserve insights by cramming them into the 1 gigabyte of scratch paper afforded to her.)
1GB of text is a lot. Naively, that's a billion ASCII characters, and considerably more if you compress. Or you could maybe just do some kind of magic where the question contains a link to a wiki on the (simulated) internet?
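The back-of-envelope arithmetic here can be checked with a short sketch. Note the repetitive sample text below compresses far better than real prose would (English typically manages only ~2–4x with zlib), so treat the measured ratio as a loose upper bound:

```python
import zlib

GB = 10**9  # 1 gigabyte = one billion bytes; one byte per ASCII character

# Stand-in for prose. Being highly repetitive, it compresses much better
# than real English text, so the ratio below overstates realistic gains.
sample = (
    "the quick brown fox jumps over the lazy dog and keeps on running " * 1000
).encode("ascii")

compressed = zlib.compress(sample, level=9)
ratio = len(sample) / len(compressed)

print(f"Uncompressed capacity: {GB:,} characters")
print(f"Sample compression ratio: {ratio:.1f}x")
print(f"Effective capacity at that ratio: {int(GB * ratio):,} characters")
```

So even without compression the scratch space holds on the order of a billion characters, i.e. thousands of book-length documents' worth of insights.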
If you have infinite time, you can go the monkeys-on-typewriters route: one of them will eventually type something decent, unless an egregore gets them, or something. Though that's very unlikely to be needed: assuming alignment is solvable by a human-level intelligence at all (an assumption doing a lot of work here), it should eventually be solved.