Thane Ruthenis comments on AGI Ruin: A List of Lethalities

Thane Ruthenis 15 Jun 2022 23:00 UTC
4 points
Mmm. If the CYOA idea is implemented as a quirky-but-primarily-educational article, then sure, integrating the “adapt to feedback” capability like this would be worthwhile. Might also attach a monetary prize to submitting valuable ideas, by analogy to the ELK contest.
For a game-like implementation, where you’d be playing it partly for the fun/challenge of it, that wouldn’t suffice. The feedback loop’s too slow, and there’d be an ugh-field around the expectation that submitting a proposal would then require arguing with the moderators about it, defending it. It wouldn’t feel like a game.
It’d make the upkeep cost pretty high, too, without a corresponding increase in the pay-off.
Just making it open-ended might work, even without the moderation engine? Track how many branches the player has explored; once they’ve explored a lot (i.e., are expected to “get” the full scope of the problem), an option appears for something like “I really don’t know what to do, but we should keep trying”, leading to some appropriately subtle and well-integrated call to support alignment research?
Not excited about this approach either.
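For concreteness, the branch-tracking mechanic described above could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `Playthrough` class, the `unlock_threshold` parameter, and the branch IDs are hypothetical, and the threshold that counts as having explored "a lot" would need tuning against the actual game.

```python
from dataclasses import dataclass, field

# Text of the late-game option, quoted from the comment above.
KEEP_TRYING = "I really don't know what to do, but we should keep trying"


@dataclass
class Playthrough:
    """Tracks which branches one player has explored (hypothetical structure)."""

    unlock_threshold: int  # assumed tunable: how many distinct branches count as "a lot"
    explored: set[str] = field(default_factory=set)

    def visit(self, branch_id: str) -> None:
        # A set means revisiting a branch does not count twice.
        self.explored.add(branch_id)

    def options(self, base_options: list[str]) -> list[str]:
        # Offer the extra option only once enough distinct branches were seen.
        opts = list(base_options)
        if len(self.explored) >= self.unlock_threshold:
            opts.append(KEEP_TRYING)
        return opts


# Usage: the option stays hidden until three distinct branches are explored.
game = Playthrough(unlock_threshold=3)
for branch in ["boxing", "oracle", "boxing"]:
    game.visit(branch)
before = game.options(["Try another proposal"])
game.visit("corrigibility")
after = game.options(["Try another proposal"])
```

Counting distinct branches rather than total visits keeps a player from unlocking the option by replaying the same path.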