I love this analogy, I have been observing this split mode of viewing the challenge of alignment but haven’t felt like I had a good way of discussing it.
I also worry that puzzle-mode is somewhat more likely than RPG-mode, and yet a lot of researchers seem to be exclusively focused on RPG-mode.
I think that even if we are in a puzzle-mode world, there is a way to ‘level up’ puzzle-solving: you can devise conceptual tools for recurring patterns of puzzles. In actual puzzle games, this is a big part of how I go from solving the easy puzzles to solving the hard ones. I build up a mental toolkit while tackling the types of puzzles I encounter in the early game, and that toolkit is what lets me tackle the harder puzzles successfully.
Of course, to the outside world, building a mental toolkit looks a lot like just successfully solving easy puzzles and then staring off into space.
I worry that this contributes to why researchers are more focused on the RPG-mode approach. Dreaming up frameworks for solving speculative future puzzles may just be too hard to look like ‘a hard-working scientist making measurable progress’ while doing it.
One good example of puzzle mode: you mentioned that research groups appear to be closing in on a way to let AI hold arbitrarily long/complicated/large problems in its memory.
If any of that research does work, then you essentially have a universal Turing machine, which is arguably equivalent to AGI.
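To make the intuition concrete (this is my own illustrative sketch, not something from the post): the classic result is that universality needs only a finite controller plus an unbounded tape. That is why "an AI with effectively unlimited working memory" is such a suggestive milestone.

```python
# Tiny Turing machine simulator (illustration only).
# A finite rule table plus an unbounded tape is all universality requires.
from collections import defaultdict

def run_tm(rules, tape, state="start", max_steps=1000):
    """rules: (state, symbol) -> (new_state, write_symbol, move in {-1, +1})."""
    # defaultdict gives us an effectively unbounded tape; blank cells read "_".
    t = defaultdict(lambda: "_", enumerate(tape))
    pos = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        state, t[pos], move = rules[(state, t[pos])]
        pos += move
    return "".join(t[i] for i in sorted(t))

# A toy machine that flips every bit, then halts on the first blank.
flip = {
    ("start", "0"): ("start", "1", +1),
    ("start", "1"): ("start", "0", +1),
    ("start", "_"): ("halt", "_", +1),
}
print(run_tm(flip, "0110"))  # -> 1001_
```

The toy machine's controller is just three rules; all the "capacity" lives in the tape, which is the analogue of the unbounded AI memory being discussed.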