Vladimir_Nesov comments on Rant on Problem Factorization for Alignment

Vladimir_Nesov 7 Aug 2022 0:50 UTC
3 points
This seems true, but not central. It’s more like generalized prompt engineering: bureaucracies amplify certain aspects of behavior to generate data for training models that are better at (or more robustly bound by) those aspects. So this is useful even if the building blocks are already AGIs, except insofar as deceptive alignment could make it ineffective. The central use is to amplify alignment properties of behavior with appropriate bureaucracies, then retrain the models on the bureaucracies’ output.

If applied to capability at solving problems, this is a step towards AGI (and marks the approach as competitive). My impression is that Paul believes this application is feasible to a greater extent than most other people do, and by extension expects other bureaucracies to do sensible things at a lower level of agent capability. But this is more plausible than a single infinite bureaucracy working immediately with a weak agent, because all it needs is to improve things at each IDA cycle, making the agents in the bureaucracy a little more capable and aligned, even if they are still a long way from solving difficult problems.
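
To make the "improve things at each IDA cycle" point concrete, here is a toy sketch of the amplify-then-retrain loop. Everything in it is an illustrative assumption rather than anything from the comment or from an actual IDA implementation: an agent is reduced to a single number, and `amplify`, `distill`, `gain`, and `retention` are hypothetical names with arbitrary values. The only thing it shows is that the loop makes progress whenever the net effect of one cycle is an improvement, however small.

```python
from typing import List


def amplify(capability: float, gain: float = 1.10) -> float:
    """Hypothetical: a bureaucracy of agent copies does a bit better than one agent."""
    return capability * gain


def distill(capability: float, retention: float = 0.97) -> float:
    """Hypothetical: retraining on the bureaucracy's output keeps most of that gain."""
    return capability * retention


def run_ida(initial: float, cycles: int) -> List[float]:
    """Run the toy loop and return the agent's level after each cycle."""
    trajectory = [initial]
    agent = initial
    for _ in range(cycles):
        # One cycle: build the bureaucracy, then train a new agent on its output.
        agent = distill(amplify(agent))
        trajectory.append(agent)
    return trajectory


# Arbitrary starting level; with gain * retention > 1 each cycle is a small step up.
trajectory = run_ida(initial=1.0, cycles=5)
print(trajectory)
```

With these made-up numbers no single cycle gets anywhere near solving difficult problems, which matches the argument above: the requirement is only a per-cycle improvement, not a bureaucracy that works immediately with a weak agent.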