Since I recently wrote an article endorsing Factorization as an alignment approach, I feel like I should respond here.
Everyone who proposes Factorization agrees there is a tradeoff between factorization and efficiency. The question is, how bad is that tradeoff?
Factorization is not a solution to the problem of general intelligence. However, there are many problems that we should reasonably expect can be factorized.
Each human having five minutes with a Google doc does not seem like a good way to factorize problems.
John seems wrongly pessimistic about the “Extremely Long Jury Trial”. We know from math that “you prove something, I check your work” is an extremely powerful framework. I would expect this to be true in real life as well.
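To make the prove/check asymmetry concrete, here is a toy illustration of my own (not from the original discussion): verifying a claimed prime factorization takes far less work than finding one, which is the asymmetry that makes "you prove something, I check your work" so powerful.

```python
def check_factorization(n, factors):
    """Verify that `factors` are all prime and multiply back to n."""
    def is_prime(k):
        if k < 2:
            return False
        return all(k % d for d in range(2, int(k ** 0.5) + 1))

    product = 1
    for f in factors:
        if not is_prime(f):
            return False
        product *= f
    return product == n

# The prover may have searched a large space to find [7, 13];
# the verifier only does a handful of multiplications and checks.
print(check_factorization(91, [7, 13]))  # True
print(check_factorization(91, [7, 14]))  # False: 14 is not prime
```

The same shape shows up in jury trials, code review, and peer review: producing a convincing argument is expensive, but auditing one is comparatively cheap.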
Factorization is not a solution to the problem of general intelligence.
Huh, really? I think my impression from talking to Paul over the years was that it sort of was. [Like, if your picture of the human brain is that it’s a bunch of neurons factorizing the problem of being a human, this sort of has to work.]
This seems true, but not central. It's more like generalized prompt engineering: bureaucracies amplify certain aspects of behavior to generate training data for models that are better at (or more robustly bound by) those aspects. So this is useful even if the building blocks are already AGIs, except insofar as deceptive alignment could make it ineffective. The central use is to amplify the alignment properties of behavior with appropriate bureaucracies, then retrain the models on their output.
If applied to capability at solving problems, this is a step toward AGI (and marks the approach as competitive). My impression is that Paul considers this application more feasible than most other people do, and by extension expects bureaucracies to do sensible things at a lower level of agent capability. But this is more plausible than a single infinite bureaucracy working immediately with a weak agent, because all it needs to do is improve things at each IDA cycle, making the agents in the bureaucracy a little more capable and aligned, even if they remain a long way from solving difficult problems.
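The "improve a little each cycle" point can be sketched with a deliberately toy example. Everything here is a hypothetical stand-in, not anyone's actual proposal: the "weak model" can only add two numbers, amplification organizes copies of it into a tree (the bureaucracy) that handles whole lists, and distillation is caricatured as memorizing the amplified question/answer pairs.

```python
def weak_model(a, b):
    # The base agent: competent only at one tiny subproblem.
    return a + b

def amplify(model, numbers):
    """Sum a list by recursively delegating halves to `model`."""
    if len(numbers) == 1:
        return numbers[0]
    mid = len(numbers) // 2
    left = amplify(model, numbers[:mid])
    right = amplify(model, numbers[mid:])
    return model(left, right)

def distill(training_data):
    """Stand-in for retraining: memorize amplified behaviour."""
    table = {tuple(q): a for q, a in training_data}
    return lambda q: table[tuple(q)]

# One IDA cycle: the bureaucracy of weak models produces answers
# the weak model could not produce alone, and the distilled model
# reproduces them cheaply.
questions = [[1, 2, 3], [4, 5, 6, 7]]
data = [(q, amplify(weak_model, q)) for q in questions]
fast_model = distill(data)
print(fast_model([4, 5, 6, 7]))  # 22
```

The point of the sketch is only structural: no single step requires the bureaucracy to solve hard problems outright; each cycle just needs to yield an agent slightly more capable than the one before.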