It seems like Amplification could, given enough time, systematically search through all possible solutions (i.e., generate all bit sequences, turn them into strings, and evaluate whether each one is a solution). But the problem with that is that it will likely yield a misaligned solution (assuming the evaluation of solutions is imperfect).
Well, I was thinking that before this alignment problem could even happen, a brute-force search would be exponentially expensive, so Amplification wouldn’t work at all in practice on a question that requires “creative insight”.
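To make the cost concrete, here is a minimal sketch of the kind of exhaustive search described above. The names `brute_force_search`, `candidates_up_to`, and the `is_solution` evaluator are hypothetical, introduced only for illustration; nothing here is part of Paul’s actual proposal.

```python
from itertools import product


def brute_force_search(is_solution, max_bits):
    """Try every bit string of length 1..max_bits, shortest first.

    `is_solution` is a hypothetical evaluator that stands in for
    "evaluate whether the candidate is a solution".
    Returns (first accepted candidate or None, number of candidates tried).
    """
    tried = 0
    for n in range(1, max_bits + 1):
        for bits in product("01", repeat=n):
            tried += 1
            candidate = "".join(bits)
            if is_solution(candidate):
                return candidate, tried
    return None, tried


def candidates_up_to(max_bits):
    # 2^1 + 2^2 + ... + 2^max_bits = 2^(max_bits + 1) - 2
    return 2 ** (max_bits + 1) - 2
```

Each extra bit of solution length doubles the number of candidates, so even a one-kilobyte answer (8192 bits) puts the worst case at roughly 2^8193 evaluations, far beyond any realistic compute budget.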
My understanding, from what I’ve read about Paul’s approach, is that the solution to both the translation problem and creativity would be to extract any search heuristics and conceptual-framework algorithms that humans do have access to, and then still limit the search, sacrificing solution quality but maintaining corrigibility.
My concern is that this won’t be competitive with other AGI approaches that don’t try to maintain alignment/corrigibility, for example using reinforcement learning to “raise” an AGI through a series of increasingly complex virtual environments, and letting the AGI incrementally build its own search heuristics and conceptual framework algorithms.
BTW, thanks for trying to understand Paul’s ideas and engaging in these discussions. It would be nice to get a critical mass of people to understand these ideas well enough to sustain discussions and make progress without Paul having to be present all the time.