[Question] Should AutoGPT update us towards researching IDA?
Given the rate of progress in AutoGPT-like approaches, should we reconsider Paul Christiano’s Iterated Distillation and Amplification (IDA) agenda as potentially central to the alignment of transformative ML systems?
For context on IDA and AutoGPT:
https://www.lesswrong.com/tag/iterated-amplification
https://github.com/Torantulino/Auto-GPT
https://www.lesswrong.com/posts/dcoxvEhAfYcov2LA6/agentized-llms-will-change-the-alignment-landscape