Currently feeling happiness at this comment, but also sadness that Paul hasn’t already picked something he finds fitting, so it’ll fall to some other optimisation process, like ‘whatever term most people feel like they can easily understand’, which is not optimising for the right thing.
(Has Paul picked something that I just didn’t know and can start using?)
I remember hearing people call it iterative distillation and amplification (IDA), but I think this name might be too general.
“Iterated distillation and amplification” is probably my preferred name at the moment. What do you think makes it too general? That is, what do you think is covered by that name but shouldn’t be?
(I think the name came from some combination of Ajeya and Daniel Dewey.)
I had also heard that term. When I heard it, it came with the tag ‘sufficiently general as to apply to what AlphaGo Zero did’ (I think AGZ, maybe a different AlphaGo) and I thought that meant it was too non-specific to apply to a potential path forward on alignment.
If the idea does have significant overlap with current systems (which I believe it does), it might be better to have a name that applies specifically to whichever part of the proposal is new, i.e. different from what is already happening in capabilities research.
Yeah, I think Ben captures my objection—IDA captures what is different between your approach and MIRI’s agenda, but not what is different between some existing AI systems and your approach.
This might not be a bad thing—perhaps you want to choose a name that is evocative of existing approaches to stress that your approach is the natural next step for AI development, for example.