I see two problems.

1. Your proposal seems to have been generated with LLM assistance. @Raemon, is the approval correct?
2. It also might be too abstract and fail to solve key problems in AI alignment. I think one should come up with more concrete proposals, like mine, that have a chance of actually rewarding the AI for guiding a weaker AI to the solution instead of coercing the weaker AI into accepting the solution uncritically.
Think of this as an alternative to CEV.
I’m just looking to conceptually unify the issues at hand within a coherent framework. I read your linked article and found it interesting! I also think it fits neatly within the paradigm I’m proposing.
As to LLM assistance, these are all my own ideas.