Tell me if this is an accurate description of your reasoning:
I thought it was not feasible to mix corrigibility with value alignment—we should aim for CAST instead.
I saw how Claude’s Constitution tries to mix corrigibility with values.
I don’t necessarily think the constitution is doing a good job at that, but it made me realize that I was too hasty to rule out the feasibility of mixing corrigibility with values.
Tell me if this is an accurate description of your reasoning:
Yes.