“Hyperstition” does have the benefit of grouping “self-fulfilling {misalignment, alignment, …}” together, though. It’s plausible I’m totally wrong here, but I think “self-fulfilling misalignment” being the commonly used term leads people to think the “answer” is data filtering of alignment discourse, whereas I want people to think even more about things like upsampling positive data. “Self-fulfilling alignment” does get at that, but I hear the term much less, and I think people will end up just indexing on the name they hear most.
To be clear, I agree hyperstition isn’t the best name, for the reasons you describe. I just think “self-fulfilling {misalignment, …}” also has a problem.
Yeah, that’s fair. I guess my views on this are stronger because I think data filtering could be negative rather than merely sub-optimal.