I think corrigibility winning is, by default, an S-risk.
Power appears, historically, to make people sadistic (consider Robespierre if you think this couldn't happen to Dario, though I'd much rather risk him than the other guys), and regimes are often brutal and cruel far beyond what any non-sadistic goal would make rational. Future technology will allow for forms of suffering much worse and more prolonged than current torture, and without seeming as "messy" or unpleasant to external observers. Currently, death is an easy way to ensure someone is no longer a threat, but I worry that at the power levels in question, it may come to seem boring or unsatisfying.
Of course, it remains to be seen whether this is a pattern LLMs will imitate. I think "moral AI" failures most likely just result in extinction, but I wanted to point out that this risk is still present.
Status in general is like this. I used to think it was a "silly game for normies," but I've since realized it's a necessary part of how we evolved to coordinate with each other. More on this here.