The confusion (in the popular press, not so much among professionals or here) between censorship and alignment is a big problem. Censorship and ham-fisted late-stage RL are counterproductive to alignment, both for the reason you give (they increase demand for grey-market tools) and because they make serious misalignment much harder to notice.