My name is Mikhail Samin (diminutive Misha, @Mihonarium on Twitter, @misha in Telegram).
I’m an effective altruist, I worry about the future of humanity and want the universe not to lose most of its value.
I took the Giving What We Can pledge to donate at least 10% of my income for the rest of my life or until the day I retire (why?).
It seems that I have good intuitions about the AI alignment problem; some full-time alignment researchers told me that they were able to improve their understanding of the problem after talking to me.
I’m currently doing EA & AI Alignment outreach (e.g., I’m organising a translation of the 80,000 Hours’ Key Ideas series and partnering with Vert Dider for a translation and dubbing of Rob Miles’ videos) and considering switching to direct alignment research.
In the past, I launched the most-funded crowdfunding campaign in the history of Russia (it was to print HPMOR! We printed 21,000 copies, i.e. 63,000 individual books) and founded audd.io, which allowed me to donate >$50k to MIRI.
I don’t expect everyone to disregard the danger; I do expect most people building capable AI systems to continue to hide the hard problems. Hiding the hard problems is much easier than solving them, and I suspect it produces plausible-sounding “solutions” just as well.
Roughly human-level humans don’t contribute significantly to AI alignment research and can’t be pivotally used. So I don’t think you believe that a roughly human-level AI system can contribute significantly to AI alignment research either. Maybe you (as many seem to) think that if someone runs not-that-superhuman language models with clever prompt engineering, fine-tuning, and surrounding systems, then the whole system can solve alignment or be pivotally used. But the point of the post is that the whole system is superhuman, not roughly human-level, if it’s capable enough to solve alignment or be pivotally used. You need to direct the whole system somewhere, and unless you’ve made the whole system optimize for something you actually want, it probably kills you before it solves alignment.