The Case Against AI Control Research seems related. TL;DR: the mainline scenario is that the hallucination machine is overconfident about its own alignment solution, that solution gets implemented without much checking, then doom.