Seth Herd comments on Veedrac’s Shortform

Seth Herd 2 May 2025 22:54 UTC
2 points
0
Mostly agreed. When suggesting even differential acceleration, I should remember to include a big WE SHOULD SHUT IT ALL DOWN, just to make sure it’s not taken out of context. And as I said there, I’m far from certain that even that differential acceleration would be useful.
I agree that Kat Woods is overestimating how optimistic we should be based on LLMs following directions well. I think re-litigating who said what when, and what they’d predict, is a big mistake, since it is beside the point and tends to strengthen tribal rivalries, which are arguably the largest source of human mistakes. There is an interesting, subtle issue there, which I’ve written about in “The (partial) fallacy of dumb superintelligence” and “Goals selected from learned knowledge: an alternative to RL alignment”. There are potential ways to leverage LLMs’ relatively rich (but imperfect) understanding into AGI that follows someone’s instructions. Creating a “goal slot” based on linguistic instructions is possible. But it’s all pretty complex and uncertain.