I don’t think that. See the bottom part of the comment you’re replying to. (The part after “Here’s what I would say instead:”)
Sorry, my comment was sloppy.
Right, my point is that I don’t see any difference between “AIs that produce slop” and “weak AIs” (a.k.a. “dumb AIs”).
(I agree that the way I used “sloppy” in my comment mostly meant “weak”. But some other thoughts:)
So I think there are some dimensions of intelligence which are more important for solving alignment than for creating ASI. If you read planecrash, WIS and rationality training seem to me more important in that way than INT.

I don’t really have much hope for DL-like systems solving alignment, but a similar case might be if an early transformative AI recognizes this and says: “No, I cannot solve the alignment problem. The way my intelligence is shaped is not well suited to avoiding value drift. We should stop scaling and take more time, where I work with very smart people like Eliezer etc. for some years to solve alignment.” Depending on the intelligence profile of the AI, this might be more or less likely to happen (currently it seems quite unlikely).

But overall those “better” intelligence dimensions still seem to me too central to AI capabilities, so I wouldn’t publish stuff.
(Btw, the way I read John’s post was more like “fake alignment proposals are a main failure mode” rather than also “… and therefore we should work on making AIs more rational/sane or whatever”. Given that, I might defend John’s framing, but I’m not sure.)