Kevin Zhang

Karma: 0

Kevin Zhang 30 Mar 2026 7:07 UTC
1 point
0
in reply to: Nikola Jurkovic’s comment on: nikola’s Shortform
Great idea. I help run the AI alignment club at UCSD, and I’ll try to organize a group screening with a discussion afterward.

Kevin Zhang 30 Mar 2026 6:50 UTC
1 point
0
in reply to: Thomas Larsen’s comment on: Thomas Larsen’s Shortform
It seems to me that the part of training most responsible for capabilities is pre-training rather than RL (something like GRPO requires the base model to get at least one rollout correct before there is any signal to learn from). It also feels like most RL training has to be objective-agnostic; a coding task wouldn’t have a clear connection to alignment. If our goal is to train an aligned AI where capabilities and alignment go hand in hand, it seems like we should somehow bake alignment training into pre-training rather than rely on post-training techniques. Unless it’s primarily RL that induces long-horizon, goal-directed capability (I suspect it’s some of both).
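
To make the GRPO point concrete, here is a rough sketch of the group-relative advantage as I understand it: each rollout's reward is normalized against the mean and standard deviation of its own group. The function name, the `eps` value, and the use of the population standard deviation are my own assumptions for illustration, not taken from any particular implementation.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its own group (illustrative only)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations may differ
    return [(r - mu) / (sigma + eps) for r in rewards]

# Every rollout fails: all rewards are equal, so every advantage is zero
# and the group contributes no learning signal.
all_wrong = group_relative_advantages([0.0, 0.0, 0.0, 0.0])

# One rollout succeeds: it gets a positive advantage, the failures a negative one.
one_right = group_relative_advantages([0.0, 0.0, 0.0, 1.0])

print(all_wrong)
print(one_right)
```

Under these assumptions, a group with no correct rollout produces all-zero advantages, which is why the base model already needs some chance of solving the task.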