cubefox comments on Where do you get your capabilities from?

cubefox 3 May 2025 18:19 UTC
2 points
Note that the above comment by “anonymous” anticipated RLVR (reinforcement learning from verifiable rewards), as used to train reasoning LLMs like o1.