> and assuming 4.5 Opus wasn't a big scale-up relative to prior models
It seems plausible that Opus 4.5 got much more RLVR than Opus 4 or Opus 4.1, catching up to Sonnet in its RLVR-to-pretraining ratio (Gemini 3 Pro is probably the only other model in its weight class with a similar amount of RLVR). If it's a large model (many trillions of total params), it wouldn't run decode/generation well on 8-chip Nvidia servers (~1 TB of HBM per scale-up world): it could still be pretrained efficiently on such servers (if an overly large batch size isn't a bottleneck), but it couldn't be RLVRed or served on them with any efficiency.
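To make the decode-fit point concrete, here's a back-of-the-envelope sketch; the parameter counts, FP8 weights (1 byte/param), and KV-cache budget are illustrative assumptions, not known Opus figures:

```python
# Rough check: do a model's weights plus a KV-cache budget fit in the
# ~1 TB of pooled HBM on an 8-chip Nvidia server? All numbers are
# illustrative assumptions, not known Anthropic/Opus figures.

def decode_hbm_needed_tb(total_params_trillions, bytes_per_param=1.0, kv_cache_tb=0.2):
    """HBM needed for decode, in TB: weights (1e12 params * 1 B = 1 TB) plus KV cache."""
    return total_params_trillions * bytes_per_param + kv_cache_tb

SERVER_HBM_TB = 1.0  # ~1 TB per 8-chip scale-up world (e.g. 8x H200)

for params_t in (0.5, 2.0, 5.0, 10.0):
    need = decode_hbm_needed_tb(params_t)
    verdict = "fits" if need <= SERVER_HBM_TB else "does not fit"
    print(f"{params_t:>4.1f}T params -> ~{need:.1f} TB HBM needed: {verdict}")
```

Pretraining can tolerate sharding weights across many 8-chip servers (pipeline/expert parallelism over the slower inter-server network, amortized by large batches), but decode needs the whole model reachable at HBM-like bandwidth, hence the need for a larger scale-up world.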
As the API price drop suggests, they likely now have enough inference hardware with large scale-up worlds (probably Trainium 2, possibly Trillium, though in principle GB200/GB300 NVL72 would also do), which wasn't the case for Opus 4 and Opus 4.1. The same hardware would also have enabled efficient large-scale RLVR training, which they possibly couldn't yet do at the time of Opus 4 and Opus 4.1 (there wouldn't have been such an issue with Sonnet, which fits on 8-chip Nvidia servers, so they mostly needed to apply its post-training process to the larger model).
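For scale, here are rough pooled-HBM sizes of the scale-up worlds mentioned above, using rounded public spec figures (chip counts per scale-up domain and per-chip HBM are approximate):

```python
# Approximate pooled HBM per scale-up world: chips_in_domain * hbm_gb_per_chip.
# Figures are rounded public spec numbers, not exact deployment configs.
scale_up_worlds = {
    "8x H200 (HGX server)":          (8, 141),   # ~1.1 TB
    "Trn2 UltraServer (Trainium 2)": (64, 96),   # ~6.1 TB
    "Trillium (TPU v6e) pod":        (256, 32),  # ~8.2 TB
    "GB200 NVL72":                   (72, 186),  # ~13.4 TB
    "GB300 NVL72":                   (72, 288),  # ~20.7 TB
}

for name, (chips, hbm_gb) in scale_up_worlds.items():
    print(f"{name:<32} {chips:>3} chips, ~{chips * hbm_gb / 1000:.1f} TB HBM")
```

Any of the larger domains comfortably holds a multi-trillion-parameter model for decode, which an 8-chip server doesn't; that's the sense in which this hardware unblocks both serving and large-scale RLVR for an Opus-sized model.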