FOOM! Is this really a jump, or just the exponential trend?
It is a trend which I suspect to be linear or exponential in the logarithm of model size and in time (aka the logarithm of RL-spent compute? the amount of experience which the model has had?). For example, a counterfactual Anthropic that had trained not 10T/2T/400B Mythos/Opus/Sonnet but 2T/400B/100B Opus/Sonnet/Haiku, and had resorted to training an Opus only recently, would see a one-time AECI gain unless it kept training the Opuses. The model card of Opus 4.7 confirms that Mythos didn’t accelerate the Opuses’ progress. However, if Opus 4.7 is distilled from Mythos, then Mythos did just keep the trend afloat. Maybe Opus 5 would fail to exhibit gains due to a lack of high-quality training data?