The AI Futures Project thinks that 4.1 is a smaller model than 4o. They suspect this is why o3-preview (elicited from 4o) was better than the o3 that got released (elicited from 4.1). Overall, I think this makes much more sense than the two sharing a base model and o3-preview being nerfed for no reason.
Perhaps 4.1 was the mini version of the training run which became 4.5, or perhaps it was just an architectural experiment (OpenAI is probably running some experiments at 4.1-size).
My mainline guess continues to be that GPT-5 is a new, approximately o3-sized model with some modifications (depth/width, sparsity, maybe some minor extra secret juice) that optimize the architecture for long reasoning, in contrast to the early o-series models, which were built on top of existing LLMs.