No, let’s keep in mind the example of Aegis fire control for missile defense.
This is a highly variable situation: the “enemy action” can come in many forms and from multiple directions at once, the weather can change rapidly, the fleet to be defended might have a variety of compositions and spatial distributions, and so on. So one deals with a lot of variable and unpredictable factors. Yet they were able to formally establish some properties of that software, presumably to the satisfaction of their DoD customers.
That does not mean they have a foolproof system, but the reliability is likely much better because of that effort at formal verification of the software.
With Windows, who knows. Perhaps it is even more complex than that. But formal methods are often able to account for a wide range of external situations and data. For a number of reasons, they nevertheless don’t provide full guarantees (there is a trap of thinking “formally verified ⇒ absolutely safe”; it’s important not to get caught in that trap: “formally verified” just means “much more reliable in practice”).
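To make “formally establishing some properties of software” concrete, here is a toy illustration in Lean 4 (it has nothing to do with the actual Aegis verification, and `clamp` is just a made-up example function): a definition together with a machine-checked proof that one of its properties always holds, for all inputs, rather than just for the inputs exercised by tests.

```lean
-- A trivial function: limit a value to a given bound.
def clamp (bound x : Nat) : Nat := min x bound

-- A machine-checked proof that the output never exceeds the bound.
-- If this file compiles, the property is established for ALL inputs.
theorem clamp_le (bound x : Nat) : clamp bound x ≤ bound :=
  Nat.min_le_right x bound
```

Real verified systems prove far richer properties (timing, absence of runtime errors, protocol invariants), but the principle is the same: the claim is checked by the proof assistant, not sampled by testing.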
I was trying to address the general point of whether provably correct software is possible (obviously yes, since it is actually practiced occasionally for some mission-critical systems). I don’t know if it makes sense to have that in the context of Windows kernels. From what people have been saying about Windows recently, Microsoft claims that the European regulator forced them not to refuse CrowdStrike-like updates (so much for thinking about “what could or should be done in a sane world”).
On Llama-3.1-405B pretraining costs:
Even a bit less than that: in the high eight figures, under 100 million dollars.
The model card https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md says the total for all three models (8B, 70B, and 405B) is 39.3 million GPU-hours, and the 405B model alone took 30.84 million H100 GPU-hours; that’s clearly less than 100 million dollars.
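As a sanity check on the “under 100 million dollars” claim, here is the arithmetic; the GPU-hours are from the model card, while the $2 per H100-hour rental rate is my assumption for a bulk rate (actual rates vary).

```python
# Back-of-envelope cost estimate for Llama-3.1 pretraining.
# GPU-hours are from the model card; the $/hour rate is an assumption.
gpu_hours_405b = 30.84e6   # H100 GPU-hours for the 405B model alone
gpu_hours_all = 39.3e6     # total for the 8B, 70B, and 405B models
rate_per_hour = 2.0        # assumed bulk H100 rental rate, USD/hour

cost_405b = gpu_hours_405b * rate_per_hour
cost_all = gpu_hours_all * rate_per_hour
print(f"405B alone: ~${cost_405b / 1e6:.0f}M")  # ~$62M
print(f"all three:  ~${cost_all / 1e6:.0f}M")   # ~$79M
```

Even at double that hourly rate, the 405B model alone stays in the low nine figures, so the high-eight-figure estimate is not very sensitive to the assumed price.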
Interestingly, the implied GPU utilization seems low, slightly over 25%, if one believes this “on a napkin” computation I asked GPT-4o to assist me with: https://chatgpt.com/share/470b8e20-d99f-48b2-be5e-61d7524892df
It estimates that, at full utilization, the 3.8×10^25 FLOPs of training compute could be extracted from H100s in approximately 10.6 million GPU-hours (which is slightly more than a quarter of the total 39.3 million hours for all three models).
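The estimate is easy to reproduce; the assumed peak throughput of 10^15 FLOP/s (about the H100’s dense BF16 peak) is the one free parameter, and the comparison is against the total hours for all three models, as in the linked computation.

```python
# Implied utilization: ideal H100-hours to deliver the reported
# training compute, divided by the GPU-hours actually spent.
total_flop = 3.8e25       # reported training compute for Llama-3.1-405B
h100_peak = 1e15          # assumed H100 dense BF16 peak, FLOP/s
gpu_hours_all = 39.3e6    # total H100-hours for all three models

ideal_hours = total_flop / h100_peak / 3600
utilization = ideal_hours / gpu_hours_all
print(f"ideal hours: {ideal_hours / 1e6:.1f}M")  # ~10.6M
print(f"utilization: {utilization:.0%}")         # ~27%
```

Note this lumps the 8B and 70B runs into the denominator; dividing by the 30.84 million hours of the 405B run alone would give roughly 34%, still well below typical reported MFU figures.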