Regarding your Twitter comment about Musk’s proposals:
Here’s why I think there’s now a one-in-six chance of an imminent global #NuclearWar, and why I appreciate @elonmusk and others urging de-escalation, which is IMHO in the national security interest of all nations
The real issue with backing down from nuclear threats is what happens when you back down.
Let’s say we force Ukraine to allow Putin to keep the annexed territory because of nuclear weapons. This gives him, every Russian and every dictator around the world a clear message: nuclear weapons are the winning strategy.
It would make Putin and all warmongers like Prigozhin or Kadyrov look like geniuses. They stood against the whole world and won! Everyone inside Russia who was opposing the use of nuclear weapons would have to admit that it worked. So they need to use this trick more! It costs nothing. You just need to be a true believer in the greatness and ruthlessness of mother Russia.
Hitler also looked like a genius of strategy after the annexation of Austria and the Sudetenland. From the German perspective in the summer of 1939, he obviously knew what he was doing and should be trusted.
So, how far are you willing to back down?
Is Poland valuable enough for you? Poland was not valuable enough for the UK and France in 1939, despite them pledging “all support in their power”. They got war anyway, gaining only a bit of time.
Russia’s imperial goals are not limited to Ukraine. Remember that even its initial demands went beyond Ukraine.
From Article 4 of the draft of the “Agreement on measures to ensure the security of The Russian Federation and member States of the North Atlantic Treaty Organization”:
The Russian Federation and all the Parties that were member States of the North Atlantic Treaty Organization as of 27 May 1997, respectively, shall not deploy military forces and weaponry on the territory of any of the other States in Europe in addition to the forces stationed on that territory as of 27 May 1997.
To Poles and Balts this was unacceptable under any circumstances. We would never sign it.
From my perspective, if the nuclear threat wins Putin anything significant, then Poland will need to try to obtain strategic nuclear weapons as soon as possible and at any cost. We lost every fifth person after 1939 because we trusted foreign powers when we should not have. Russian soldiers did not leave Poland until 1993 and we do not want them back.
The way out of nuclear escalation is convincing Russia that they have nothing to gain and everything to lose after using nuclear weapons. They need to back down from their mistake.
I agree with you on tasks where there is not much headroom. But on tasks like International Olympiad-level mathematics and programming, a 4x reduction in model size at constant performance will be a small effect. I expect many improvements of 1000x and more relative to what scaling laws currently predict.
For example, on the MATH dataset “(...) models would need around 10^35 parameters to achieve 40% accuracy”, where 40% accuracy is what a PhD student achieves and an International Olympiad participant gets close to 90%. https://arxiv.org/abs/2103.03874
With 100-trillion-parameter models (10^14) we would still be short by a factor of 10^21. So we will need to get roughly 21 orders of magnitude of improvement in model size for the same performance from somewhere else.
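The arithmetic behind that gap can be checked with a minimal sketch. It uses only the two figures quoted above (the 10^35 extrapolation and a 10^14-parameter model) plus the 4x factor under discussion; the function and constant names are my own, chosen for illustration.

```python
import math

# Figures quoted in this comment; the 10^35 estimate is the paper's extrapolation.
PARAMS_NEEDED = 10**35      # parameters extrapolated for 40% accuracy on MATH
PARAMS_AVAILABLE = 10**14   # a 100-trillion-parameter model
SIZE_REDUCTION = 4          # the 4x model-size reduction under discussion


def orders_of_magnitude(factor: float) -> float:
    """Return how many powers of ten a multiplicative factor spans."""
    return math.log10(factor)


# Multiplicative shortfall between what is needed and what is available.
shortfall_factor = PARAMS_NEEDED // PARAMS_AVAILABLE
gap_orders = round(orders_of_magnitude(shortfall_factor))

# How much of that gap a single 4x reduction would close.
reduction_orders = orders_of_magnitude(SIZE_REDUCTION)

print(f"shortfall factor: 10^{gap_orders}")
print(f"a 4x reduction covers about {reduction_orders:.1f} orders of magnitude")
```

A 4x reduction covers well under one order of magnitude, which is why I call it small next to a gap of this size.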
It is worth noticing the gap between 40% and 90% among humans on MATH, and a similar gap on MMLU (Massive Multitask Language Understanding): 35% for the average human vs. 90% for experts. Experts don’t have brains that are orders of magnitude bigger, nor a different architecture or learning algorithm in their brains.
When replying, I also noticed that I made assumptions about what you mean by an x-factor quality improvement. I’m not sure I understood correctly. Could you clarify what you meant precisely?