Zvi’s comments:
Rasool
This person has been keeping track of their dealings with the BBC regarding their choice not to get a TV license since 2006 (I believe at some point their SEO was so good that searches for ‘BBC TV license’ brought up this person’s website, rather than any official source)
They document things like the technology claimed to be used, and the results of Freedom of Information requests
Other fun pages include bogus/contradictory numbers supplied by the BBC, people spotting detector vans in the wild, and why the BBC’s threatening letters ask people not to ‘write below this line’
Anthropic’s Frontier Red Team recently wrote about how Opus 4.6, fairly autonomously, “found and validated more than 500 high-severity vulnerabilities” in open-source projects
The UK Health Secretary in 2021, Matt Hancock, ordered 100m vaccine doses, rather than 30m, because of the film Contagion
A few days ago, Amodei claimed that 90% of code at Anthropic (and some companies they work with) is being written by AI
https://www.themidasproject.com/watchtower tracks changes to AI company policies
Great piece! One minor detail is that Llama 4 curiously only used 32k H100s, rather than the 100k+ mooted
The Midas Project is a good place to keep track of AI company policy changes. Here is their note on the Anthropic change:
https://www.themidasproject.com/watchtower/anthropic-033125
Also picked up here
Also consider modafinil
Notwithstanding
Stigler’s Law of Eponymy: “No scientific discovery is named after its original discoverer.”
:P
@Alex Lawsen has written about this too:
https://lawsen.substack.com/p/another-conversation-about-other
https://lawsen.substack.com/p/notebooklm-podcasts-but-good
https://lawsen.substack.com/p/getting-the-most-from-deep-research
https://lawsen.substack.com/p/my-current-writing-workflow
https://lawsen.substack.com/p/my-current-claude-projects
Ege Erdil 02:51:22
…
I think another important thing is just that AIs can be aligned. You get to control the preferences of your AI systems in a way that you don’t really get to control the preferences of your workers. Your workers, you can just select; you don’t really have any other option. But for your AIs, you can fine-tune them. You can build AI systems which have the kind of preferences that you want. And you can imagine that’s dramatically changing basic problems that determine the structure of human firms.
For example, the principal agent problem might go away. This is a problem where you as a worker have incentives that are either different from those of your manager, or those of the entire firm, or those of the shareholders of the firm.
It looks like this is a linkpost to:
Might Leopold Aschenbrenner also be involved? He runs an investment fund with money from Nat Friedman, Daniel Gross, and Patrick Collison, so the investment in Mechanize might have come from that?
Does this match your understanding?
| AI Company | Public/Preview Name | Hypothesized Base Model | Hypothesized Enhancement | Notes |
|---|---|---|---|---|
| OpenAI | GPT-4o | GPT-4o | None (Baseline) | The starting point, multimodal model. |
| OpenAI | o1 | GPT-4o | Reasoning | First reasoning model iteration, built on the GPT-4o base. Analogous to Anthropic’s Sonnet 3.7 w/ Reasoning. |
| OpenAI | GPT-4.1 | GPT-4.1 | None | An incremental upgrade to the base model beyond GPT-4o. |
| OpenAI | o3 | GPT-4.1 | Reasoning | Price/cutoff suggest it uses the newer GPT-4.1 base, not GPT-4o + reasoning. |
| OpenAI | GPT-4.5 | GPT-4.5 | None | A major base model upgrade. |
| OpenAI | GPT-5 | GPT-4.5 | Reasoning | “GPT-5” might be named this way, but technologically be GPT-4.5 + Reasoning. |
| Anthropic | Sonnet 3.5 | Sonnet 3.5 | None | Existing model. |
| Anthropic | Sonnet 3.7 w/ Reasoning | Sonnet 3.5 | Reasoning | Built on the older Sonnet 3.5 base, similar to how o1 was built on GPT-4o. |
| Anthropic | N/A (Internal) | Newer Sonnet | None | Internal base model analogous to OpenAI’s GPT-4.1. |
| Anthropic | N/A (Internal) | Newer Sonnet | Reasoning | Internal reasoning model analogous to OpenAI’s “o3”. |
| Anthropic | N/A (Internal) | Larger Opus | None | Internal base model analogous to OpenAI’s GPT-4.5. |
| Anthropic | N/A (Internal) | Larger Opus | Reasoning | Internal reasoning model analogous to hypothetical GPT-4.5 + Reasoning. |
| Google | N/A (Internal) | Gemini 2.0 Pro | None | Plausible base model for Gemini 2.5 Pro according to the author. |
| Google | Gemini 2.5 Pro | Gemini 2.0 Pro | Reasoning | Author speculates it’s likely Gemini 2.0 Pro + Reasoning, rather than being based on a GPT-4.5 scale model. |
| Google | N/A (Internal) | Gemini 2.0 Ultra | None | Hypothesized very large internal base model. Might exist primarily for knowledge distillation (Gemma 3 insight). |
What is this in reference to?