Thanks for publishing this!

My main disagreement is about a missing consideration: shrinking time to get alignment right. Despite us finding that frontier models are less misaligned by default than most here would have predicted[1], the bigger problem to me is that we have made barely any progress toward crossing the remaining alignment gap. As a concrete example: LLMs will, in conversation, display a great understanding of and agreement with human values, but act quite differently in agentic settings (e.g. the blackmail examples in the Claude 4 system card). More importantly, on the research side: to my knowledge, there has been neither a recognized breakthrough nor generally recognized smooth progress towards actually getting values into LLMs.
Similarly, a top consideration for me that AFAICT is not in your list: the geopolitical move towards right-wing populism (particularly in the USA) seems to reduce the chances of sensible governance quite severely.
> Less risk. AI is progressing fast, but there is still a huge amount of ground to cover. Median AGI timeline vibes seem to be moving backwards. This increases the chance of a substantial time for regulation while AI grows. It decreases the chance that AI will just be 50% of the economy before governance gets its shoes on.
This seems basically true to me if we are comparing against early 2025 vibes, but not against e.g. 2023 vibes (“I think vibes-wise I am a bit less worried about AI than I was a couple of years ago”). It’s hard to provide evidence for this, but I’d gesture at the relatively smooth progress between the release of ChatGPT and now, which I’d summarize as “AI is not hitting a wall; at the very most it has hit a little speedbump”.
> Less risk. AI revenue seems more spread.
This is an interesting angle, and it feels important. The baseline prior should imo be: governing more entities with near-100% effectiveness is harder than governing fewer. While I agree that, conditional on there being lots of companies, it is likelier that some governance structure exists, the primary question seems to be whether we get a close-to-zero miss rate for “deploying dangerous AGI”. And that seems much harder to achieve with 20 to 30 companies in a race dynamic than with 3. Having said that, I agree with your other point that AI infrastructure is becoming really expensive and that the exact implications are poorly understood.
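To make the miss-rate intuition concrete, here is a minimal sketch: if each lab independently has some small chance of letting a dangerous system through, the chance of at least one miss grows quickly with the number of labs. The 2% per-lab miss probability below is purely an illustrative assumption, not a figure from the discussion.

```python
# Illustrative only: chance that at least one lab deploys a dangerous AGI,
# assuming each lab independently "misses" with the same small probability p.
def p_at_least_one_miss(n_labs: int, p_miss: float) -> float:
    return 1 - (1 - p_miss) ** n_labs

for n in (3, 20, 30):
    print(f"{n:2d} labs at 2% per-lab miss rate -> "
          f"{p_at_least_one_miss(n, 0.02):.0%} chance of at least one miss")
# 3 labs -> ~6%, 20 labs -> ~33%, 30 labs -> ~45%
```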
I think about two thirds of this perceived effect is due to LLMs not having much in the way of goals at all, rather than them having human-compatible goals.
As for shrinking time to get alignment right, my worst-case scenario is that someone achieves a breakthrough in AGI capabilities research and the breakthrough is algorithmic, rather than achieved by concentrating resources, as the AI-2027 forecast assumes.
However, even this case can provide a bit of hope. Recall that GPT-3 was trained with only about 3e23 FLOP and ~300B tokens. If it were OpenBrain who trained a thousand GPT-3-scale models with the breakthrough, each on different parts of the training data, then they might even be able to run a Cannell-like experiment and determine the models’ true goals, aligned or misaligned...
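For a rough sense of scale, here is a back-of-envelope sketch using the common training-compute approximation C ≈ 6·N·D. The GPT-4-scale comparison figure (~2e25 FLOP) is an outside rough public estimate, assumed here only for context.

```python
# Back-of-envelope check using the common approximation C ≈ 6 * N * D
# (training FLOP ≈ 6 × parameters × tokens).
N = 175e9   # GPT-3 parameter count
D = 300e9   # training tokens cited above
c_gpt3 = 6 * N * D
print(f"one GPT-3-scale run:  {c_gpt3:.1e} FLOP")         # ~3.2e23, matching ~3e23
print(f"a thousand such runs: {1000 * c_gpt3:.1e} FLOP")  # ~3.2e26
# Rough public estimates put GPT-4-scale training around 2e25 FLOP (assumption),
# so a thousand GPT-3-scale runs would be on the order of ~16x that.
print(f"ratio to ~2e25 FLOP:  {1000 * c_gpt3 / 2e25:.0f}x")
```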