I think this is a valid and pretty big concern, especially down the line.
Let’s say that in a year Anthropic has a model capable of running a tech startup almost entirely autonomously (maybe it needs one good CEO to set direction, and that’s it). Everyone else in the public has significantly less capable models, perhaps because Anthropic is in the lead, or because their competitors also can’t release their SOTA models for safety reasons.
What’s stopping Anthropic from turning itself into a startup accelerator in that situation, hiring founders and running dozens of AI-powered startups across every sector? Startups that sign with Anthropic would have a massive efficiency advantage over everyone else, so Anthropic could demand an extremely high share of equity in return. If the model capability gap is large enough, these startups will succeed, letting Anthropic take over many different markets.
Reading this article, I get the feeling that many of the task misalignment issues highlighted here with AI, such as wishful thinking and downplaying or hiding mistakes, are also very common among humans within a larger organization. There are presumably similar root causes for both: appearing more competent at your task than you actually are (and successfully fooling the grader or interviewer) helps a human get hired or promoted, and likewise gets the behavior reinforced if you’re an AI undergoing training.
If AI’s level of task misalignment is similar to humans’, the AI version may actually be easier to deal with, because you can intervene on an AI to fix the problem much more cheaply than you can fix the equivalent problem in a human organization. In other words, it doesn’t stop you from getting superhuman performance by delegating to AI rather than to humans.