A variation of the Peter Principle guarantees they’ll be used just beyond their area of reliable competence, so they often look incompetent even as they improve.
I can think of some arguments for this claim, but I don’t think it’s sufficiently self-evident to be stated just like that. The Peter Principle operates because:
humans exist as singular entities that can only be in one place at a time
humans are motivated by promotions, and success → promotion is a useful trend to demonstrate
humans are uncomfortable firing or demoting a well-regarded employee
LLMs don’t share any of these properties. An LLM that does one job well can keep doing so forever no matter what else you want to use it for, there is no incentive to ‘reward’ LLMs with promotions when it would not be directly prudent to provide them, and an LLM that underperforms when trialed in a given task can be quietly withdrawn from that role at no emotional or organizational cost.
I said a variation of the Peter Principle. Maybe I should have said some relation of the Peter Principle, or not used that term at all. What I’m talking about isn’t about promotion but expansion into new types of tasks.
Once somebody makes money deploying agents in one domain, other people will want to try similar agents in new, probably somewhat more difficult domains. This is a very loose analog of promotion.
The bit about not wanting to demote them is totally different. I think they can be bad at a job and make mistakes that damage their and your reputation and still be well worth keeping in that job. There are also momentum effects: not wanting to re-hire all the people you just fired in favor of AI and admit you made a big mistake. Many decision-makers would be tempted to push through, trying to upgrade the AI and work around its problems rather than admit they screwed up.
See below response for the rest of that logic. There can be more upside than downside even with some disastrous mistakes or near misses that will go viral.
I’d be happy to not call it a relation of the Peter Principle at all. Let’s call it the Seth Principle; I’d find it funny to have a principle of incompetence named after me :)
Rereading the OP here, I think my interpretation of that sentence is different from yours. I read it as meaning “they’ll be trialed just beyond their area of reliable competence, and the appearance of incompetence that results from that will both linger and be interpreted as a general feeling that they’re incompetent, which in the public mood overpowers the quieter competence even if the models don’t continue to be used for those tasks and even if they’re being put to productive use for something else”.
(The amount of “let’s laugh at the language model for its terrible chess skills and conclude that AI is all a sham” already…)
I am really thinking that they’ll be deployed beyond their areas of reliable competence. If it can do even 50% of the work, it might be worth it, and as that fraction goes up, it needs nothing like 100% competence to be worthwhile. I guess a factor I didn’t mention is that the rate of alarming mistakes should be far higher in deployment than in testing, because the real world throws lots of curve balls that are hard to come up with in training and testing.
And I think the downsides of AI incompetence will fall mostly not on the businesses that deploy them, but on the AI itself. Which isn’t right, but it is helpful for getting people to blame and fear AI.
When do you predict Microsoft will quietly retract most of the things called Copilot?