Many people expect 2025 to be the Year of Agentic AI. I expect that too—at least in the sense of it being a big topic. But I expect people to eventually be disappointed. Not because AI is failing to make progress, but for more subtle reasons. These agents will not be aligned well—because, remember, alignment is an unsolved problem. People will not trust them enough.
I’m not sure how the dynamic will unfold. There is trust in the practical abilities. Right now, it is low, but that will only go up. There is trust in the agent itself: Will it do what I want, or will it have or develop goals of its own? Can users trust that their agents are not influenced in all kinds of ways by all kinds of parties—the labs, the platforms, the websites the agents interact with, attackers, or governments? And it is not clear which aspect of trust will dominate at which stage. Maybe the biggest problems will manifest only after 2025, but we should see some of this already this year.
This is an interesting prediction. If it turns out that agents are blocked primarily by alignment issues, that would be an update both that AI alignment is harder than I think and that the incentives to solve the alignment problem are stronger than I think.
Yes. And either way it would be a decent outcome. Unless people come to the wrong conclusions about what the problem is, e.g. “it’s the companies’ fault.”
There is trust in the practical abilities. Right now it is low, but that will only go up.
Part of the learning curve for using existing AI is calibrating trust and verifying answers, conditional on use case. A hallmark of inexperienced AI users is taking its replies at face value, without checking.
I do expect that over time, AI will become more trustworthy for daily users. But that is compatible with users’ trust in it decreasing as they familiarize themselves with the technology and learn its limitations.