The Economics of Replacing Call Center Workers With AIs

TLDR: Voice AIs aren’t that much cheaper in the year 2025

My friend runs a voice agent startup in Canada for walk-in clinics. The AI takes calls and uses tools to book appointments in the EMR (electronic medical record) system. In theory, this lets the clinic hire fewer front desk staff and the startup makes infinite money. In reality, the margins are brutal and they barely charge above cost. This is surprising to me: surely a living, breathing, squishy human costs more per hour than a GPU in a datacenter somewhere?

An industry overview of voice AIs

Broadly speaking, there are three types of companies in the voice AI industry:

  1. Foundation model companies:

    1. These companies actually train the text to speech and realtime audio models

    2. OpenAI, ElevenLabs, Cartesia

  2. Pipeline companies

    1. Infrastructure companies that aggregate multiple foundation model providers and help you experiment with multiple providers, build agents, and connect with SIP and WebRTC transports (think OpenRouter but with extra steps).

    2. Developer focused: n8n, Bland, Vapi

    3. Enterprise focused: Ada, Sierra, Fin

  3. Vertical startups

    1. Startups that do “voice agents for {healthcare | logistics | real estate | etc }”

    2. Here are 142 of them

Of course, these categories are fuzzy and some companies might vertically integrate over many layers (e.g. Vapi has its own foundation model for TTS).

The line by line breakdown

Let’s dive into the heart of the stack, using Vapi as an example

Vapi works like a sandwich with a few flavors

Speech to Text (STT) ⇒ LLM ⇒ Text to Speech (TTS)

  • First, Deepgram converts the caller's speech to text (100 ms)

  • Then, GPT-4o does text to text (600 ms)

  • Finally, Vapi does text to speech (250 ms)

  • Add in some latency sauce from WebRTC transport (100 ms) or Twilio phone service (600 ms)

  • At a minimum this costs $0.15/​minute

    • $0.05 for Vapi hosting

    • $0.01 for Deepgram Speech to Text

    • $0.07 for GPT-4o

    • $0.022 for Vapi Text to Speech
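The arithmetic behind the list above can be sketched in a few lines. This is only a back-of-the-envelope check using the numbers quoted in the bullets; the dictionary keys and function names are made up for illustration, and the latency total assumes the stages run strictly one after another with no streaming overlap, which real pipelines may improve on.

```python
# Per-minute costs in USD, copied from the breakdown above.
COST_PER_MINUTE = {
    "vapi_hosting": 0.05,
    "deepgram_stt": 0.01,
    "gpt_4o": 0.07,
    "vapi_tts": 0.022,
}

# Per-turn latencies in milliseconds, also copied from the list above.
PIPELINE_LATENCY_MS = {"stt": 100, "llm": 600, "tts": 250}
TRANSPORT_LATENCY_MS = {"webrtc": 100, "twilio": 600}


def cost_per_minute(costs):
    """Sum the per-minute line items."""
    return sum(costs.values())


def cost_per_hour(costs):
    """Scale the per-minute total to one hour of talk time."""
    return cost_per_minute(costs) * 60


def turn_latency_ms(transport):
    """Total latency per turn, assuming fully sequential stages (an assumption)."""
    return sum(PIPELINE_LATENCY_MS.values()) + TRANSPORT_LATENCY_MS[transport]


if __name__ == "__main__":
    print(f"cost per minute: ${cost_per_minute(COST_PER_MINUTE):.3f}")
    print(f"cost per hour:   ${cost_per_hour(COST_PER_MINUTE):.2f}")
    for transport in TRANSPORT_LATENCY_MS:
        print(f"{transport}: {turn_latency_ms(transport)} ms per turn")
```

The unrounded total is slightly above the $0.15/minute headline, so the hourly figure lands a little over $9. Over Twilio the sequential latency comes to about a second and a half per turn.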

Realtime API

  • OpenAI handles direct audio to audio conversion but you pay $0.91/​minute

    • Caveat: I actually tried making a call and was charged $0.53/​minute for some reason, so I used that number instead.

They have a calculator here that’s fun to play with.

Comparison to Humans and Business Process Outsourcing (BPO)

Here are some top destinations US companies offshore to and their respective call center salaries, along with the hourly rates of Vapi TTS, Vapi OpenAI Realtime, and Bland.

| Country | Avg annual (local) | Avg hourly (local) | Approx annual (USD) | Approx hourly (USD) | Source |
| --- | --- | --- | --- | --- | --- |
| Egypt | EGP 128,478 | EGP 62/hr | $2,716 | $1.31 | ERI / SalaryExpert (ERI Economic Research Institute) |
| Vietnam | ₫83,603,022 | ₫40,194/hr | $3,174 | $1.53 | SalaryExpert / related (Salary Expert) |
| Philippines | ₱264,272 | ₱127/hr | $4,487 | $2.16 | SalaryExpert (ERI) (Salary Expert) |
| India | ₹429,359 | ₹206.42/hr | $4,809 | $2.31 | SalaryExpert (ERI) (Salary Expert) |
| Mexico | MXN 148,016 | MXN 71/hr | $7,670 | $3.68 | SalaryExpert (ERI) (Salary Expert) |
| Colombia | COP 30,441,760 | COP 14,635/hr | $8,061 | $3.88 | SalaryExpert (ERI) (Salary Expert) |
| Brazil | R$44,967 | R$22/hr | $8,319 | $4.07 | ERI / salary sites (ERI Economic Research Institute) |
| Bland Voice Agent | - | - | $11,232.00 | $5.40 | https://docs.bland.ai/platform/billing |
| South Africa | R198,779 | R96/hr | $11,487 | $5.55 | ERI / SalaryExpert (ERI Economic Research Institute) |
| Romania | RON 54,416 | RON 26/hr | $12,363 | $5.91 | SalaryExpert (Salary Expert) |
| Poland | PLN 61,205 | ≈PLN 29.4/hr | $16,684 | $8.02 | TTEC / salary writeups (TTEC Jobs) |
| Vapi TTS | - | - | $18,720.00 | $9.00 | https://vapi.ai/pricing |
| Canada | CAD 35,500 | CAD 16.83/hr | $25,186.01 | $11.95 | my friend |
| US | - | - | $38,854.40 | $18.68 | Indeed |
| Vapi OpenAI Realtime audio | - | - | $67,392.00 | $32.40 | https://vapi.ai/pricing |

We can see that Bland’s $0.09/minute ($5.40/hour) rate is competitive with South Africa, but it’s still cheaper to hire humans in most developing countries.

If one were to start a voice agent startup in Canada built on Vapi, they would pay $9/hour in API costs alone, while replacing a minimum wage worker who is paid about $12/hour. Add in the costs of onboarding, overhead, and salaries and you would be lucky to break even.
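To make the unit conversions explicit, here is a rough sketch of how the per-minute API prices turn into the hourly and annual figures in the table. The 2,080 hours per year (40 hours × 52 weeks) is an assumption inferred from the table, since it reproduces the annual numbers for the voice agent rows; the function names are hypothetical.

```python
HOURS_PER_YEAR = 40 * 52  # 2,080 hours; assumption inferred from the table


def hourly_usd(per_minute_usd):
    """Convert a per-minute price to a per-hour price."""
    return per_minute_usd * 60


def annual_usd(per_hour_usd):
    """Convert a per-hour price to a full-time annual cost."""
    return per_hour_usd * HOURS_PER_YEAR


def max_gross_margin(price_per_hour, cost_per_hour):
    """Best-case gross margin if the startup charges the full human wage."""
    return (price_per_hour - cost_per_hour) / price_per_hour


if __name__ == "__main__":
    for name, per_minute in [("Bland", 0.09), ("Vapi TTS", 0.15)]:
        hourly = hourly_usd(per_minute)
        print(f"{name}: ${hourly:.2f}/hour, ${annual_usd(hourly):,.2f}/year")

    # Canadian wage ($11.95/hour) versus Vapi API cost ($9.00/hour).
    margin = max_gross_margin(11.95, 9.00)
    print(f"Best-case gross margin in Canada: {margin:.1%}")
```

Even if the startup captured the entire Canadian wage as revenue, roughly a quarter of it would be left after API costs, before any onboarding, overhead, or salaries.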

Assumptions

The human is working at 100% utilization every hour they are paid (maybe unrealistic but cynically maybe not?).

The costs of onboarding and training humans and of setting up voice agent infrastructure and workflows are the same (voice agents are likely much cheaper, but idk).

Minimum wage front desk receptionists make around the same as call center workers and do the same kinds of tasks. This might not be totally true, e.g. receptionists also interact with people in person/​show them around.
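The 100% utilization assumption matters more than it looks, because voice agents are billed per minute of actual talk time while humans are paid for idle time too. A minimal sketch of the adjustment follows; the utilization values are hypothetical, and only the wage and agent rates come from the table above.

```python
def effective_hourly_cost(wage_per_hour, utilization):
    """Cost per hour of actual talk time for a human paid by the clock."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return wage_per_hour / utilization


def break_even_utilization(wage_per_hour, agent_rate_per_hour):
    """Utilization below which a per-minute-billed agent becomes cheaper."""
    return wage_per_hour / agent_rate_per_hour


if __name__ == "__main__":
    # Philippines wage ($2.16/hour) at a hypothetical 50% utilization.
    print(f"${effective_hourly_cost(2.16, 0.5):.2f} per hour of talk time")

    # How idle would that worker need to be before Bland ($5.40/hour) wins?
    print(f"break-even utilization: {break_even_utilization(2.16, 5.40):.0%}")
```

Under these numbers, a worker in the Philippines would have to spend well over half of each paid hour idle before Bland came out ahead.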

Limitations

Enterprise voice API contracts might offer bulk discounts in exchange for usage commitments and multi-year lock-in. I have no data on how this works because most enterprise pricing tends to be bespoke and private.

I mostly tested Vapi because Bland had a bunch of bugs and didn’t work. I also didn’t test enterprise platforms like Sierra or Ada because I’m not an enterprise.

I didn’t consider what the cheapest possible bespoke solution would be if you just went directly with foundation models/​self hosted open source + Twilio. This could be an interesting area for future research.

I didn’t consider the opportunity costs of having AIs take calls. Would the customer service/​receptionist people be replaced altogether, or be able to help with more administrative back office tasks? (assuming those aren’t also replaced by AIs).

Someone should do a study on price elasticity of demand in call centers/​receptionists. If we reduce the hourly rate by $1, how many more units of customer service would companies buy?

Presumably a large proportion of voice agents will be used for outbound sales in the future, increasing revenue instead of reducing cost centers like customer service.

I didn’t consider new voice model architectures like Cartesia or Boson AI.

The Future

Shrewd capitalists would realize GPU/​inference costs are massively decreasing every year, and perhaps do a discounted cash flow model of saved costs for the next decade as voice models beat every human on earth in cost/​hour.
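A minimal sketch of that discounted cash flow idea for a single replaced seat. The wage ($11.95/hour, growing 2% a year) and agent cost ($9.00/hour, falling 30% a year) are the Canadian and Vapi figures used in this post; the 10% discount rate, the 2,080 paid hours per year, and the eight-year horizon are placeholders, not figures from any source.

```python
HOURS_PER_YEAR = 2080  # assumption: 40 hours x 52 weeks


def yearly_savings(human_rate, agent_rate, human_growth, agent_factor, years):
    """Savings per seat for each year, starting with year 0 (today's rates)."""
    return [
        (human_rate * human_growth**t - agent_rate * agent_factor**t)
        * HOURS_PER_YEAR
        for t in range(years)
    ]


def npv(cash_flows, discount_rate):
    """Net present value, with the first cash flow undiscounted."""
    return sum(cf / (1 + discount_rate) ** t for t, cf in enumerate(cash_flows))


if __name__ == "__main__":
    savings = yearly_savings(
        human_rate=11.95,   # Canada, USD/hour
        agent_rate=9.00,    # Vapi TTS, USD/hour
        human_growth=1.02,  # wage grows with inflation
        agent_factor=0.70,  # inference cost falls 30% a year
        years=8,            # hypothetical horizon
    )
    for year, amount in enumerate(savings, start=2025):
        print(f"{year}: ${amount:,.0f} saved per seat")
    print(f"NPV at 10% discount: ${npv(savings, 0.10):,.0f}")
```

The first-year saving is small, and most of the value sits in later years, which is why the conclusion depends so heavily on the assumed cost decline.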

Assuming inference costs drop 30% per year and call center wages rise with each country’s inflation rate, we see most voice agents become competitive with the world’s cheapest human labor around 2030.

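| Country | Annual multiplier | 2025 | 2026 | 2027 | 2028 | 2029 | 2030 | 2031 | 2032 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Egypt | 1.10 | $1.31 | $1.44 | $1.59 | $1.74 | $1.92 | $2.11 | $2.32 | $2.55 |
| Vietnam | 1.03 | $1.53 | $1.58 | $1.62 | $1.67 | $1.72 | $1.77 | $1.83 | $1.88 |
| Philippines | 1.02 | $2.16 | $2.20 | $2.25 | $2.29 | $2.34 | $2.38 | $2.43 | $2.48 |
| India | 1.05 | $2.31 | $2.43 | $2.55 | $2.67 | $2.81 | $2.95 | $3.10 | $3.25 |
| Mexico | 1.04 | $3.68 | $3.83 | $3.98 | $4.14 | $4.31 | $4.48 | $4.66 | $4.84 |
| Colombia | 1.05 | $3.88 | $4.07 | $4.28 | $4.49 | $4.72 | $4.95 | $5.20 | $5.46 |
| Brazil | 1.09 | $4.07 | $4.44 | $4.84 | $5.27 | $5.75 | $6.26 | $6.83 | $7.44 |
| Bland Voice Agent | 0.70 | $5.40 | $3.78 | $2.65 | $1.85 | $1.30 | $0.91 | $0.64 | $0.44 |
| South Africa | 1.04 | $5.55 | $5.77 | $6.00 | $6.24 | $6.49 | $6.75 | $7.02 | $7.30 |
| Romania | 1.10 | $5.91 | $6.50 | $7.15 | $7.87 | $8.65 | $9.52 | $10.47 | $11.52 |
| Poland | 1.02 | $8.02 | $8.18 | $8.34 | $8.51 | $8.68 | $8.85 | $9.03 | $9.21 |
| Vapi TTS | 0.70 | $9.00 | $6.30 | $4.41 | $3.09 | $2.16 | $1.51 | $1.06 | $0.74 |
| Canada | 1.02 | $11.95 | $12.19 | $12.43 | $12.68 | $12.94 | $13.19 | $13.46 | $13.73 |
| US | 1.02 | $18.68 | $19.05 | $19.43 | $19.82 | $20.22 | $20.62 | $21.04 | $21.46 |
| Vapi OpenAI Realtime Audio | 0.70 | $32.40 | $22.68 | $15.88 | $11.11 | $7.78 | $5.45 | $3.81 | $2.67 |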
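Each row in the table is just the 2025 hourly rate compounded by a fixed yearly multiplier. The sketch below reproduces that calculation and finds the first year a voice agent undercuts a given country; the function names are made up for illustration, and the rates and multipliers are copied from the table.

```python
def project(rate_2025, multiplier, year):
    """Hourly rate in a given year, compounding from the 2025 baseline."""
    return rate_2025 * multiplier ** (year - 2025)


def crossover_year(agent, human, start=2025, end=2032):
    """First year the agent is strictly cheaper than the human, else None.

    `agent` and `human` are (rate_2025, multiplier) pairs.
    """
    for year in range(start, end + 1):
        if project(*agent, year) < project(*human, year):
            return year
    return None


if __name__ == "__main__":
    egypt = (1.31, 1.10)  # cheapest human labor in the table
    agents = {
        "Bland": (5.40, 0.70),
        "Vapi TTS": (9.00, 0.70),
        "Vapi OpenAI Realtime": (32.40, 0.70),
    }
    for name, agent in agents.items():
        print(f"{name} undercuts Egypt in: {crossover_year(agent, egypt)}")
```

Under these assumptions Bland undercuts Egypt in 2029 and Vapi TTS in 2030, while the Realtime option does not get there within the table's horizon.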

Conclusion

Should you start a voice agent company in 2025? Probably, if you find the right industry and raise enough VC money to stay alive for 5 years. Should we let the AIs handle all customer service inquiries, sensitive personal information, and make tool calls to Electronic Medical Record systems? That’s a question for another article :)