
Most of what gets written about AI and outsourcing right now is some version of the same headline. AI is coming for the BPO industry. Bots will replace agents. Contact centers will shrink. Pick your variation.
I run a BPO. I read the same articles you do. And the part that keeps getting missed in those takes is the work that is actually growing.
It is not the work that AI is taking. It is the work AI is creating.
Every serious AI deployment I have seen in the last 18 months, whether it is a frontier lab training a new model, a Fortune 500 rolling out an agent for claims adjudication, or an early-stage company building voice automation for a healthcare client of ours, has the same hidden requirement. Humans. Specific humans, doing specific work, at specific points in the pipeline. They label data. They rank outputs. They red-team systems before launch. They review edge cases. They handle the calls, chats, and tickets AI escalates. They sign off on the decisions that a model is not allowed to make alone.
This is human-in-the-loop work. And human-in-the-loop outsourcing, as a category, is where I believe the most durable growth in our industry will come over the next five to ten years.
This piece is my view on why that is, what the work actually looks like in practice, and how AI companies and AI-deploying enterprises should think about partnering with a BPO for it.

What Human-in-the-Loop Actually Means in Operations
The phrase “human-in-the-loop” gets used loosely. In academic literature, it has a precise meaning. In commercial deployment, it covers a wider span of activities. Both matter.
At its core, human-in-the-loop means a workflow where an AI system performs a task and a human is positioned to intervene, correct, validate, or improve the system’s output. The human is part of the operating loop, not a downstream afterthought.
In practice, the work breaks into four functional categories.
Training-side work. This is the labor that goes into building the model in the first place. Data labeling, intent classification, entity tagging, image annotation, sentiment scoring, transcription, and the rest of the foundation work that supervised learning depends on. As of 2026, the global data collection and labeling market is projected to reach roughly $17 billion by 2030, growing at over 28 percent CAGR. That is not a market in decline.
Alignment and evaluation work. This is what most people mean when they say RLHF, or Reinforcement Learning from Human Feedback. Domain experts review pairs of model outputs and rank them. They write better answers. They explain why one response is correct and another is subtly wrong. This is the work that turns a raw language model into something usable. Reports place each major frontier lab’s annual spend on human-generated training data at around $1 billion. Industry estimates put the global shortfall of RLHF-qualified workers at roughly 30 million people. Demand is outrunning supply by a wide margin.
Red-teaming and safety work. Before a model ships, humans try to break it. They craft adversarial prompts. They look for harmful outputs. They probe for prompt injection vulnerabilities. In regulated industries, this work is no longer optional. The EU AI Act, FDA frameworks for clinical AI, EEOC guidance on AI in employment decisions, and similar regimes elsewhere now require documented human review and audit trails for high-risk systems. There is no version of “we will just have the model audit itself” that passes legal review.
Live operational oversight. This is the layer that most enterprise buyers underestimate when they deploy AI into production. It is the queue monitor watching an AI voice agent’s calls in real time, ready to take over when the model gets confused. It is the claims reviewer who signs off on AI-suggested denials before they go out the door. It is the content moderation supervisor who catches what the classifier misses. It is the human escalation path that customers actually want to reach when something goes wrong.
Each of these four categories is its own discipline. The skills do not transfer cleanly between them. A great image annotator is not automatically a great red-teamer. An RLHF specialist with a PhD in molecular biology is not the right person to handle a live healthcare call escalation. Companies that treat human-in-the-loop work as one undifferentiated bucket end up with bad outcomes in all four.
What This Has Looked Like Inside Our Own Business
I will get specific, because I think a real example is more useful than the abstract version.
Forty-two months ago, more than 80 percent of SuperStaff’s book of business sat inside biopharmaceutical, insurance, healthcare, and durable medical equipment. We were a focused healthcare-adjacent BPO with a clear identity and a clear revenue mix.
Then came AI.
In the first phase, AI was the single largest source of churn we had ever seen. It was not subtle. Clients automated legacy seats, especially in data entry and high-volume chat support. Pricing pressure intensified on anything that looked routine. Some accounts we had served for years walked away, not because we did the work badly, but because the work itself was being absorbed into software. For a stretch, AI looked like an extinction-level event for our category.
Then the second phase started, and it surprised me.
AI became the single largest source of new account growth and account expansion we have ever recorded. Not in spite of automating those legacy seats. Because of it. Every AI deployment that took a seat away was also creating a new requirement somewhere upstream or downstream. Training data needed to be labeled. Model outputs needed to be evaluated. AI agents needed humans monitoring their queues. Exceptions needed to be triaged. Compliance reviews needed to be documented. The work shifted shape, but it did not disappear. It moved up the value chain.
Today, AI companies are our third-largest industry served when measured as a standalone sector. That is a meaningful shift from a portfolio that had effectively zero AI exposure three and a half years ago.
But the more important shift is harder to see on a pie chart. Human-in-the-loop operations now touch nearly every account we run, including in the biopharmaceutical and logistics segments that remain larger pieces of our portfolio. Those clients are not “AI companies” in the labeling-data sense, but they are AI operators. They are running AI inside their workflows, and they need human teams positioned around those models. Their pharmacovigilance teams use AI to surface adverse event signals from case narratives, and humans adjudicate the ones that matter. Their logistics dispatch teams use AI to route and flag exceptions on shipments, and humans handle the exceptions. Their patient access workflows use AI to draft prior authorization submissions, and humans review them before they go to the payer.
What looked like an industry under siege turned out to be an industry shifting form. The companies that read the signal early and rebuilt around it are positioned for the next decade. The companies that did not are running out of time.
Why This Work Is Concentrating on Outsourced Operations
You can build any of these capabilities in-house. Some companies should. Most will not, and the reasons are not the ones people usually cite.
Cost is the obvious driver and the least interesting one. Yes, hiring a 200-person evaluation team in Manila or Medellín costs a fraction of what it would in San Francisco or London. That fact has been true for two decades and does not explain what is changing.
The more interesting reasons are three.
The first is talent density at the right skill level. RLHF work, at its best, is done by people with real domain depth. Medical professionals reviewing clinical AI outputs. Lawyers reviewing legal reasoning. Bilingual specialists reviewing translation quality across language pairs. The Philippines has roughly 700,000 college graduates per year, a healthcare-trained workforce that supports the global medical transcription and billing industry, and English proficiency that consistently ranks in the top tier of non-native populations worldwide. Colombia offers a deep bilingual talent pool, cultural alignment with North American clients, and a growing engineering base. These are not abstract advantages. They show up in the work.
The second is operational maturity. The boring parts of running human-in-the-loop work at scale matter enormously. Information security. Data privacy. ISO 27001 certification. HIPAA-compliant infrastructure. Workforce management. Quality assurance scorecards. Calibration sessions. Inter-annotator agreement metrics. Shift coverage. Attrition management. These are the muscles that good BPOs have built over decades. Asking an AI company to develop them from scratch is asking it to become a different kind of company.
The third is elasticity. AI training cycles are lumpy. A lab might need 400 evaluators for six weeks during a model push, then 80 evaluators for steady-state evaluation, then 600 again three months later when a new capability comes online. Hiring and firing at that cadence in your own headcount is operationally destructive. Doing it through a partner that runs hundreds of campaigns in parallel is normal Tuesday work.
What Buyers Should Actually Look For
If you are at an AI company or at an enterprise deploying AI into customer-facing or revenue-critical workflows, the partner selection question is more nuanced than most procurement processes capture.
A few questions worth asking that often get skipped.
Is the workforce in-house or subcontracted? This matters more than people realize. Vendors that subcontract to crowdsourced platforms cannot guarantee who is doing your work, where they are doing it, or what training they have received. For sensitive data, regulated content, or anything that needs auditability, in-house workforces are the only defensible answer. SuperStaff runs an in-house workforce of roughly 500 across our operations in the Philippines and Colombia, and this was one of the most important structural choices we made early on.
What is the quality measurement layer? Anyone can put humans in a queue. Real human-in-the-loop operations require active quality measurement. Inter-annotator agreement scores. Gold-set audits. Drift detection. Calibration sessions. If your partner cannot describe how they measure quality before you ask, they are not measuring it.
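To make two of those measures concrete, here is a minimal sketch of an inter-annotator agreement score and a gold-set audit. It assumes two annotators labeling the same items and uses Cohen's kappa as the agreement statistic, which is one common choice among several. The function names and sample labels are hypothetical, for illustration only, and are not a description of any particular vendor's tooling.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)
    # Share of items where the two annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


def gold_set_accuracy(labels, gold_labels):
    """Share of an annotator's labels that match a trusted gold set."""
    if len(labels) != len(gold_labels) or not gold_labels:
        raise ValueError("labels and gold set must be the same non-empty length")
    return sum(l == g for l, g in zip(labels, gold_labels)) / len(gold_labels)


# Hypothetical sample: two annotators marking ten model outputs.
annotator_a = ["yes", "yes", "yes", "no", "no", "no", "yes", "no", "yes", "no"]
annotator_b = ["yes", "yes", "no", "no", "no", "no", "yes", "no", "yes", "yes"]

print("kappa:", round(cohens_kappa(annotator_a, annotator_b), 2))
print("gold-set accuracy:", gold_set_accuracy(annotator_b, annotator_a))
```

Raw percent agreement overstates quality when one label dominates, which is why a chance-corrected statistic is the more useful number to ask a partner for. Where to set the acceptable threshold is a judgment call that depends on the task.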
What is the security posture? ISO 27001 certification. SOC 2 if relevant. HIPAA infrastructure for healthcare work. Documented data handling procedures. Physical security at the delivery sites. Penetration testing cadence. Background check protocols. The fact that a piece of work is “just data labeling” does not lower the security bar. It often raises it, because labeled training data carries forward into every model that learns from it.
Can the partner ramp the right talent, not just headcount? When you need 50 nurses to review clinical AI outputs, can your partner produce 50 nurses, or only 50 generalists with one day of training? This is the question that separates partners who actually understand human-in-the-loop work from partners selling repackaged commodity BPO.
Where does the partner stand on AI itself? Be wary of BPOs that are still in denial about the changes AI brings. The partners worth working with have already made the strategic shift. They are deploying AI internally to make their own people more productive. They are training their workforce in prompt engineering, model evaluation methodologies, and related skills. They view AI as a tool they use, not a threat they hide from.
Where I Think This Is Going
I will close with the speculative part, because I think it is worth saying out loud.
The conventional wisdom is that AI shrinks the BPO industry. I think the truth is more interesting. AI is splitting the BPO industry in two.
One half will continue to do what BPOs have done for thirty years. Tier-one customer service, simple back-office processing, basic data entry. That work is shrinking. Some of it is already gone. Some of it will be gone by the end of the decade. Companies still building businesses around that work have a clear shelf life and should be planning accordingly.
The other half is the half that builds human-in-the-loop operations as a core competency. This half grows. It grows because every AI system that gets deployed creates new human work upstream and downstream of the model. It grows because regulators are codifying the requirement for human oversight in the systems that matter most. It grows because customers in healthcare, finance, logistics, and other regulated sectors will not accept fully automated decisions, no matter how good the model gets. It grows because the question is no longer whether AI changes operations. It is who is in the loop when it does.
This is the direction we have already taken at SuperStaff. Not as a defensive move. As a strategic one, made under real pressure, with real account losses on one side and real account growth on the other. The work the next generation of AI requires is the kind that disciplined, in-house BPO operations are uniquely positioned to deliver. Our healthcare-heavy foundation, our security posture, our domain talent in the Philippines and Colombia, and our willingness to be transparent about where AI fits and where humans remain non-negotiable are all features, not accidents.
If you are running an AI company and trying to figure out how to scale your evaluation workforce without building one from scratch, talk to us. If you are deploying AI into a regulated workflow and need an oversight layer that will hold up in an audit, talk to us. If you are somewhere in between and just trying to figure out what the next three years look like, talk to us anyway. The conversations themselves are useful, even when they do not lead to a contract.
The future of outsourcing is not AI replacing humans. It is humans, AI, and the operational infrastructure that enables them to work together effectively. The companies that figure out that third piece are the ones that win.




