Key Takeaways
- AI pilots primarily fail because agents are treated like add-ons or standalone tools — instead of being natively embedded in the existing flow of work.
- To scale successfully, agents must be fully integrated into the systems where work is happening and equipped with role-based access to the necessary context from all enterprise data systems.
- Escaping “pilot purgatory” requires implementing a centralized governance and performance management framework — including clear metrics and audit trails — across all deployed agents, regardless of their underlying AI model.
Over the past year, I’ve had conversations with hundreds of CIOs, from regional banks to Fortune 100 enterprises. They all tell me the same thing: They are tired of demos. Yet another demo, another pilot, and still no business value.
They see the potential. They’ve invested in AI. But they can’t get to broad deployment and ROI. Instead, they get stuck in “pilot purgatory.” The issue is widespread: According to an MIT study, 95% of enterprise generative AI pilots fail to deliver demonstrable ROI.
So what are the other 5% doing differently? What does it take to get from pilot to deployment at scale?
I run engineering, customer success, professional services, and our sales teams in India and ASEAN. I hear what customers ask for during the sales process; my teams help them successfully implement and expand agents across their organizations; and I make sure customer insights go straight back into improving our products. The role gives me a unique vantage point on the full agent deployment cycle, and from that position, I’ve learned a few things about why AI pilots stall, and how they can really take off.
If your sales team lives in Salesforce, your agents need to work there. If your engineering team lives in Slack, that’s where agents belong.
Srini Tallapragada, President and Chief Engineering & Customer Success Officer
Fundamentally, becoming an Agentic Enterprise isn’t about layering on AI tools; it’s about reimagining your work processes from the ground up.
Why Most Pilots Fail
The shift to agentic AI isn’t just technological — it’s also cultural, operational, and organizational. You can’t simply build an agent, ship it, and expect it to work at scale. These are the operational gaps where most enterprises get stuck:
Goals and Performance Metrics Are Unclear
Many people treat AI as a simple automation tool — an add-on to existing processes. They build an AI tool and ship it, yet they can’t assess whether it’s delivering value because they didn’t define specific business outcomes at the start. Agents need performance metrics, testing frameworks, continuous monitoring, and lifecycle management. You have to track their output, train them as requirements change, and define escalation paths for when they encounter edge cases. If you can’t manage it and measure it, you can’t scale it.
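To make this concrete, here is a minimal sketch of what a per-agent scorecard could look like: a business outcome and target defined before launch, output tracked continuously, and an escalation path for edge cases. All names and numbers are illustrative, not a description of any real product.

```python
from dataclasses import dataclass

# Illustrative sketch: every agent ships with a named business outcome,
# a measurable target, and an escalation path defined up front.
@dataclass
class AgentScorecard:
    agent: str
    business_outcome: str   # the outcome defined before launch
    target: float           # e.g., minimum acceptable resolution rate
    escalation_path: str    # who receives edge cases
    resolved: int = 0
    escalated: int = 0

    def record(self, was_resolved: bool) -> None:
        # Track every interaction so value can be assessed, not assumed.
        if was_resolved:
            self.resolved += 1
        else:
            self.escalated += 1

    @property
    def resolution_rate(self) -> float:
        total = self.resolved + self.escalated
        return self.resolved / total if total else 0.0

    def meeting_target(self) -> bool:
        return self.resolution_rate >= self.target

card = AgentScorecard("support-agent", "deflect routine cases",
                      target=0.8, escalation_path="tier-2 queue")
for ok in [True, True, True, False]:
    card.record(ok)
print(card.resolution_rate)   # 0.75
print(card.meeting_target())  # False: below the 0.8 target
```

The point of the sketch is the discipline, not the code: if the outcome, target, and escalation path aren’t defined at launch, there is nothing to measure against later.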
Agents Aren’t Embedded Where the Work Happens
Agents must be part of the systems people are using to get work done. When employees have to stop what they’re doing, switch tools, and re-enter information, adoption drops fast.
Context Is Missing
A deeper problem with isolating agents from the flow of work is context: Large Language Models (LLMs) alone are not enough. Agents need detailed context from enterprise systems to return useful answers and take actions. When agents don’t have that context, people fall into what we call the “prompt doom loop” — rewriting prompts to supply context that the agent should already know. It’s a frustrating employee experience, resulting in low adoption and stalled pilots. AI that’s disconnected from enterprise systems never earns sustained use.
Governance Is an Afterthought
When we started Agentforce on help.salesforce.com, we tracked every message. That was doable at a few thousand conversations. Once volume hits millions of conversations, human review becomes impossible. But an auditor still asks: “How do you know your agents are compliant? Show me the audit trail.”
Most pilots work in controlled environments, but at scale, legal shuts them down because there’s no framework for permissions, audit trails, or compliance. You need role-based permissions, approval workflows, and audit capabilities — just like you’d have for employees. If you can’t prove governance, you can’t get to production.
Infrastructure Deficiencies Limit Scale
You need infrastructure to manage agents at scale. Most enterprises build pilots that work in isolation, then realize they can’t scale without rebuilding everything. They have no way to test agent behavior before wider deployment, no monitoring systems to catch problems in production, and no frameworks for updating agents as business logic changes. The platform debt compounds until the pilot can’t move forward.
What We’ve Learned
A year ago, we deployed Agentforce across every part of our business, working through all of these potential blockers on our way to becoming our own “Customer Zero.” The results speak for themselves:
In support, Agentforce now handles the bulk of daily customer conversations (over 2 million to date). You can see this working in real time on help.salesforce.com — we even publish exactly how many more conversations the help agent now handles compared to its human teammates. This unlocked capacity allows us to shift employees from offering reactive support to providing proactive service — helping customers get their own Agentforce agents up and running in just weeks, even days.
In engineering, AI agents handle routine tasks, code maintenance, proactive threat monitoring, and rapid incident response. This has led to 30% cycle time improvement, with agents detecting 91% of incidents within eight minutes and auto-remediating 87% of them in under 20 minutes. Now engineers have the time to focus on more strategic development: new features, even greater reliability, and improved quality.
In sales, our website and events were generating 250,000 leads per week, but we could only follow up on the top 25% of highly qualified leads. It was just too expensive to do more. Now our SDR agent handles personalized outreach and qualification, generating $60 million of annualized pipeline during initial rollout. We’re reaching customers we couldn’t before.
From those deployments, and from working with more than 12,000 Agentforce customers over the past year, here are a few key lessons:
Agents Only Work in the Flow of Work
If your sales team lives in Salesforce, your agents need to work there. If your engineering team lives in Slack, that’s where agents belong.
A truly integrated agent understands your business process: when a customer contacts support, it knows their purchase history, open cases, contract terms, and escalation paths.
Agents need context from your CRM, service platform, data warehouse, and collaboration tools — and they need to understand the relationships between these systems and the business logic that governs data flow, with specific role-based permissions that control access. That context can’t be assembled at prompt time — it has to be native in the place where work is happening.
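One way to picture "native context" is as a server-side assembly step: context is gathered from the systems of record and filtered by the agent's role before any prompt is written, so the user never has to paste it in. The system names and role scopes below are hypothetical.

```python
# Illustrative sketch: context comes from the systems of record and is
# filtered by role-based permissions before the agent ever sees it.
ENTERPRISE_DATA = {
    "crm":       {"purchase_history": ["Q1 renewal", "add-on seats"]},
    "service":   {"open_cases": ["CASE-1042"]},
    "contracts": {"terms": "enterprise SLA, 4-hour response"},
}
ROLE_SCOPES = {
    "support-agent": {"crm", "service", "contracts"},
    "sdr-agent":     {"crm"},  # no access to cases or contract terms
}

def build_context(agent_role: str) -> dict:
    # Assemble only the systems this role is permitted to read;
    # anything outside the scope is invisible to the agent.
    scope = ROLE_SCOPES.get(agent_role, set())
    return {src: data for src, data in ENTERPRISE_DATA.items()
            if src in scope}

print(sorted(build_context("sdr-agent")))      # ['crm']
print(sorted(build_context("support-agent")))  # ['contracts', 'crm', 'service']
```

Because the filtering happens before prompting, there is nothing for the user to rewrite — which is exactly what breaks the "prompt doom loop" described above.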
Agents Need Performance Management
You need to define what each agent can do: what data it can access, what actions it can take, what requires human approval. You need audit trails showing what every agent did, when, and why. You need testing frameworks that validate agent behavior before deployment and monitoring systems to track performance after deployment.
Think of it as agent lifecycle management: clear goals and metrics, performance management, continuous training, and escalation paths when agents encounter edge cases.
Use One Model or Many, but Commit to Singular Governance
For most customers, the underlying AI model is just infrastructure. All they really want to know is that their SDR agent is closing more leads; their service agent is resolving issues faster.
Yet some — typically with advanced technical teams or specialized requirements — want choice: OpenAI for certain use cases, Anthropic for others, Google’s models for specific tasks, or specialized models optimized for cost and performance.
But every agent, regardless of which model powers it, has to operate within your enterprise’s governance framework.
When APIs came onto the scene, we learned that you need a central way to govern and monitor every API across the enterprise. Agents are no different — you need a command center to register, audit, and manage them, regardless of their source.
Salesforce now supports open standards like MCP (Model Context Protocol) so agents can discover tools and capabilities across systems, but MCP alone isn’t enough. You still need the trust layer that enforces permissions, audit requirements, and compliance across everything. Think of it like API discovery versus API security: knowing what’s available doesn’t mean every agent should have unrestricted access.
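The discovery-versus-security split can be sketched as two separate steps: a discovery protocol tells the agent what tools exist, and a trust layer decides what it may actually call, denying by default. The tool names and policy below are hypothetical, not drawn from any real MCP server.

```python
# Illustrative sketch: discovery (what MCP-style protocols provide)
# and enforcement (what the enterprise trust layer must add) as
# deliberately separate steps.
DISCOVERED_TOOLS = {"query_warehouse", "send_email", "delete_records"}

POLICY = {  # per-agent allowlist enforced by the trust layer
    "analytics-agent": {"query_warehouse"},
}

def usable_tools(agent: str) -> set[str]:
    # Discovery tells the agent what exists; policy decides what it
    # may call. Unknown agents and unlisted tools are denied by default.
    return DISCOVERED_TOOLS & POLICY.get(agent, set())

print(usable_tools("analytics-agent"))  # {'query_warehouse'}
print(usable_tools("unknown-agent"))    # set()
```

The design choice worth noting is the default: an agent with no policy entry can see that tools exist but can invoke none of them.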
The Agentic Enterprise
The companies that win this next phase won’t be the ones experimenting with AI — they’ll be the ones running on it.
When CIOs ask me where to begin, I tell them: Pick one low-risk use case. Build one agent. Ship it. Then expand by connecting to more data, adding skills, and plugging into collaborative platforms.
Starting with lower-risk internal use cases — employee tools, knowledge management, channel summaries — is a good way to build organizational confidence and get quick wins. Once you prove it works, move to customer-facing deployments. Invite customer feedback. Iterate.
That’s how you escape pilot purgatory. That’s how you turn AI from potential into performance as an Agentic Enterprise.
Go deeper:
- Watch Salesforce-on-Salesforce: 5 Agentic Lessons
- Dive into Agentforce stories and lessons learned after more than a million conversations and the one-year mark