Salesforce is not backing away from generative artificial intelligence. Instead, the company is refining how large language models (LLMs) are used inside its flagship AI product, Agentforce, as enterprise customers shift from early experimentation to large-scale production deployments.
Over the past year, Salesforce chief executive Marc Benioff has repeatedly highlighted Agentforce as a transformative tool capable of automating business processes and reducing operational costs. Built on top of powerful LLMs, including models from OpenAI, Agentforce was designed to handle everything from customer service interactions to internal workflows.
As adoption has accelerated, Salesforce executives say a key lesson has emerged: while LLMs provide powerful raw intelligence, they cannot reliably run enterprise businesses on their own without structure, controls, and governance.
From AI Experiments to Production-Grade Systems
Across the technology sector, companies are moving beyond pilot programs and proofs of concept toward deploying AI in mission-critical roles. That transition has exposed limitations in relying solely on probabilistic models that interpret language and generate responses dynamically.
Salesforce’s response has been to strengthen the infrastructure around LLMs rather than reduce their role. The company has increasingly emphasized deterministic frameworks—rule-based logic, predefined workflows, and strict guardrails—that work alongside generative AI to ensure predictable outcomes.
This approach reflects a broader industry realization: enterprise AI must be grounded in accurate data, business logic, and governance if it is to operate safely and at scale.
Clarifying Comments on Trust in LLMs
Some discussion around Salesforce’s strategy has focused on comments made by Sanjna Parulekar, senior vice president of product marketing, regarding “trust” in LLMs. Salesforce says those remarks were made specifically in the context of customers transitioning from experimental use to production environments.
Rather than signaling a loss of confidence in generative AI, the comments underscored the need to augment LLMs with proprietary data, structured workflows, and oversight mechanisms. As AI agents take on more responsibility, the tolerance for unpredictability drops sharply.
Salesforce maintains that LLMs remain central to Agentforce’s capabilities, but that raw intelligence must be shaped into something enterprises can depend on day after day.
Guardrails as a Competitive Advantage
Agentforce was built to address exactly these challenges. Salesforce positions the platform as an AI orchestration layer that connects LLMs to trusted enterprise data, embeds them within business rules, and monitors their behavior in real time.
A Salesforce spokesperson emphasized that the company’s strategy is about optimization, not retreat.
“While LLMs are amazing, they can’t run your business by themselves,” the spokesperson said. “Companies need to connect AI to accurate data, business logic, and governance to turn the raw intelligence that LLMs provide into trusted, predictable outcomes.
“That’s why we built Agentforce: trusted AI infrastructure that drives real business value. We ground AI in tight guardrails and deterministic frameworks, optimizing LLMs to deliver enterprise-grade reliability. Trusted. Reliable. Secure. This is what AI is meant to be.”
Reliability and Cost at the Enterprise Level
One reason Salesforce is leaning into deterministic components is reliability. LLMs can occasionally hallucinate, skip steps, or lose track of objectives when conversations become complex. These risks grow more serious when AI agents handle sensitive functions such as billing, refunds, or operational reporting.
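To illustrate the kind of deterministic guardrail described here, the following is a minimal sketch in Python. It is not Agentforce code: the `ProposedRefund` type, the `check_refund` function, and the 100.00 auto-approval limit are all hypothetical, chosen only to show a fixed rule checking an action that a model has proposed before anything is executed.

```python
from dataclasses import dataclass

# Assumed policy limit for illustration only; not a Salesforce figure.
MAX_AUTO_REFUND = 100.00


@dataclass(frozen=True)
class ProposedRefund:
    """A refund an AI agent wants to issue (hypothetical structure)."""
    order_total: float
    amount: float


def check_refund(proposal: ProposedRefund) -> str:
    """Apply fixed business rules to a model-proposed refund.

    Returns "approve", "escalate", or "reject". The same input always
    yields the same decision, regardless of how the model phrased it.
    """
    # A refund must be positive and cannot exceed what the customer paid.
    if proposal.amount <= 0 or proposal.amount > proposal.order_total:
        return "reject"
    # Larger refunds are handed to a human rather than auto-approved.
    if proposal.amount > MAX_AUTO_REFUND:
        return "escalate"
    return "approve"
```

In this sketch the model may suggest any refund it likes, but only the rule decides whether the refund goes through.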
Cost is another factor. Running large-scale AI agents can be expensive, particularly when models are tasked with handling instructions better suited to simple logic. Salesforce says combining deterministic automation with generative AI helps reduce unnecessary processing while improving accuracy.
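The cost argument can be sketched as a simple router: requests that match a known, fixed answer are served by plain logic, and only the remainder reach a model. The example below is an assumption-laden illustration, not Salesforce's implementation. `RULES`, `route`, and `fake_llm` are hypothetical names, and `fake_llm` is a stand-in that returns a canned string instead of calling a real model.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Reply:
    text: str
    source: str  # "rule" if answered deterministically, "llm" otherwise


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (hypothetical; no network access)."""
    return f"[generated answer for: {prompt}]"


# Deterministic answers keyed by a normalized request (illustrative data).
RULES: Dict[str, Callable[[], str]] = {
    "store hours": lambda: "We are open 9am to 5pm, Monday to Friday.",
    "refund policy": lambda: "Refunds are available within 30 days of purchase.",
}


def route(request: str, llm: Callable[[str], str] = fake_llm) -> Reply:
    """Answer from a fixed rule when one matches; otherwise call the model."""
    # Normalize: trim whitespace, lowercase, drop trailing question marks.
    key = request.strip().lower().rstrip("?")
    handler = RULES.get(key)
    if handler is not None:
        return Reply(handler(), "rule")  # no model call, no model cost
    return Reply(llm(request), "llm")
```

A real deployment would need far more robust matching than exact string lookup, but the design choice is the same: the model is called only when interpretation is actually needed.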
Agentforce pricing reflects this blended approach, with customers paying per conversation or through prepaid usage credits. Optimizing when and how LLMs are used helps customers manage both performance and cost as deployments scale.
Customer Feedback Shapes Product Evolution
Customer experiences have played a significant role in shaping Agentforce’s evolution. Companies using the platform have found that clearly defined, repeatable tasks often perform better when handled by deterministic triggers rather than open-ended AI reasoning.
Salesforce executives say even advanced customers encounter issues such as AI “drift,” where agents lose focus when users introduce unrelated questions. To address this, the company is testing features like Agentforce Script, which emphasizes structured workflows while still allowing generative AI to step in when interpretation is required.
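The idea of a structured workflow that tolerates off-topic input can be sketched roughly as follows. This is not Agentforce Script syntax, which the article does not show; it is a hypothetical Python illustration in which `ScriptedFlow`, `STEPS`, the extractor functions, and `fake_llm` are all invented names. The script owns the order of steps, and the model stand-in is consulted only when a message does not fit the current step.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

Extractor = Callable[[str], Optional[str]]


def extract_order_id(message: str) -> Optional[str]:
    """Find a standalone 6-digit order number, if present."""
    match = re.search(r"\b\d{6}\b", message)
    return match.group() if match else None


REASONS = ("damaged", "late", "wrong item")


def extract_reason(message: str) -> Optional[str]:
    """Return the first allowed refund reason mentioned, if any."""
    lowered = message.lower()
    for reason in REASONS:
        if reason in lowered:
            return reason
    return None


# Each step: (field name, prompt shown to the user, extractor).
STEPS: List[Tuple[str, str, Extractor]] = [
    ("order_id", "Please provide your 6-digit order number.", extract_order_id),
    ("reason", "Was the item damaged, late, or the wrong item?", extract_reason),
]


def fake_llm(message: str) -> str:
    """Stand-in for a model reply to off-script input (hypothetical)."""
    return "I can't help with that in this conversation."


class ScriptedFlow:
    def __init__(self, llm: Callable[[str], str] = fake_llm) -> None:
        self.llm = llm
        self.collected: Dict[str, str] = {}

    def handle(self, message: str) -> str:
        progressed = False
        for name, prompt, extract in STEPS:
            if name in self.collected:
                continue
            value = extract(message)
            if value is not None:
                self.collected[name] = value
                progressed = True
                continue
            if progressed:
                # The message advanced the script; just ask for the next field.
                return prompt
            # Off-script input: reply briefly, then steer back to the step.
            return f"{self.llm(message)} {prompt}"
        return (
            f"Refund request for order {self.collected['order_id']} "
            f"({self.collected['reason']}) submitted."
        )
```

Because the pending step is recomputed from `collected` on every turn, an unrelated question cannot move the conversation off its path; the agent always returns to the next missing field.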
The goal is not to limit AI’s capabilities, but to ensure it behaves consistently in production environments.
Salesforce’s Own Deployment Reflects Maturity
Salesforce frequently points to its internal use of Agentforce as evidence of the platform's effectiveness, noting that AI now handles a substantial portion of its customer service interactions. While some of those interactions appear more structured than conversational, the company says this reflects intentional design choices aimed at improving resolution rates and customer outcomes.
According to Salesforce, improved observability and feedback loops allow teams to quickly identify where AI responses are too broad or off-topic and refine them accordingly.
Update: This article has been updated with input from a Salesforce spokesperson.




