Agentforce Guide to Achieving Reliable Agent Behavior
A Framework for 5 Levels of Determinism
Olfa Kharrat, Director of Product Management - Agentforce
Reinier van Leuken, Senior Director of Product Management - Agentforce
Agentforce building blocks and agentic reasoning
Defining levels of agentic control
Level one agentic control: Reasoning with instruction-free topic & prompt action selection
Level two agentic control: Instructions
Level three agentic control: Grounding
Level four agentic control: Variables
Level five agentic control: Deterministic actions with flows, Apex, and APIs
Trust has been the #1 value at Salesforce ever since it was founded in 1999, pioneering a new technology model of cloud computing and SaaS. Businesses place their trust in Salesforce by storing valuable company data in the cloud, knowing this data is safe and governed by the appropriate access controls. That is still critical, but in the age of agentic AI, the definition of trust is even broader. As companies rely increasingly on autonomous agents to perform critical business functions, agents must become trusted business partners whose work is accurate, relevant, and, above all, reliable.
So how does one build a reliable agent? Reliability typically means providing the same result for the same input. However, agents don’t necessarily work like that, because they are powered by large language models (LLMs), which are non-deterministic by nature. That gives agents the fluidity to develop creative solutions tailored to specific circumstances, without needing to explicitly program each condition or situation they encounter. However, agents also need governance to comply with business requirements and adhere to operational guidelines. When executing business processes, they must demonstrate reliability and produce business outcomes that conform to deterministic constraints. Determinism imposes a rigidity and discipline that clashes with the autonomy and fluidity that agents provide. Therefore, the key to successful agent creation is to strike the right balance between creative fluidity and enterprise control.
This document addresses key considerations for developing reliable agents. It defines five levels of agentic control and provides best practices for gaining and keeping control over agentic behavior at each of these levels. The guidance reflects how the Agentforce reasoning engine works. As Agentforce grows, this document will be updated to reflect the latest best practices.
This document assumes basic familiarity with designing and building Agentforce agents. For an introduction to Agentforce, we recommend the following:
To understand agentic behaviour better, let’s first compare agents with their rigid counterparts: chatbots.
Chatbots: Rigid Rule-Followers
Chatbots follow predetermined decision trees that structure the dialogs they can participate in. Traversal through these decision trees is based on the answers given by the user. An answer may be a selection from a predetermined set of options or free text; in the latter case, a predictive model is used for intent classification. These trees map out all potential conversational pathways and dictate the chatbot's responses at each step. The chatbot's behavior is rigidly determined by pre-set rules. If a user's input doesn't match a recognized path, or if the predictive model wasn't trained to recognize a certain intent, the chatbot fails to respond appropriately.
Agents: Adaptive and Intuitive
In contrast, agents leverage the power of LLMs and their advanced capabilities in natural language processing (NLP). LLMs enable agents to comprehend the intent behind a user's input, even if it's phrased in an unexpected way. Based on its understanding of intent, the agent can select the most appropriate action from among a range of possibilities. An agent can even formulate entirely new responses. This flexibility and adaptability set agents apart from their chatbot counterparts.
A Culinary Analogy
The difference between chatbots and agents can be likened to the contrast between a novice cook and a seasoned chef.
In summary, the fundamental difference between agents and chatbots lies in their adaptability and ability to handle unexpected input.
A distinct feature of an agent's intelligence lies in its ability to orchestrate and trigger the most suitable actions at the right moment. This flexibility eliminates the need to extensively program every potential user interaction.
In Agentforce, building an agent involves topics, actions, and natural language instructions and descriptions.
Topics are the "jobs-to-be-done" for the agent. Topics have attributes like a classification description, scope, and instructions that define each job and how it’s done. Topics contain actions that the agent can execute, along with instructions that govern how these actions are executed.
Actions are the predefined tasks the agent can perform to do its job. There are five different types of actions: executing Apex code, calling an API, executing a flow, getting an LLM response to a prompt template, and calling a predictive model.
An agent’s definition contains natural language instructions that describe the agent’s assets and define the guidelines within which it must operate. Instructions are written for actions and topics.
These various building blocks, when built correctly, help an agent carry out its intended purpose while operating within the appropriate boundaries.
The Agentforce reasoning engine orchestrates these building blocks into the correct agentic behavior. It leverages the natural language instructions and descriptions defined in topics and actions. It’s built on ReAct, a novel reasoning paradigm for LLMs introduced in 2022 by Yao et al. This paradigm mimics human task management by reasoning about a problem, taking action, observing the results of the action, and repeating the cycle until task completion.
Salesforce agents adhere to this paradigm:
The reasoning engine uses LLMs at every reason and observe step. Depending on the action type, it can also use LLMs in the Act step.
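To make this cycle concrete, the following minimal Python sketch illustrates a reason-act-observe loop. It is purely illustrative and not Agentforce code: the `call_llm` helper and the `actions` mapping are hypothetical stand-ins for the reasoning-engine LLM and the action types described above.

```python
# Minimal ReAct-style loop (illustrative only, not Agentforce internals).
# `call_llm` stands in for the reasoning-engine LLM; `actions` maps action
# names to stand-in tool functions.

def call_llm(prompt: str) -> dict:
    """Placeholder for an LLM call that returns the next step as structured data."""
    raise NotImplementedError  # wire up a real model to run this sketch

def react_loop(user_message: str, actions: dict, max_turns: int = 8) -> str:
    history = [f"User: {user_message}"]
    for _ in range(max_turns):
        # Reason: ask the LLM what to do next, given the conversation so far.
        step = call_llm("Decide the next step.\n" + "\n".join(history))
        if step["type"] in ("respond", "ask_user"):
            # Either enough information was gathered, or the user must clarify.
            return step["text"]
        # Act: run the selected action with the proposed inputs.
        result = actions[step["action"]](**step.get("inputs", {}))
        # Observe: feed the result back into the history and reason again.
        history.append(f"Observation from {step['action']}: {result}")
    return "Unable to complete the request within the allowed number of steps."
```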
This section outlines a layered approach to enhancing the determinism of agents. Each level builds upon the previous one, with increasing complexity and capabilities that establish more control over the agent’s behaviour.
The first level focuses on enabling agents to autonomously identify relevant topics, and then select appropriate actions using goals rather than explicit instructions. The core mechanism is the agent's contextual understanding of user inputs. Although technically any action type can be added, at this level we assume the actions to be prompt actions. Instruction-free topics with prompt actions provide a quick and efficient way to handle common queries.
At this level, the emphasis is on establishing a baseline level of agent responsiveness and autonomy through dynamic understanding.
Building upon the foundation of instruction-free action selection, this level introduces explicit instructions to guide agent behavior. Adding precise instructions increases control over how agents respond to different situations. Instructions to agents can be expressed as rules, guidelines, guardrails, and examples. These provide the agent with specific direction on how to handle various topics, execute actions, and process their outputs. The goal for this level is to provide clear guidance to the agent in order to increase consistency and improve adherence to company guidelines and processes.
Grounding involves connecting the agent's understanding and responses to external knowledge sources. Grounding helps ensure that the information provided by the agent is more accurate, up-to-date, and relevant. This level integrates access to databases, knowledge bases, and other information repositories. Grounding the agent's responses in verified data enhances its reliability and trustworthiness.
This level adds the capability for agents to work with variables. Variables allow agents to personalize interactions, retain context across multiple interactions, and dynamically adjust their behavior based on specific data points maintained during the agent session. For example, an agent could capture user preferences, order details, and other relevant information, and then use that data to tailor the interaction. With variables, agents are better able to handle more complex, more prescribed, and more personalized interactions.
The final step integrates the agent with Salesforce's core functionalities: Apex, APIs, and flow. Integration allows the agent to perform complex actions within the Salesforce ecosystem, such as accessing and manipulating data, triggering workflows, and interacting with other systems.
This level transforms the agent into a powerful tool capable of executing sophisticated tasks and contributing directly to business outcomes.
Beginning with a baseline of agent responsiveness and autonomy, consider an agent that consists only of topics and actions, with their corresponding descriptions. We can use this example agent to introduce the different steps of the reasoning engine and to show how it leverages these descriptions to select the right topics and then actions to execute. By omitting topic instructions from this example, we can observe that agents in this first level have the largest degree of freedom when compared with agents at higher levels. In level one, the agent is completely free to select the action it thinks is appropriate, based solely on the ongoing conversation.
| Activity | Steps | Description |
|---|---|---|
| Agent Invocation | 1 | Agent is invoked. |
| Topic Classification | 2-3 | The engine analyzes the customer's message and matches it to the most appropriate topic based on the topic name and classification description. |
| Context Assembly | 4-5 | Once a topic is selected, the engine gathers the topic's scope, instructions, and available actions along with the conversation history. (Note: Instructions are covered in level two of agentic control.) |
| Decision Making | | Using all this information, the engine determines whether to run an action to retrieve or update information, ask the customer for more details, or respond directly with an answer. |
| Action Execution | 6-8 | If an action is needed, the engine runs it and collects the results. |
| Action Loop | | The engine evaluates the new information and decides again what to do next: whether to run another action, ask for more information, or respond. |
| Grounding Check | | Before sending a final response, the engine checks that the response is based on accurate information from actions or instructions, follows the guidelines provided in the topic's instructions, and stays within the boundaries set by the topic's scope. |
| Send Response | | The grounded response is sent to the customer. |
Review the following considerations for the Reasoning Engine:
The reasoning process involves four main steps:
Topics are designed to improve the accuracy with which agents classify the right action or sequence of actions. Each topic should consist of semantically distinct actions that can all be covered by a concise topic description and thus belong to a similar agent function.
The right topic is selected by the reasoning engine LLM (step 2-3 of the diagram). It selects the topic whose classification description matches the last utterance most closely, using a topic prompt. This topic prompt contains the classification descriptions of all topics and the conversation history. In addition to utterances and agent replies, the conversation history includes executed actions and their outcomes. Furthermore, the prompt incorporates crucial instructions that mandate analysis within the context of the conversation history, and require the LLM to share its thought process.
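For illustration only, the sketch below shows how such a topic prompt could conceptually be assembled from classification descriptions and the conversation history. The topic names, descriptions, and prompt wording are hypothetical and simplified; this is not the actual prompt used by the reasoning engine.

```python
# Illustrative only: assembling a topic-selection prompt from classification
# descriptions plus conversation history. Not the actual Agentforce topic prompt.

topics = {
    "Order Management": "Handle questions about existing orders, shipping, and warranties.",
    "Product Support": "Troubleshoot technical problems with purchased products.",
}

def build_topic_prompt(history: list[str]) -> str:
    described = "\n".join(f"- {name}: {desc}" for name, desc in topics.items())
    return (
        "Select the topic that best matches the latest user message.\n"
        f"Topics:\n{described}\n"
        "Conversation so far:\n" + "\n".join(history) + "\n"
        "Explain your reasoning, then answer with the single best topic name."
    )
```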
Additional considerations:
The purpose of topics is twofold:
By carefully organizing agent capabilities into clearly defined topics made up of related actions, the agent operates more effectively and predictably, and it’s easier to update and expand. There are two possible approaches to topic design: top-down and bottom-up.
Both approaches lead to good results when followed properly.
Begin by listing all the specific actions the agent should be capable of performing. At this stage, it's better to be very specific rather than too general. Avoid trying to group or simplify actions prematurely. The goal is to create a comprehensive and granular view of what the agent can do.
For example, in the case of a customer service agent, the initial list might include:
Note that an action like “Resolve customer complaints” is too broad at this point. Actions should represent the smallest level of granularity in agent behavior. Complaints can be of many types, and different actions already cover them:
Mark actions that are similar in nature because they may cause confusion for the reasoning engine. Their descriptions won’t be sufficiently different semantically, and therefore the reasoning engine won’t know which action to select in step 5.
For example “Troubleshooting tech problems” and “Answer Question with Knowledge” have similar descriptions, but their functionality may differ significantly. Marking such semantic overlaps will help to identify which actions to separate across multiple topics.
Once actions are clearly defined and their semantic overlaps have been identified, actions can be grouped into preliminary topics. A topic is a logical category of functionality—a grouping of actions that together represent a coherent capability or skill of the agent.
When creating these groupings:
Here’s an example of an initial grouping for a customer service agent:
Topic 1:
Topic 2:
Once you have the initial grouping, write classification descriptions for each topic.
After refining, we get:
To recap, first create a comprehensive list of all possible actions, then mark semantic overlap between these actions. Next, create a set of topics that, at a minimum, solves for all semantic overlap (so that the reasoning engine won’t get confused within the confines of one topic). Then write every topic’s classification description. If the topics are too broad in scope, break them up into more granular topics. By implementing this guidance, you can build an agent that not only performs well, but is also easy to maintain and extend.
This structure supports better reasoning, more accurate execution, and clearer decision boundaries within the agent’s behavior. It also relies on collaboration among designers, engineers, and subject-matter experts to make the agent’s capabilities more transparent and modular.
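As a purely illustrative way to visualize the outcome of this exercise, the design could be captured in a simple structure like the one below. The topic names, classification descriptions, and action names are hypothetical examples, not a prescribed configuration.

```python
# Hypothetical outcome of the bottom-up exercise: each topic groups semantically
# distinct actions under one concise classification description.

agent_design = {
    "Order Management": {
        "classification": "Handle order lookups, shipping status, and warranty questions.",
        "actions": ["Lookup Order", "Check Shipping Status", "Retrieve Warranty Policy"],
    },
    "Technical Support": {
        "classification": "Guide customers through troubleshooting of purchased devices.",
        "actions": ["Issue Resolution Steps", "Use in the Middle of Solving an Issue"],
    },
    "General Inquiries": {
        "classification": "Answer general product and policy questions from the knowledge base.",
        "actions": ["Answer Question with Knowledge"],
    },
}
```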
Further considerations for effective topic creation
Imagine a service agent receives a warranty policy request for a watch. The warranty issue does not appear to be related to product exchange or support. Order management seems to be the most appropriate topic to address this request. Hence the reasoning engine chooses the Order Management topic as the most probable one to fulfill the request.
After topic selection, the reasoning engine selects the right actions to execute from the selected topic. Again, the reasoning engine LLM is responsible for this, using another prompt called the observation prompt. The purpose of the observation prompt is to obtain the next step in the reasoning process. This next step can be any of the following:
The input to the observation prompt is formed by the descriptions of all actions from the topic, as well as the conversation history.
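Conceptually, the observation prompt plays the same role at the action level that the topic prompt plays at the topic level. The sketch below is a simplified, hypothetical illustration of how action descriptions and conversation history might be combined; it is not the engine's actual prompt.

```python
# Illustrative only: an observation-style prompt built from the selected topic's
# action descriptions and the conversation history. Not the engine's actual prompt.

order_management_actions = {
    "Lookup Order": "Retrieve an order and its line items for the current customer.",
    "Retrieve Warranty Policy": "Return the warranty policy for a specific purchased product.",
}

def build_observation_prompt(history: list[str]) -> str:
    described = "\n".join(f"- {name}: {desc}" for name, desc in order_management_actions.items())
    return (
        "Decide the next step: run one of the actions below, ask the customer "
        "for more details, or respond with a final answer.\n"
        f"Available actions:\n{described}\n"
        "Conversation so far:\n" + "\n".join(history)
    )
```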
Actions are the predefined tasks the agent can perform to do its job. Actions are the most fine-grained definitions of work. There are five different types of agent actions: (1) execute Apex code, (2) call an API, (3) execute a flow, (4) get an LLM response to a prompt template, and (5) call a predictive model. Most of these action types can be deterministic, provided that the external system, flow, or Apex action doesn't itself contain probabilistic elements like prompt invocations. The exception is getting a response to a prompt template, which always involves an LLM call. We will cover these issues in the fifth level of agentic control.
Let’s continue with the previous example in which a service agent received a question regarding the warranty policy for a watch. After selecting the Order Management topic, the most probable action is chosen. Because this is the order management topic, the first logical step is to look up the order (otherwise, what is there to retrieve warranty information for?) by launching the lookup order action.
A user utterance can trigger the execution of multiple actions before an answer is sent back to the user. This is due to the agentic loop, which continues to select and execute the next most suitable action until one of the following conditions is met:
Actions are not subject to a specific timeout. This is to avoid interruption when action execution times vary based on their complexity. Some actions are simply more complex to execute than others.
After initiating an order lookup, the reasoning engine evaluates the response generated so far, and then decides it needs to do more work before an answer can be sent back to the user. It’s about to look up the warranty policy, now that the order is present in the conversation history.
However, in doing so, the agent realizes that the customer has actually purchased two watches, as retrieved by the ‘order lookup’ action. Therefore, in the agentic loop, the reasoning engine now decides to ask the customer to specify which particular watch they need warranty information for.
Agent reliability is enhanced by a careful distribution of actions across topics, and well-described actions and topics. However, these methods don’t allow for the expression of business rules, policies, and guardrails within the reasoning engine. Instructions provide an additional important layer of agentic control. Instructions offer further guidance to the reasoning engine when using various actions together. This allows for a more nuanced and policy-driven approach to agent behavior. With instructions, agent builders can ensure that agents not only function reliably, but also adhere to established business rules and best practices.
The instructions that are written at the topic level become part of the observation prompt. Topic instructions guide the reasoning engine in choosing appropriate actions. They can provide guidance on when to select which action, and they can be used to define action dependence. In certain circumstances, they can also enforce sequential control. However, alternatives exist for this and instructions should be used carefully for that requirement. Topic instructions are added one by one and appear in separate boxes in the UI. However, they are always sent to the observation prompt together. Adding instructions in separate boxes increases readability and maintainability of the topic, but it doesn’t affect the reasoning engine.
Sometimes, instructions apply globally to the agent and are not related to an individual topic. Functionality to maintain global instructions is currently on the product roadmap. Best practices for topic instruction writing can be found in the Agentforce Guide to Topics, Instructions, and Actions. Let’s review some additional guidelines.
Avoid overly scripting the ways in which agents should have conversations with users. Overscripting can stifle an agent's ability to build rapport, understand unique user needs, and respond effectively in real-time to dynamic circumstances. Moreover, lengthy instructions can make the agent slower to respond and confuse the reasoning engine. Forcing determinism via instructions is not the preferred approach.
For example, there is no need to tell an agent to avoid referring to competitors in service answers. This can lead to undesired behaviour, as the agent can also refuse to answer questions referring to integration with a provider who is also a competitor. Instead, the instruction can be something like “When discussing competitors, respond with the company’s best interest in mind.” This avoids restrictive, conditional instructions such as “Only mention competitor xyz in the case of…”, and instead exploits the reasoning capabilities of the LLM. This example shows how instructions can be given on a higher, more abstract level, similar to the way a human service employee would be trained after having joined the company.
Let’s look at some more examples of what not to do. These are some bad instructions given to a service agent handling candidate profiles on a recruitment website. These instructions attempt to anticipate every possible customer utterance and should therefore be avoided:
Instruction 1:
If the agent receives the utterance "Can I add a picture to my profile?", then immediately ask the customer, "What is your profile type?"
Instruction 2:
If the customer indicates a premium profile then answer “Let me check your contract details” then Search for contract details and check if it was agreed that they can update the profile picture.
If it was agreed that the candidate can do it, then answer “yes, it is possible let me update for you. Can you provide your new picture?” Once the picture is received, then update the candidate’s profile accordingly. If the contract does not include profile picture changes, then say “Sorry, this is not possible. Let me route you to a human agent”
Instruction 3:
Non-Premium Profile: If the customer indicates a non-premium profile, respond with, "You cannot update your picture. If you would like to do so, please let me know and I will transfer you to a human agent."
Instruction 4:
If the profile type is unclear, respond with, "I could not understand your profile type."
Instead of this type of micromanagement, use a more flexible approach that instructs agent behavior and conduct. Consider the following best practices:
"Only candidates with a premium profile whose contract allows for picture changes can update their picture
."Based on these best practices, a better set of instructions might look like this:
Instruction 1: “Use knowledge actions to check for policies in case of requests for account changes.”
Instruction 2: “Do not respond to questions for which no applicable policy could be found.”
Applying the above guidelines can improve the agent results. Now, if the customer asks the agent for a profile change, the agent will understand that it needs to retrieve the required policy from the knowledge base, interpret the retrieved rules, apply those rules to the context, and finally respond to the customer. In contrast to overscripting, this behavioral approach is much more generic and widely applicable. Without having to write out each possible conversation, the agent can now flexibly respond with the desired behavior to a broader range of conversation topics.
Let’s continue with the example of recruitment website agents. The agent should be able to handle interview planning with the appropriate interviewer. To do so, the agent should first check for the availability of the recruiters, and then propose three possible slots to the candidate.
In this case, in order to keep the order of execution, the instructions should not be in separate boxes:
Check for interviewers availabilities.
Then propose appropriate slots to the candidate.
These instructions don’t work because the reasoning engine does not know what the “Then” statement in instruction 2 refers to. This is because the instructions are sent to the reasoning engine as a group, not in any particular order.
Instead, sequence-defining instructions should be combined into one statement and written as:
Check for interviewers' availability. Then propose appropriate slots to the candidate.
However, when only one prompt action is expected to have been executed, an instruction can be added that tells the agent never to change the output of that action. Doing so ensures more predictable and reliable agent behavior.
Enforcing this strict adherence in approved prompt templates becomes crucial in certain scenarios, particularly when consistency, compliance, and pre-defined messaging are important. These are two examples:
This instruction limits the agent's freedom to change the output of actions. Make sure that the instruction references the output of the prompt template (such as “promptResponse”), as shown in this Plan Tracer.
So the instruction in this case can be:
“Do not change promptResponse output, regardless of the channel of the agent.”
Limitations on enforcing strict adherence:
When an interaction requires multiple distinct agent actions, enforcing strict adherence to a single template isn’t feasible. In fact, in this scenario, the reasoning engine needs to consolidate these actions into a single response, and therefore change every single action output.
Based on general LLM characteristics, the target number of instructions ranges between 5 and 10, depending on the instruction complexity and instruction interaction. These instruction characteristics influence the number of instructions the reasoning engine can follow:
If an instruction is very important to follow explicitly, then add terms that reflect its importance:
Grounding answers in data significantly improves agent reliability and trustworthiness. Grounded responses are based on factual information rather than speculation or outdated knowledge. Retrieval-augmented generation (RAG) is a widely adopted technique that allows an agent to access and use a knowledge base in order to formulate more accurate and contextually relevant answers. Based on a user’s query, an agent uses RAG to retrieve relevant information from applicable data sources, and then augments the prompt with this information before submitting it to the LLM. RAG improves the quality, accuracy, and overall utility of agent interactions, which increases user confidence and satisfaction. Best practices for RAG are extensively described in a publicly available white paper called Agentforce and RAG: best practices for better agents.
Differentiating between knowledge and instructions is important when striking the right balance between guidance and flexibility, as they fulfill different purposes:
Retrieval-Augmented Generation (RAG) acts as an intelligent data layer for knowledge. It gives agents access to information across various formats and provides relevant text fragments for answering questions. With RAG, agents can get more accurate LLM responses without overwhelming the LLM prompt with extraneous content or exceeding its context window.
At run time, RAG executes three steps: it retrieves relevant information from the applicable data sources, augments the prompt with the retrieved content, and generates a response by submitting the augmented prompt to the LLM.
In Agentforce, RAG can be used with or without a prompt template:
The recommended method is option 1. It reduces the number of tasks the reasoning engine should perform, and thereby improves its answer quality. The next section explores an exception to this rule, in which the content is preserved throughout the conversation and hence is given to an action explicitly.
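As a conceptual illustration of those run-time steps, here is a minimal sketch. The `search_knowledge` and `call_llm` functions are hypothetical placeholders, not Agentforce or Data Cloud APIs.

```python
# Conceptual RAG sketch: retrieve relevant passages, augment the prompt, generate.
# `search_knowledge` and `call_llm` are hypothetical placeholders.

def search_knowledge(query: str, top_k: int = 3) -> list[str]:
    """Stand-in for a retriever over an indexed knowledge base."""
    raise NotImplementedError

def call_llm(prompt: str) -> str:
    """Stand-in for the LLM call made with the augmented prompt."""
    raise NotImplementedError

def answer_with_rag(question: str) -> str:
    passages = search_knowledge(question)          # 1. retrieve relevant information
    prompt = (                                     # 2. augment the prompt with it
        "Answer the question using only the context below.\n"
        "Context:\n" + "\n\n".join(passages) + "\n\n"
        f"Question: {question}"
    )
    return call_llm(prompt)                        # 3. generate the grounded answer
```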
Storing RAG output in a variable: When a conversation might exceed the interaction limit, store the RAG output in a variable. This keeps the information accessible for guiding agent interactions beyond the standard threshold. An example of this will be provided in the next section.
Certain business processes demand even more predictable execution, such as enforcing a specific action sequence or conditions for triggering actions or topics.
To achieve this deterministic behavior, variables can be used. Variables function as a structured form of short-term agent memory that can serve as action inputs or outputs. Furthermore, the state of a variable can govern the triggering of specific topics and actions.
Variable types support the following capabilities:
| Capability | Context Variables | Custom Variables |
|---|---|---|
| Can be instantiated by user | X | ✓ |
| Can be input of actions | ✓ | ✓ |
| Can be output of actions | X | ✓ |
| Can be updated by actions | X | ✓ |
| Can be used in filters of actions and topics | ✓ | ✓ |
Let’s dig into variables further with an example use case: a customer-facing troubleshooting agent. In this example, variables are used for all three purposes: persisted dynamic grounding, action inputs/outputs, and filtering.
In this example, the agent helps a customer troubleshoot a technical device issue. Troubleshooting typically involves going through a number of steps. The agent should offer a service experience that mimics the work of a human service agent. To do so, the agent shouldn’t provide all the troubleshooting steps at once to the customer. Instead, it should offer step-by-step instructions, along with the ability to navigate between steps (including going back to previously covered steps) based on how the customer responds.
One challenge with this is the agent’s capacity to retain all troubleshooting steps throughout the conversation. Given the agent’s limited memory due to the restricted number of interactions it can store, these steps can be dropped from the context for the reasoning engine if the conversation becomes lengthy.
The way to address this challenge is to use a variable to ground the reasoning engine dynamically across troubleshooting steps. By retrieving the information and storing it in a variable, it remains available, and can be updated, throughout the conversation. The reasoning engine uses the information stored in this variable for dynamic grounding.
In this example, a topic includes two actions. These two actions are needed to maintain a consistent data flow. The first action is used to populate the variable that contains all of the troubleshooting steps. The second action uses that variable during the troubleshooting itself.
The original customer question is input to both actions. The second action has another input: the contents of the variable ‘Resolution Steps’. This variable was set by the first action. Note that the second action won’t retrieve the troubleshooting steps itself, but will instead get them as input from the first action via the variable. The following diagram depicts the data flow between those two actions.
The ‘Use in the Middle of Solving an Issue’ action will always refer to the original troubleshooting steps retrieved by the Issue Resolution Steps action. This data flow ensures that troubleshooting steps are maintained coherently and always present, independent of the conversation length.
To execute the actions defined in this example, specific instructions are needed, such as “Always execute the ‘Populate resolution steps’ first.” However, given the non-deterministic nature of LLMs used by agents, this can lead to a different order in certain cases. To ensure a deterministic order of execution, we introduce conditional filters on these variables to enforce the proper action sequence. The agent reads the value of the variable ‘Resolution Steps’ and defines two filters based on whether this variable has a value or not.
These conditional filters now deterministically enforce the sequence of action execution: ‘Use in the Middle of Solving an Issue’ must wait until ‘Issue Resolution Steps’ completes its work, thereby guaranteeing that the ‘Resolution Steps’ variable always has a value.
To ensure correct action execution, a third action is needed to reset the ‘Resolution Steps’ variable if the issue is fully solved. As a result, the agent is reset into the required state to help with a possible new, different issue. This third action is called ‘Empty the Resolution Variable’. The full action diagram is depicted below.
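The following simplified sketch mirrors that data flow: the ‘Resolution Steps’ variable gates which actions are eligible, enforcing the retrieve-then-use-then-reset sequence. It is a conceptual illustration rather than Agentforce configuration, and the hard-coded steps are stand-ins for the retrieval performed by the first action.

```python
# Simplified illustration of variable-gated sequencing, mirroring the example:
# the 'Resolution Steps' variable must be populated before the follow-up action
# becomes eligible, and is cleared once the issue is fully solved.

session = {"resolution_steps": None}   # stand-in for the agent session variable

def issue_resolution_steps(question: str) -> None:
    # First action: retrieve all troubleshooting steps and store them in the
    # variable (hard-coded here as a stand-in for the real retrieval).
    session["resolution_steps"] = ["Restart the device", "Check the cable", "Reset to factory settings"]

def use_in_the_middle_of_solving(question: str) -> str:
    # Second action: guides the customer using the steps already stored.
    return "Next step: " + session["resolution_steps"][0]

def empty_the_resolution_variable() -> None:
    # Third action: reset the variable so the agent can handle a new issue.
    session["resolution_steps"] = None

def eligible_actions() -> list[str]:
    # Conditional filters: which actions the reasoning engine may choose from.
    if session["resolution_steps"] is None:
        return ["Issue Resolution Steps"]
    return ["Use in the Middle of Solving an Issue", "Empty the Resolution Variable"]
```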
Variables are crucial to enabling our troubleshooting agent to solve customer problems by allowing for:
In the era of generative AI, predictive AI remains critically important: it forms the foundational intelligence that guides, enhances, and contextualizes generative capabilities. While generative AI focuses on creating new content, such as text, images, or videos, predictive models make predictions about the future based on inputs from real-time business data. Example business outcomes include customer likelihood to churn, conversion likelihoods, case escalation probability, customer lifetime value, and case classification. Predictions can help anticipate user needs, personalize outputs, enact decisions, and optimize content relevance in real time, all by analyzing trends and numbers. For example, in applications like personalized learning, healthcare, or financial planning, predictive AI ensures generative outputs align with individual contexts and likely future scenarios. Together, predictive and generative AI create a powerful synergy, merging foresight with creativity to drive more intelligent, adaptive, and impactful technology solutions.
To incorporate predictive model outputs into agent workflows, simply add predictive model actions to the Agentforce assets. Model Builder provides the means to build or register (BYO) predictive models, and these models are then used by the agent to make predictions. The resulting predictions (as well as predictors) can be stored in custom variables. Agents can use these variable values as inputs to, and conditionalize the execution of, specific actions and topics.
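As a hypothetical illustration, a churn-risk score returned by a predictive model action could be stored in a custom variable and used to conditionalize the next action. The score, threshold, and action names below are invented for this example.

```python
# Hypothetical illustration: a predictive-model score stored in a custom variable
# conditions which action the agent may take next. The score, threshold, and
# action names are invented for this example.

def predict_churn_risk(customer_id: str) -> float:
    """Stand-in for a predictive model action built or registered via Model Builder."""
    raise NotImplementedError

def next_action(customer_id: str) -> str:
    churn_risk = predict_churn_risk(customer_id)   # prediction stored as a variable
    if churn_risk > 0.7:                           # invented threshold, for illustration
        return "Offer Retention Options"
    return "Standard Follow-Up"
```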
Certain business processes need to be executed in a precise order and do not require user input during the execution. In this case, a predetermined flow of steps can be enforced via flows, APIs, or Apex. If you have an existing flow that’s relied on in production, it’s a good indication that it can be kept and used by the agent for the execution of that business process. All of the following examples include predetermined sequences of steps that the agent can execute without user input. The agentic behavior in this case consists of identifying which deterministic process to execute, how to gather the necessary inputs, and how to interpret and process the outputs.
Business processes with many sequential steps (more than three, as a rule of thumb) and many dependencies on variables become too complex and cumbersome to enforce with instructions. In this case, it is possible to simply hardcode them using the deterministic action types listed in this section. Finally, note that these implementations can include non-deterministic elements, such as calling LLMs with resolved prompt templates. Therefore, they are not necessarily completely deterministic, end-to-end, and they can still demonstrate the desired levels of fluidity that agents are known for.
The sequence of steps in a marketing journey is conditioned by fixed rules and does not depend on any conversational user input. Therefore, the flow can be used as an Agentforce action. An invocable action can be created to complete background or event-triggered tasks from a solution component that can call a flow or Apex class. Add an invocable action to a flow or Apex class and specify the task that the agent completes, as well as the conditions that trigger the agent. Invocable actions can also carry the context variables of the agent and pass along important information.
Salesforce flows can be used to automate routine tasks, like creating follow-up tasks, sending reminder emails, or updating records. Flows make work more efficient and productive. Agents can also execute flows using flow actions. Because of their determinism, flows are a great way to direct agentic behavior when a business process needs to be executed in a particular sequence. A good indication that a flow action is preferred is when the topic would otherwise contain instructions such as “First do this, then do this, and finally do this”. Enforcing sequences of more than three steps becomes cumbersome to manage via instructions and variables.
Flows can also include non-deterministic elements by calling prompts. A prompt node in flow invokes a prompt template and collects the response that can be passed on to other elements in the flow. These further elements can again be prompt nodes, for example summarizing the previous response, thereby creating a chain of prompts. This is particularly useful when the rules for prompt chaining are defined by fixed elements and don’t depend on user input. One example is agentic RAG in which a predefined sequence of retrievers or prompts in a flow can access specific data sources in a particular order, such as initially retrieving data from a user's country document before consulting other sources as needed. This chaining mechanism enforces a reliable and ordered extraction of relevant information.
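The ordered retrieval described above can be sketched as follows. In Agentforce this ordering would be enforced inside a flow; the document sources and fallback rule here are hypothetical placeholders.

```python
# Hypothetical sketch of a fixed retrieval chain: consult the user's country
# document first, and fall back to other sources only if needed. In Agentforce
# this ordering would live inside a flow; the functions here are placeholders.

def search_country_docs(query: str, country: str) -> list[str]:
    raise NotImplementedError

def search_global_docs(query: str) -> list[str]:
    raise NotImplementedError

def ordered_retrieval(query: str, country: str) -> list[str]:
    passages = search_country_docs(query, country)   # step 1: country-specific source
    if not passages:                                 # step 2: deterministic fallback
        passages = search_global_docs(query)
    return passages
```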
Similar to flows, Apex and API actions are deterministic in that a predefined sequence of steps can be coded. These actions can include non-deterministic elements, such as invoking prompt templates or LLM calls, but by definition they execute their internal steps deterministically, which reduces agentic variability. What remains agentic, and therefore non-deterministic, is calling the action at the right time, collecting the necessary input, and processing the output; these responsibilities still need to be governed by instructions. Apex and API actions are the pro-code equivalent of flow actions.
Achieving reliable agent behavior requires a structured approach that balances the inherent flexibility of large language models (LLMs) with the need for enterprise-level control and predictability. This article outlined a layered strategy for implementing "guided determinism," enabling the creation of agents that are not only intelligent and autonomous, but also consistently accurate and aligned with business processes. The key to building these trusted agents lies in a progressive implementation of control mechanisms, each adding a new layer of reliability:
By systematically applying these layers of control — from thoughtful design and clear instruction to data grounding, state management, and deterministic process automation — developers can successfully navigate the challenges of building reliable agents with consistent business outcomes. This strategic approach ensures that Agentforce agents can be trusted to perform critical business functions with the accuracy and consistency required in the enterprise landscape.
The five levels of determinism in AI are: instruction-free topic and action selection, agent instructions, data grounding, agent variables, and deterministic actions using flows, Apex, and APIs.
Understanding AI determinism is crucial for building reliable agents that can perform critical business functions accurately and consistently, striking a balance between creative fluidity and enterprise control.
In AI, "deterministic" refers to the ability of a system to produce the same output given the same input and conditions, imposing a rigidity and discipline essential for reliable agent behavior.
Non-determinism in AI systems arises primarily due to the use of Large Language Models (LLMs), which are non-deterministic by nature, allowing agents to be flexible and adaptive.
The levels of determinism progressively enhance the determinism of AI agents, thereby affecting their autonomy: as the levels progress, agents become less autonomous but more reliable and aligned with business processes.
Less deterministic AI systems present challenges in terms of reliability and compliance with business requirements, as their inherent non-determinism can lead to unpredictable behavior.
Businesses manage AI systems with varying levels of determinism by applying a layered approach that includes thoughtful design, clear instructions, data grounding, state management through variables, and deterministic process automation using flows, Apex, and APIs.