Olfa Kharrat, Director of Product Management - Agentforce
Reinier van Leuken, Senior Director of Product Management - Agentforce
Table of Contents
Introduction
Agents Versus Chatbots
Agentforce Building Blocks and Agentic Reasoning
Defining Levels of Agentic Control
Level 1 Agentic Control: Reasoning with Instruction-Free Topic & Prompt Action Selection
Level 2 Agentic Control: Instructions
Level 3 Agentic Control: Grounding
Level 4 Agentic Control: Variables
Level 5 Agentic Control: Deterministic Actions
Level 6 Agentic Control: Deterministic Control with Agent Script
Introduction
Trust has been the #1 value at Salesforce ever since the company was founded in 1999, pioneering a new technology model of cloud computing and SaaS. Businesses place their trust in Salesforce by storing valuable company data in the cloud, knowing this data is safe and governed by the appropriate access controls. That is still critical, but in the age of agentic AI, the definition of trust is even broader. As companies rely increasingly on autonomous agents to perform critical business functions, agents must become trusted business partners whose work is accurate, relevant, and, most of all, reliable.
So how does one build a reliable agent? Reliability typically means providing the same result for the same input. However, agents don’t necessarily work like that, because they are powered by large language models (LLMs), which are non-deterministic by nature. That gives agents the fluidity to develop creative solutions tailored to specific circumstances, without needing to explicitly program each condition or situation they encounter. However, agents also need governance to comply with business requirements and adhere to operational guidelines. When executing business processes, they must demonstrate reliability and produce business outcomes that conform to deterministic constraints. Determinism imposes a rigidity and discipline that clashes with the autonomy and fluidity that agents provide. Therefore, the key to successful agent creation is to strike the right balance between creative fluidity and enterprise control.
This document addresses key considerations for developing reliable agents. It defines six levels of agentic control and provides best practices for gaining and keeping control over agentic behavior at each of these levels. The guidance reflects how the Agentforce reasoning engine works. As Agentforce grows, this document will be updated to reflect the latest best practices.
This document assumes basic familiarity with designing and building Agentforce agents. For an introduction to Agentforce, we recommend the following:
- Review the resources on agentforce.com.
- Follow the trail Become an Agentblazer Champion. This trail explores core AI concepts and helps you build a basic agent for key tasks.
- Read more about Agentforce on help.salesforce.com, specifically Design and Implement Agents. This learning map guides you through all the key lifecycle steps, from ideating your solution to setting up and configuring your agent, testing, deploying, and monitoring.
Agents Versus Chatbots
To understand agentic behavior better, let’s first compare agents with their rigid counterparts: chatbots.
Chatbots: Rigid Rule-Followers
Chatbots follow pre-determined decision trees that structure the dialogs they can participate in. Traversal through these decision trees is based on the answers given by the user. An answer may be a selection from a predetermined set of options, or it may be free text. In the case of a free-text answer, a predictive model is used for intent classification. These trees map out all potential conversational pathways and dictate the chatbot's responses at each step. The chatbot's behavior is rigidly determined by pre-set rules. If a user's input doesn't match a recognized path, or if the predictive model wasn't trained to recognize a certain intent, the chatbot fails to respond appropriately.
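To make the rigidity concrete, here is a minimal Python sketch of decision-tree traversal. The node names, prompts, and matching logic are all hypothetical, and real chatbot platforms layer a predictive intent model on top, but the failure mode is the same: input that matches no branch goes nowhere.

```python
# Minimal sketch of a decision-tree chatbot (illustrative, not any real framework).
# Each node maps a recognized user choice to the next node; unrecognized
# input falls through to a generic failure response.

TREE = {
    "root": {"prompt": "Billing or shipping?",
             "options": {"billing": "billing", "shipping": "shipping"}},
    "billing": {"prompt": "Your invoice is available in your account.", "options": {}},
    "shipping": {"prompt": "Your order ships in 2 days.", "options": {}},
}

def chatbot_reply(node: str, user_input: str) -> tuple[str, str]:
    """Return (next_node, reply). Fails rigidly on unexpected input."""
    options = TREE[node]["options"]
    key = user_input.strip().lower()
    if key in options:
        nxt = options[key]
        return nxt, TREE[nxt]["prompt"]
    # No recognized path: stay put and ask again.
    return node, "Sorry, I didn't understand. Please choose: " + ", ".join(options)
```

A phrasing like "refund me please" never advances the tree, which is exactly the brittleness that agents avoid.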
Agents: Adaptive and Intuitive
In contrast, agents leverage the power of LLMs and their advanced capabilities in natural language processing (NLP). LLMs enable agents to comprehend the intent behind a user's input, even if it's phrased in an unexpected way. Based on its understanding of intent, the agent can select the most appropriate action from among a range of possibilities. An agent can even formulate entirely new responses. This flexibility and adaptability set agents apart from their chatbot counterparts.
A Culinary Analogy
The difference between chatbots and agents can be likened to the contrast between a novice cook and a seasoned chef.
- The novice cook (chatbot) relies heavily on a detailed recipe, complete with precise measurements, step-by-step instructions, and specific cooking times. Any deviation from the recipe results in a culinary disaster. Similarly, a chatbot must function within the confines of its pre-programmed decision tree.
- The seasoned chef (agent) possesses years of culinary experience and intuition. Armed with a general understanding of your preferences and a brief description of available ingredients, they can whip up a delicious meal that caters to your needs. The exact steps might vary each time, and each version of the dish might have subtle differences, but the overall outcome is consistently satisfying. Likewise, an agent can adapt its approach based on the context and intent of the user's input, resulting in a successful interaction.
In summary, the fundamental difference between agents and chatbots lies in their adaptability and ability to handle unexpected input.
Agentforce Building Blocks and Agentic Reasoning
A distinct feature of an agent's intelligence lies in its ability to orchestrate and trigger the most suitable actions at the right moment. This flexibility eliminates the need to extensively program every potential user interaction.
Building Blocks
In Agentforce, building an agent involves topics, actions, and natural language instructions and descriptions.
Topics
Topics are the "jobs-to-be-done" for the agent. Topics have attributes like a classification description, scope, and instructions that define each job and how it’s done. Topics contain actions that the agent can execute, along with instructions that govern how these actions are executed.
Actions
Actions are the predefined tasks the agent can perform to do its job. There are five different types of actions:
- execute Apex code
- call an API
- execute a flow
- get an LLM response to a prompt template
- call a predictive model
Natural language instructions and descriptions
An agent’s definition contains natural language instructions that describe the agent’s assets and define the guidelines within which it must operate. Instructions are written for actions and topics.
- Actions. An action contains:
- Instructions that describe what the action does, which tell the reasoning engine when to execute this action.
- Inputs with a natural language description so that the agent can prepare them.
- Outputs with a natural language description of how to format and use them.
- Topics. A topic contains instructions that govern how its actions are executed at a higher level. For example, instructions can specify guardrails about tone of voice, desired sequence of actions, possible prerequisites, or when to escalate conversations to humans. The topic also contains a classification description and a scope delineation. Altogether, this ensures that the agent stays within the scope of its defined role and performs the job.
- Data. Agents need data to succeed at their jobs. Data can be structured, such as CRM data, or unstructured, such as company knowledge articles. Agents access data using action inputs. For example, an action can call a prompt template that is grounded in CRM data, or augmented with chunks of unstructured data using RAG techniques.
These various building blocks, when built correctly, help an agent carry out its intended purpose while operating within the appropriate boundaries.
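For illustration only (this is not the Agentforce metadata model), the building blocks above can be pictured as a small data model in Python, where every description is natural language aimed at the reasoning engine rather than code:

```python
from dataclasses import dataclass, field

@dataclass
class Action:
    """A predefined task; all descriptions are read by the reasoning engine."""
    name: str
    description: str                     # when should the engine run this action?
    input_descriptions: dict = field(default_factory=dict)   # how to prepare inputs
    output_descriptions: dict = field(default_factory=dict)  # how to format/use outputs

@dataclass
class Topic:
    """A job-to-be-done, grouping related actions."""
    name: str
    classification_description: str      # used for topic selection only
    scope: str
    instructions: list = field(default_factory=list)  # guardrails, sequencing, escalation
    actions: list = field(default_factory=list)
```

The classification description drives topic selection, while the per-action descriptions drive action selection within the chosen topic.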
Reasoning Engine
The Agentforce reasoning engine orchestrates these building blocks into the correct agentic behavior. It leverages the natural language instructions and descriptions defined in topics and actions. It’s built on ReAct, a reasoning paradigm for LLMs introduced in 2022 by Yao et al. This paradigm mimics human task management by reasoning about a problem, taking action, observing the results of the action, and repeating the cycle until task completion.
Salesforce agents adhere to this paradigm:
- Reason: Comprehend user intent and align it with the correct topic and action(s).
- Act: Launch the correct chain of actions.
- Observe: Evaluate the action results against user intent. If the intent is not fulfilled, reason further based on the result obtained so far, and on the topic/action instructions and descriptions. If the intent is fulfilled, provide the final response while adhering to possible formatting instructions.
- Repeat: Iterate these steps until the final instructed step is reached.
The reasoning engine uses LLMs at every reason and observe step. Depending on the action type, it can also use LLMs in the Act step.
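The cycle above can be sketched in a few lines of Python. Here `reason` and `act` are invented stand-ins for the LLM-backed reason/observe steps and for action execution, and the iteration cap mirrors the engine's internal limit on LLM calls; none of this is Agentforce internals.

```python
def react_loop(user_message, reason, act, max_llm_calls=5):
    """Minimal ReAct cycle: reason -> act -> observe, until done.

    `reason` maps the conversation so far to either ("respond", text)
    or ("action", action_name); `act` executes an action and returns
    its observation. Both stand in for LLM-backed steps.
    """
    history = [("user", user_message)]
    for _ in range(max_llm_calls):            # hard cap on reasoning iterations
        decision, payload = reason(history)   # Reason: pick the next step
        if decision == "respond":             # intent fulfilled -> final answer
            return payload
        observation = act(payload)            # Act: run the chosen action
        history.append(("observation", observation))  # Observe: feed result back
    return "Escalating: step limit reached."
```

The key property is that action results flow back into the next reasoning step, so the agent can chain actions without a pre-programmed path.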
Defining Levels of Agentic Control
This section outlines a layered approach to enhancing the determinism of agents. Each level builds upon the previous one, with increasing complexity and capabilities that establish more control over the agent’s behavior.
1. Reasoning with Instruction-Free Topic & Prompt Action Selection
The first level focuses on enabling agents to autonomously identify relevant topics, and then select appropriate actions using goals rather than explicit instructions. The core mechanism is contextual understanding of user inputs. Although technically any action type can be added, at this level we assume the actions are prompt actions. Instruction-free topics with prompt actions provide a quick and efficient way to handle common queries.
At this level, the emphasis is on establishing a baseline level of agent responsiveness and autonomy through dynamic understanding.
2. Agent Instructions
Building upon the foundation of instruction-free action selection, this level introduces explicit instructions to guide agent behavior. Adding precise instructions increases control over how agents respond to different situations. Instructions to agents can be expressed as rules, guidelines, guardrails, and examples. These provide the agent with specific direction on how to handle various topics, execute actions, and process their outputs. The goal for this level is to provide clear guidance to the agent in order to increase consistency and improve adherence to company guidelines and processes.
3. Data Grounding
Grounding involves connecting the agent's understanding and responses to external knowledge sources. Grounding helps ensure that the information provided by the agent is more accurate, up-to-date, and relevant. This level integrates access to databases, knowledge bases, and other information repositories. Grounding the agent's responses in verified data enhances its reliability and trustworthiness.
4. Agent Variables
This level adds the capability for agents to work with variables. Variables allow agents to personalize interactions, retain context across multiple interactions, and dynamically adjust their behavior based on specific data points maintained during the agent session. For example, an agent could capture user preferences, order details, and other relevant information, and then use that data to tailor the interaction. With variables, agents are better able to handle more complex, more prescribed, and more personalized interactions.
5. Apex, API, and Flow Actions
This step integrates the agent with Salesforce's core functionalities: Apex, APIs, and flow. Integration allows the agent to perform complex actions within the Salesforce ecosystem, such as accessing and manipulating data, triggering workflows, and interacting with other systems.
- Apex provides programmatic control.
- APIs enable seamless integration with other applications.
- Flows allow for the automation of complex business processes.
This level transforms the agent into a powerful tool capable of executing sophisticated tasks and contributing directly to business outcomes.
6. Agent Script
Building on the technical integrations of level 5, this final level introduces deterministic reasoning to bridge the gap between probabilistic AI and rigid business logic. While previous levels rely on the LLM to decide which tool to use, Agent Script allows you to "hard-code" the reasoning process itself. By using a document-style canvas or direct code, you can define immutable paths—such as mandatory authentication gates, if/else conditional branching, and forced topic transitions—that the agent must follow regardless of user input. This hybrid reasoning approach allows you to sandwich the conversational flexibility of an LLM between layers of guaranteed execution. Level 6 transforms the agent into a zero-trust, enterprise-grade partner capable of handling high-stakes compliance, regulatory disclosures, and complex multi-step dependencies with absolute precision.
Level 1 Agentic Control: Reasoning with Instruction-Free Topic & Prompt Action Selection
Beginning with a baseline of agent responsiveness and autonomy, consider an agent that consists only of topics and actions, with their corresponding descriptions. We can use this example agent to introduce the different steps of the reasoning engine and to show how it leverages these descriptions to select the right topics and then actions to execute. By omitting topic instructions from this example, we can observe that agents in this first level have the largest degree of freedom when compared with agents at higher levels. In level one, the agent is completely free to select the action it thinks is appropriate, based solely on the ongoing conversation.
| Activity | Steps | Description |
|---|---|---|
| Agent Invocation | 1 | Agent is invoked. |
| Classify Topic | 2-3 | The engine analyzes the customer's message and matches it to the most appropriate topic based on the topic name and classification description. Agent Script transforms the Topic Selector into a fully configurable element, eliminating the "black box" of probabilistic LLM routing. By treating navigation as a programmable topic, you gain absolute transparency and control, allowing you to align the agent’s decision-making logic precisely with your specific business requirements and architectural standards. |
| Execute Topic’s Agent Script and Build Instructions / Resolve Instructions & Available Actions | 4-5 | Execute scripted actions as dictated by instructions. These are actions that should be executed once a topic is chosen, before the system proceeds to evaluate the non-deterministic instructions or the rest of the conversational context. |
| Prompt and Conversation History Sent to LLM | 6 | Once all scripted actions are executed, a prompt with the topic scope, instructions, and available actions, along with the conversation history, is sent to the LLM. Note: Instructions are covered in Level 2 Agentic Control. |
| LLM Decides to Respond or Run an Action | 7 | Using all this information, the engine determines whether to: • Run an action to retrieve or update information • Ask the customer for more details • Respond directly with an answer. If the LLM decides to respond, step 12 is executed. |
| Action Execution | 8-9 | If an action is needed, the engine runs it and collects the results. |
| Run After-Action Logic | 10 | Only applicable with Agent Script: With Agent Script, actions can have deterministic transitions to other actions or topics. Those will always be executed after the action is executed. |
| Action Output Returned + Action Loop | 11 | The engine evaluates the new information and decides again what to do next — whether to run another action, ask for more information, or respond. |
| Grounding Check - LLM Responds to Client | 12 | Before sending a final response, the engine checks that the response: • Is based on accurate information from actions or instructions • Follows the guidelines provided in the topic's instructions • Stays within the boundaries set by the topic's scope Note: It's possible with Agent Script to add a step to format the final answer. The grounded response is sent to the customer. |
Review the following considerations for the Reasoning Engine:
- The configuration settings for the reasoning engine's LLM are fixed; agent builders cannot change them. Currently, the agent builder can choose between an OpenAI LLM or an Anthropic LLM hosted on Salesforce infrastructure for reasoning. This is subject to change as more models are added.
- Reasoning engine default history: Whenever a request is made to the reasoning engine (steps 2-5), it automatically retrieves the history of the most recent requests and responses. This ensures that conversation context is maintained for the reasoning engine. In addition to customer interactions, this history includes previous calls to the reasoning engine LLM for other requests, such as topic selection.
Reasoning Steps
The reasoning process involves four main steps:
- Topic Selection
- Action Selection
- Agentic Loop
- Grounding Check
Reasoning Step 1: Topic Selection
Topics are designed to improve the accuracy with which agents classify the right action or sequence of actions. Each topic should consist of semantically distinct actions that all fit under a concise topic description, and thus belong to a similar agent function.
The right topic is selected by the reasoning engine LLM (steps 2-3 of the diagram). It selects the topic whose classification description most closely matches the last utterance, using a topic prompt. This topic prompt contains the classification descriptions of all topics and the conversation history. In addition to utterances and agent replies, the conversation history includes executed actions and their outcomes. Furthermore, the prompt incorporates crucial instructions that mandate analysis within the context of the conversation history, and require the LLM to share its thought process.
Additional considerations:
- An additional hidden topic for "Off Topic" utterances automatically exists alongside the visible topics. This topic is selected when no other existing topic aligns with the utterance. This helps the agent to avoid misclassification. This topic has no actions. It exists only to help the reasoning engine formulate an appropriate response later on.
- Only the name and the classification description of the topic are used in the topic prompt.
- The reasoning engine LLM can choose only one topic at a time.
- Conversations can pivot unexpectedly. Whenever an utterance is received, the reasoning engine proceeds to the topic selection step, which means that a new topic can be selected at each new utterance.
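Putting these points together, a topic prompt can be pictured as simple assembly. The actual Agentforce prompt format is internal, so the layout and wording below are invented; what matters is what goes in: topic names, classification descriptions, the hidden Off Topic fallback, and the conversation history.

```python
def build_topic_prompt(topics: dict, conversation_history: list) -> str:
    """Assemble an illustrative topic-selection prompt.

    Only topic names and classification descriptions are included
    (never scope or instructions), plus the hidden Off Topic fallback.
    """
    lines = [
        "Select the single best topic for the latest message.",
        "Analyze it in the context of the conversation and explain your reasoning.",
        "",
        "Topics:",
    ]
    for name, description in topics.items():
        lines.append(f"- {name}: {description}")
    lines.append("- Off Topic: none of the above topics apply")  # hidden fallback
    lines += ["", "Conversation:"] + conversation_history
    return "\n".join(lines)
```

Because only the name and classification description appear here, editing a topic's scope or instructions cannot change which topic gets selected.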
Best Practices for Topic Design
The purpose of topics is twofold:
- Reduce the risk of confusing the reasoning engine by grouping actions, thereby preventing it from selecting the wrong actions.
- Guide action selection and execution with instructions (more about that in Level 2 Agentic Control: Instructions).
By carefully organizing agent capabilities into clearly defined topics made up of related actions, the agent operates more effectively and predictably, and it’s easier to update and expand. There are two possible approaches to topic design: top-down and bottom-up.
- In the top-down approach, topics are designed first as high-level jobs-to-be-done by the agent, and the individual actions are then defined for those topics.
- In the bottom-up approach, first all the individual actions are defined that are then grouped together in topics.
Both approaches lead to good results when followed properly.
Bottom-Up Approach
This section walks through the bottom-up approach, as it aligns closely with the reasons why the reasoning engine needs topics to begin with.
Step 1: List all agent actions
Begin by listing all the specific actions the agent should be capable of performing. At this stage, it's better to be very specific rather than too general. Avoid trying to group or simplify actions prematurely. The goal is to create a comprehensive and granular view of what the agent can do.
For example, in the case of a customer service agent, the initial list might include:
- Return an order: used to initiate the return process for an order.
- Check inventory availability: used to verify if a product is in stock.
- Check product exchange policies: used to retrieve information about exchange rules.
- Answer questions with knowledge: used to answer general or FAQ-style questions.
- Check promotions: used to check for any available promotions or discounts.
- Predict delivery date: used to estimate the expected delivery date/time.
- Check order status: used to find the current status of a customer’s order.
- Find customer orders: used to retrieve all past or active orders by a specific customer.
- Troubleshoot tech problems: used to resolve technical issues with a product or service.
- Find customer order terms: used to retrieve all terms related to an order.
- Find customer assets: used to identify or retrieve assets associated with a customer.
- Change delivery address.
Note that an action like “Resolve customer complaints” is too broad at this point. Actions should represent the smallest level of granularity in agent behavior. Complaints can be of many types, and different actions already cover them:
- Post-delivery problems, such as troubleshooting damaged or malfunctioning products, are already covered by “Troubleshoot tech problems.”
- Pre-delivery problems, such as missing deliveries, the need to change delivery dates, or order modification, are already covered by actions like “Check order status,” “Predict delivery date,” or “Return an order.”
- General customer concerns, such as policy inquiries, are already covered by “Answer questions with knowledge” or “Check product exchange policies.”
Step 2: Mark action pairs (or multiples) that possibly cause reasoning confusion
Mark actions that are similar in nature because they may cause confusion for the reasoning engine. Their descriptions won’t be sufficiently different semantically, and therefore the reasoning engine won’t know which action to select in step 5.
For example, “Troubleshoot tech problems” and “Answer questions with knowledge” have similar descriptions, but their functionality may differ significantly. Marking such semantic overlaps helps to identify which actions to separate across multiple topics.
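One way to make this step less subjective is to score description similarity automatically. The sketch below uses a bag-of-words cosine as a crude proxy; in practice an embedding model gives far better estimates of semantic distance, and the 0.5 threshold is an arbitrary illustration.

```python
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def flag_overlaps(descriptions: dict, threshold: float = 0.5) -> list:
    """Flag action pairs whose descriptions look too similar.

    Bag-of-words cosine is a crude proxy for semantic distance;
    an embedding model is the realistic choice.
    """
    vecs = {n: Counter(d.lower().split()) for n, d in descriptions.items()}
    names = sorted(vecs)
    return [(x, y) for i, x in enumerate(names) for y in names[i + 1:]
            if cosine(vecs[x], vecs[y]) >= threshold]
```

Flagged pairs are candidates for rewriting their descriptions or for splitting across topics.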
Step 3: Create initial groupings of actions into topics
Once actions are clearly defined and their semantic overlaps have been identified, actions can be grouped into preliminary topics. A topic is a logical category of functionality—a grouping of actions that together represent a coherent capability or skill of the agent.
When creating these groupings:
- Avoid semantic overlap by using the minimum number of topics needed.
- Ensure that each topic contains actions that are meaningfully related.
- Make sure that supporting actions that must be executed in a chain are present in the same topic.
Here’s an example of an initial grouping for a customer service agent:
Topic 1:
- Return an order
- Check inventory availability
- Check product exchange policies
- Check promotions
- Predict delivery date
- Find customer order terms
- Check order status
- Find customer orders
- Find customer assets
- Answer questions with knowledge
- Change delivery address
Topic 2:
- Troubleshoot tech problems
- Find customer assets
Step 4: Write classification descriptions for topics and break up topics if needed
Once you have the initial grouping, write classification descriptions for each topic.
- Topic 2 in our example is clearly about technical problems with products.
- Topic 1, however, covers a broader scope. It’s largely about order management, but it contains some actions that are unrelated to order management as well, such as checking for promotions and answering questions with knowledge. Topic 1 can't clearly and concisely be described in a single sentence, and therefore, the topic should be broken up into different topics.
After refining, we get:
- Topic 1: Order Management: Describes actions related to managing and modifying customer orders and related logistics, excluding anything related to exchanges and returns.
- Check inventory availability: Determine whether a product is in stock.
- Predict delivery date: Estimate when an order will arrive.
- Check order status: Find the status of a customer’s order.
- Find customer orders: Retrieve all orders placed by a customer.
- Change delivery address.
- Topic 2: Troubleshooting
- Find customer assets: Retrieve the customer’s registered devices or products.
- Troubleshoot tech problems: Provide technical assistance or diagnostics.
- Topic 3: Exchange: Describes actions related to order exchange and return.
- Return an order: Launch order return process.
- Check product exchange policies: Provide exchange rules for products.
- Find customer orders: Retrieve all orders placed by a customer.
- Find customer order terms: Retrieve all terms related to an order.
- Topic 4: Product Support: Describes cross-cutting actions used for information retrieval and routing.
- Answer questions with knowledge: Respond to general customer inquiries using knowledge base information.
- Check promotions: View current promotions or discounts.
To recap, first create a comprehensive list of all possible actions, then mark semantic overlap between these actions. Next, create a set of topics that, at a minimum, solves for all semantic overlap (so that the reasoning engine won’t get confused within the confines of one topic). Then write every topic’s classification description. If the topics are too broad in scope, break them up into more granular topics. By implementing this guidance, you can build an agent that not only performs well, but is also easy to maintain and extend.
This structure supports better reasoning, more accurate execution, and clearer decision boundaries within the agent’s behavior. It also relies on collaboration among designers, engineers, and subject-matter experts to make the agent’s capabilities more transparent and modular.
Further Considerations for Effective Topics Creation
- Optimal Number of Topics: To improve the performance of LLMs in classifying the right topic, it's generally advisable not to exceed 10 topics. But this is only a rule of thumb. Additionally, each topic should have a clear and distinct description. Ultimately, the optimal number of topics depends on the semantic distance between topic classification descriptions. If the topics have classification descriptions that differ substantially, the risk of topic overlap is minimized. More best practices to avoid overlap between topics are described here.
- Balance Topic Size and Clarity: While it's generally beneficial to have small topics in terms of actions and instructions for easier classification, creating too many topics with very similar descriptions can lead to confusion, just like semantic overlap between actions leads to misclassification. It is therefore important to have clearly semantically differentiated topic descriptions. You can use LLMs to help you create those well differentiated classifications.
- Standard Language and Contextual Clarity: LLMs may not be aware of specific company or business meanings and abbreviations. Use standard language or be very explicit in explaining the meanings of words within your particular context.
- Only the topic name and description are sent in the topic prompt. Therefore, changing the scope or instructions of the topic won’t have any impact on the topic selection.
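These rules of thumb can be checked mechanically. The helper below is purely illustrative (not an Agentforce API); it lints a proposed topic layout against the ~10-topic and ~10-actions-per-topic guidelines discussed in this document.

```python
def check_topic_design(topics: dict, max_topics: int = 10, max_actions: int = 10) -> list:
    """Lint a proposed topic layout against the rules of thumb.

    `topics` maps topic name -> list of action names.
    Returns a list of warning strings; an empty list means no findings.
    """
    warnings = []
    if len(topics) > max_topics:
        warnings.append(
            f"{len(topics)} topics; consider consolidating (rule of thumb: <= {max_topics})")
    for name, actions in topics.items():
        if len(actions) > max_actions:
            warnings.append(
                f"topic '{name}' has {len(actions)} actions (rule of thumb: <= {max_actions})")
        dupes = {a for a in actions if actions.count(a) > 1}
        if dupes:
            warnings.append(f"topic '{name}' lists duplicate actions: {sorted(dupes)}")
    return warnings
```

Counts are only a starting point; semantic distance between descriptions remains the real driver and still needs human (or LLM-assisted) review.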
Example in Action
Imagine a service agent receives a warranty policy request for a watch. The warranty issue does not appear to be related to product exchange or support, so Order Management seems to be the most appropriate topic to address this request. Hence, the reasoning engine chooses the Order Management topic as the most probable one to fulfill the request.
Reasoning Step 2: Action Selection
After topic selection, the reasoning engine selects the right actions to execute from the selected topic. Again, the reasoning engine LLM is responsible for this, using another prompt called the observation prompt. The purpose of the observation prompt is to obtain the next step in the reasoning process. This next step can be any of the following:
- Launching an action
- Requesting action inputs from the user
- Requesting clarification of the request from the user
- Sending the final answer to the user (fulfilling the user request) once the action loop has completed (see Reasoning Step 3 for more details)
The input to the observation prompt consists of the descriptions of all actions in the topic, as well as the conversation history.
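As with topic selection, the observation prompt can be pictured as simple assembly. The real prompt format is internal to Agentforce, so this layout is invented; the point is that the selected topic's scope and action descriptions, the possible next steps, and the conversation history all go in.

```python
def build_observation_prompt(topic: dict, conversation_history: list) -> str:
    """Illustrative observation prompt for selecting the next step.

    `topic` carries the selected topic's scope and its actions
    (name -> description).
    """
    lines = [f"Topic scope: {topic['scope']}", "", "Available actions:"]
    for name, description in topic["actions"].items():
        lines.append(f"- {name}: {description}")
    lines += [
        "",
        "Decide the next step: launch an action, request action inputs,",
        "request clarification, or send the final answer.",
        "",
        "Conversation:",
    ] + conversation_history
    return "\n".join(lines)
```

Because every action description in the topic lands in this single prompt, semantically overlapping descriptions directly degrade action selection.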
Best Practices for Action Design
Actions are the predefined tasks the agent can perform to do its job. Actions are the most fine-grained definitions of work. There are five different types of agent actions: (1) execute Apex code, (2) call an API, (3) execute a flow, (4) get an LLM response to a prompt template, and (5) call a predictive model. Most of these action types can be deterministic, provided that the external system, flow, or Apex action doesn’t contain probabilistic elements like prompt invocations. The exception is getting an LLM response to a prompt template, which is probabilistic by nature. We cover these issues in the fifth level of agentic control.
- Controlling Agentic Behavior in Prompt Actions: There are two ways to enforce control over the agent’s behavior in prompt actions: (1) add more instructions to the prompt template through prompt engineering, and (2) configure the hyperparameters of the LLM that generates the prompt response, particularly the temperature (see the documentation). Lowering the temperature reduces variability in the generated responses, which increases their repeatability and the agent's reliability.
- Optimal Number of Actions Inside a Single Topic: Similar to the maximum number of topics, the number of actions inside a topic should not exceed 10. But again, this is a rule of thumb. The real driving factor is the semantic distance between the actions. When actions are clearly distinguishable by their description, this number can be larger. There is no numeric measure for semantic distance, however; it is up to the interpretation of the agent builder. The larger the difference in meaning between the action descriptions, the greater the semantic distance. Overlap should be avoided at all times here.
- Action Descriptions: Contrary to what the name suggests, the action description actually serves as an instruction that the reasoning engine LLM uses to select the right action from the topic. Using semantically distinctive action descriptions can greatly improve the quality of this classification. Carefully review these descriptions, and especially compare all action descriptions belonging to a single topic.
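Why lowering the temperature increases repeatability can be seen in the sampling step every LLM performs. The sketch below is generic softmax sampling, not Salesforce's implementation: as the temperature approaches zero, probability mass collapses onto the highest-scoring token, so repeated runs produce the same output.

```python
import math
import random

def sample(logits: list, temperature: float, rng: random.Random) -> int:
    """Sample a token index from logits at a given temperature.

    Lower temperature concentrates probability on the top-scoring
    choice, which is why it makes prompt-action outputs more repeatable.
    """
    scaled = [l / max(temperature, 1e-6) for l in logits]
    m = max(scaled)                              # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()                             # inverse-CDF sampling
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r <= cum:
            return i
    return len(probs) - 1
```

At a near-zero temperature the argmax token is chosen essentially every time; at a high temperature the distribution flattens and different tokens appear across runs.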
Example in Action
Let’s continue with the previous example in which a service agent received a question regarding the warranty policy for a watch. After selecting the Order Management topic, the most probable action is chosen. Because this is the order management topic, the first logical step is to look up the order (otherwise, what is there to retrieve warranty information for?) by launching the lookup order action.
Reasoning Step 3: The Agentic Loop
A user utterance can trigger the execution of multiple actions before an answer is sent back to the user. This is due to the agentic loop, which continues to select and execute the next most suitable action until one of the following conditions is met:
- The reasoning engine LLM determines the request is complete. In this case, it finds that the user request is fulfilled by the response. Note that part of this check includes a grounding check, verifying that the response is grounded in the action outputs. No information in the response should be invented by the reasoning engine LLM, as that may lead to hallucinated or otherwise incorrect responses.
- No more suitable actions are found.
- The maximum allowed LLM calls for the current step is reached. This number is set by the reasoning engine itself.
Actions are not subject to a specific timeout. This is to avoid interruption when action execution times vary based on their complexity. Some actions are simply more complex to execute than others.
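The loop's termination conditions can be sketched in a few lines. This is a conceptual Python illustration only, not Agentforce's actual implementation; the `Action` class and all names are hypothetical stand-ins for the reasoning engine's internal behavior:

```python
# Minimal sketch of the agentic loop's termination logic (hypothetical,
# not the actual Agentforce implementation). Actions are modeled as
# objects; the "reasoning engine" is reduced to a simple selection rule.

def agentic_loop(utterance, actions, max_llm_calls=5):
    history = [utterance]
    calls = 0
    while calls < max_llm_calls:              # condition 3: per-step call budget
        action = next((a for a in actions if a.applies(history)), None)
        if action is None:                    # condition 2: no suitable action left
            break
        history.append(action.run(history))   # note: no per-action timeout
        calls += 1
        if action.fulfills_request:           # condition 1: request complete
            break                             # (incl. grounding in action outputs)
    return history

class Action:
    def __init__(self, name, fulfills_request=False):
        self.name, self.fulfills_request = name, fulfills_request
        self.done = False
    def applies(self, history):
        return not self.done
    def run(self, history):
        self.done = True
        return f"{self.name} output"

lookup = Action("lookup order")
warranty = Action("get warranty policy", fulfills_request=True)
print(agentic_loop("What is the warranty on my watch?", [lookup, warranty]))
```

In this sketch the loop executes the order lookup first, then the warranty action marks the request as fulfilled, mirroring the watch example below.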
Example in Action
After initiating an order lookup, the reasoning engine evaluates the response generated so far, and then decides it needs to do more work before an answer can be sent back to the user. It’s about to look up the warranty policy, now that the order is present in the conversation history.
However, in doing so, the agent realizes that the customer has actually purchased two watches, as retrieved by the ‘order lookup’ action. Therefore, in the agentic loop, the reasoning engine now decides to ask the customer to specify which particular watch they need warranty information for.
Level 2 Agentic Control: Instructions
Agent reliability is enhanced by a careful distribution of actions across topics, and well-described actions and topics. However, these methods don’t allow for the expression of business rules, policies, and guardrails within the reasoning engine. Instructions provide an additional important layer of agentic control. Instructions offer further guidance to the reasoning engine when using various actions together. This allows for a more nuanced and policy-driven approach to agent behavior. With instructions, agent builders can ensure that agents not only function reliably, but also adhere to established business rules and best practices.
The instructions that are written at the topic level become part of the observation prompt. Topic instructions guide the reasoning engine in choosing appropriate actions. They can provide guidance on when to select which action, and they can be used to define action dependence. In certain circumstances, they can also enforce sequential control. However, alternatives exist for this and instructions should be used carefully for that requirement. Topic instructions are added one by one and appear in separate boxes in the UI. However, they are always sent to the observation prompt together. Adding instructions in separate boxes increases readability and maintainability of the topic, but it doesn’t affect the reasoning engine.
Sometimes, instructions apply globally to the agent and are not related to an individual topic. Functionality to maintain global instructions is currently on the product roadmap. Best practices for topic instruction writing can be found in the Agentforce Guide to Topics, Instructions, and Actions. Let’s review some additional guidelines.
Best Practices for Instruction Writing
Avoid Overscripting
Avoid overly scripting the ways in which agents should have conversations with users. Overscripting can stifle an agent's ability to build rapport, understand unique user needs, and respond effectively in real-time to dynamic circumstances. Moreover, lengthy instructions can make the agent slower to respond and confuse the reasoning engine. Forcing determinism via instructions is not the preferred approach.
What Not To Do
For example, there is no need to tell an agent to avoid referring to competitors in service answers. This can lead to undesired behavior, as the agent can also refuse to answer questions referring to integration with a provider who is also a competitor. Instead, the instruction can be something like “When discussing competitors, respond with the company’s best interest in mind.” This avoids restrictive, conditional instructions such as “Only mention competitor xyz in the case of…”, and instead exploits the reasoning capabilities of the LLM. This example shows how instructions can be given on a higher, more abstract level, similar to the way a human service employee would be trained after having joined the company.
Let’s look at some more examples of what not to do. These are some bad instructions given to a service agent handling candidate profiles on a recruitment website. These instructions attempt to anticipate every possible customer utterance and should therefore be avoided:
Instruction 1: The agent receives the following utterance: “Can I add a picture to my profile?" then immediately ask the customer, "What is your profile type?"
Instruction 2: If the customer indicates a premium profile then answer “Let me check your contract details” then Search for contract details and check if it was agreed that they can update the profile picture.
If it was agreed that the candidate can do it, then answer “yes, it is possible let me update for you. Can you provide your new picture?” Once the picture is received, then update the candidate’s profile accordingly. If the contract does not include profile picture changes, then say “Sorry, this is not possible. Let me route you to a human agent”
Instruction 3: Non-Premium Profile: If the customer indicates a non-premium profile, respond with, "You cannot update your picture. If you would like to do so, please let me know and I will transfer you to a human agent."
Instruction 4: If the profile type is unclear, respond with, "I could not understand your profile type."
What To Do Instead
Instead of this type of micromanagement, use a more flexible approach that instructs agent behavior and conduct. Consider the following best practices:
- Use RAG / knowledge actions for policies and rules that are contained in knowledge articles (instead of writing them as instructions). The relevant information is retrieved at the right time. For the above example, that means that a knowledge article entitled "Picture update" should state: "Only candidates with a premium profile whose contract allows for picture changes can update their picture."
- Outline the main guidelines and guardrails individually, independent of the conversation. Provide agents with a clear and concise explanation of the expected behavior or procedure in question.
Based on these best practices, a better set of instructions might look like this:
Instruction 1: “Use knowledge actions to check for policies in case of requests for account changes.”
Instruction 2: “Do not respond to questions for which no applicable policy could be found.”
Applying the above guidelines can improve the agent results. Now, if the customer asks the agent for a profile change, the agent will understand that it needs to retrieve the required policy from the knowledge base, interpret the retrieved rules, apply those rules to the context, and finally respond to the customer. In contrast to overscripting, this behavioral approach is much more generic and widely applicable. Without having to write out each possible conversation, the agent can now flexibly respond with the desired behavior to a broader range of conversation topics.
Enforce Action Sequence (Not Applicable for Agent Script)
The observation prompt includes instructions and action descriptions, but without a defined order. If the sequence of actions is critical, it must be explicitly stated within the same instruction. Please note that with Agent Script, we can enforce an order of executions thanks to transitions. This will be explained further in the sixth chapter.
Let’s continue with the example of recruitment website agents. The agent should be able to handle interview planning with the appropriate interviewer. To do so, the agent should first check for the availability of the recruiters, and then propose three possible slots to the candidate.
In this case, in order to keep the order of execution, the instructions should not be in separate boxes:
- Instruction 1: Check for interviewers' availability.
- Instruction 2: Then propose appropriate slots to the candidate.
These instructions don’t work because the reasoning engine does not know what the “Then” statement in instruction 2 refers to. This is because the instructions are sent to the reasoning engine as a group, not in any particular order.
Instead, sequence-defining instructions should be combined into one statement and written as:
- Instruction 1: Check for interviewers' availability. Then propose appropriate slots to the candidate.
Enforce Action Output Without Rewriting
The reasoning engine is itself an LLM. It’s responsible for generating the final answer according to the agentic loop. This approach is needed to enforce agent instructions that provide guardrails on response generation, or to combine the outputs of multiple actions that were part of the agentic loop in order to fulfill the user request.
However, when only one prompt action is expected to be executed, an instruction can be added that tells the agent to never change the output of that action. Doing so ensures more predictable and reliable agent behavior.
Enforcing this strict adherence in approved prompt templates becomes crucial in certain scenarios, particularly when consistency, compliance, and pre-defined messaging are important. These are two examples:
- Regulated Industries: Organizations operating in highly regulated sectors (such as finance, healthcare, or legal) often require strict control over all customer-facing communications. Approved prompt templates ensure that responses comply with legal and regulatory requirements, minimizing the risk of misinterpretation, liability, or dissemination of inaccurate information.
- Pre-tested and Validated Responses: Some prompt templates have undergone rigorous testing and validation to ensure accuracy, effectiveness, and desired outcomes. Deviating from these templates can undermine their efficacy and value. Strict adherence guarantees that the proven messaging is consistently delivered.
This instruction limits the agent's freedom to change the output of actions. Make sure that the instruction references the output of the prompt template (such as “promptResponse”), as shown in this Plan Tracer.
So the instruction in this case can be:
“Do not change promptResponse output, regardless of the channel of the agent.”
Limitations on Enforcing Strict Adherence:
When an interaction requires multiple distinct agent actions, enforcing strict adherence to a single template isn’t feasible. In fact, in this scenario, the reasoning engine needs to consolidate these actions into a single response, and therefore change every single action output.
Optimal Number of Instructions
Based on general LLM characteristics, the target number of instructions ranges between 5 and 10, depending on the instruction complexity and instruction interaction. These instruction characteristics influence the number of instructions the reasoning engine can follow:
- Clarity and specificity: Well-defined rules are easier to follow.
- Conflicts between rules: If rules contradict each other, then additional logic is required to resolve them.
- Length and complexity: If each rule requires deep reasoning, consider breaking them down into smaller instructions.
If an instruction is very important to follow explicitly, then add terms that reflect its importance:
- Urgency and importance (immediate, urgent, critical, essential, mandatory)
- Authority and enforcement (required, compulsory, strictly enforced)
- Consequences and warnings (Violation will result in, failure to comply will lead to, non-compliance may result in, strict penalties apply, zero tolerance)
- Clarity and directness (must, prohibited, forbidden, not allowed, always/never)
Level 3 Agentic Control: Grounding
Grounding answers in data significantly improves agent reliability and trustworthiness. Grounded responses are based on factual information rather than speculation or outdated knowledge. Retrieval-augmented generation (RAG) is a widely adopted technique that allows an agent to access and use a knowledge base in order to formulate more accurate and contextually relevant answers. Based on a user’s query, an agent uses RAG to retrieve relevant information from applicable data sources, and then augments the prompt with this information before submitting it to the LLM. An agent using RAG delivers interactions of higher quality, accuracy, and overall utility, which increases user confidence and satisfaction. Best practices for RAG are extensively described in a publicly available white paper called Agentforce and RAG: best practices for better agents.
Knowledge Versus Instructions
Differentiating between knowledge and instructions is important when striking the right balance between guidance and flexibility, as they fulfill different purposes:
- Knowledge: Think of knowledge as the library of books the agent has access to while generating its answers. Examples include documents, knowledge articles, and white papers. Note that these may include policies and general company rules. Knowledge may also refer to transactional files, such as emails, call transcripts, and even the history of (agent) conversations. Finally, knowledge includes long-text fields in structured data. Knowledge is typically brought to the agent through RAG.
- Instructions: Think of instructions as the minimum set of rules that clarify to the agent when to use each action. Instructions may also place guardrails on the entire conversation, such as the required tone of voice. Often, instructions can be made more concise and flexible without sacrificing clarity or intent. Instead of providing a rigid script with specific responses for every possible customer scenario, consider implementing general guidelines and principles that help the agent select the right action(s) in a variety of situations.
Retrieval Augmented Generation
Retrieval-Augmented Generation (RAG) acts as an intelligent data layer for knowledge. It gives agents access to information across various formats and provides relevant text fragments for answering questions. With RAG, agents can get more accurate LLM responses without overwhelming the LLM prompt with extraneous content or exceeding its context window.
At run time, RAG executes three steps:
- Retrieval: The AI system searches a large database or knowledge source to gather relevant information for the LLM prompt. This is done using semantic search, a more sophisticated technique compared to traditional keyword-based search. Unlike keyword search, which matches exact terms, semantic search understands the meaning or context behind the words. It identifies relevant information based on the concepts or relationships between terms, rather than just looking for precise word matches. Keyword search can play a role in this retrieval process too, strengthening the semantic search with keyword matching for specific terminology or names. This search type is called hybrid search.
- Augmentation: The retrieved information is added to the prompt.
- Generation: The LLM generates a contextually appropriate and more accurate response thanks to the prompt that was augmented with retrieved knowledge.
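The three steps can be illustrated with a toy Python sketch. This is purely conceptual: word-overlap scoring stands in for real semantic or hybrid search, and `generate()` is a stub for the LLM call; none of these names come from the Agentforce product.

```python
# Toy sketch of the three RAG steps: retrieval, augmentation, generation.
# Word overlap stands in for semantic/hybrid search; generate() stubs the LLM.

KNOWLEDGE = [
    "Premium watches carry a two-year warranty on the movement.",
    "Shipping is free for orders above 50 euros.",
    "Battery replacements are covered in the first year.",
]

def retrieve(query, docs, k=2):
    # Step 1: retrieval -- rank documents by overlap with the query terms.
    score = lambda d: len(set(query.lower().split()) & set(d.lower().split()))
    return sorted(docs, key=score, reverse=True)[:k]

def augment(query, chunks):
    # Step 2: augmentation -- prepend the retrieved chunks to the prompt.
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def generate(prompt):
    # Step 3: generation -- stub for the LLM producing the grounded answer.
    return f"[LLM response grounded in prompt of {len(prompt)} chars]"

question = "What is the warranty on my watch?"
print(generate(augment(question, retrieve(question, KNOWLEDGE))))
```

With this ranking, the warranty article scores highest for the warranty question, so the prompt is augmented with the most relevant chunk before the (stubbed) LLM call.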
In Agentforce, RAG can be used with or without a prompt template:
- Prompt-based RAG: In this approach, the instructions that specify how to generate the response are in the prompt instructions in a prompt template. In this case, the response depends entirely on what the LLM generates. Aside from the prompt instructions, there are ways to influence the response, such as LLM configuration settings in Einstein Studio, but the result is still not deterministic.
- Reasoning Engine-based RAG: Instead of using a prompt template, the agent uses an action that retrieves chunks (via a flow or Apex class) and stores them in a variable (see the next section). In this approach, the reasoning engine (instead of the LLM) generates answers directly, grounded in the retrieved data. The instructions that govern response generation are agent instructions rather than prompt template instructions. The variable that holds the retrieved content can still be passed explicitly as input to an action. It can also be given to the reasoning engine by granting it default access to the variable’s content. This approach has tradeoffs. There’s a potential risk of overloading the reasoning engine with content and responsibility. Moreover, unlike a prompt, the parameters of the LLM of the reasoning engine are not configurable. On the other hand, the reasoning engine can generate its answer using both the retrieved chunks as well as the history of the conversation.
The recommended method is the first option, prompt-based RAG. It reduces the number of tasks the reasoning engine should perform, and thereby improves its answer quality. The next section explores an exception to this rule, in which the content is preserved throughout the conversation and hence is given to an action explicitly.
RAG Best Practices
- Avoid scripted RAG instructions: Instead of directly linking instructions to specific articles for particular questions, leverage RAG's intelligence to dynamically find the most relevant data source and precise text fragment. RAG's matching process is based on a broader understanding of the question, not just exact question-to-source mapping.
- Consolidate topics: Group related question categories under a single topic. RAG can effectively identify relevant answers based on the question type, even within a broader topic. For example, product issues like maintenance and battery problems can be aggregated into a more comprehensive topic.
- Store RAG output in a variable: When the number of interaction limits might be reached, store the RAG output in a variable. This keeps the information accessible for guiding agent interactions beyond the standard threshold. An example of this will be provided in the next section.
Level 4 Agentic Control: Variables
Certain business processes demand even more predictable execution, such as enforcing a specific action sequence or conditions for triggering actions or topics.
To achieve this deterministic behavior, variables can be used. Variables function as a structured form of short-term agent memory that can serve as action inputs or outputs. Furthermore, the state of a variable can govern the triggering of specific topics and actions.
Ways that Variables Support Determinism
Variables can help achieve guided determinism of agents in the following ways:
- Persistent dynamic grounding: Variables allow agents to continuously update their understanding of the world while persisting important information that remains unaffected by any subsequent interactions. This method ensures that critical information, which can be unstructured data retrieved via RAG, or structured data like user profile information, is maintained throughout the conversation, independent of the conversation length.
- Action inputs/outputs: Variables can be used as both input and output for actions. The action explicitly refers to variables, and execution of the action doesn’t rely on the reasoning engine for setting these inputs and outputs, which increases the determinism of the agent.
- Filtering: Variables can be used to determine the conditional execution of actions or topics. Variables allow for a specific flow of information between actions and determinism in action execution. This capability is particularly crucial for security rules in which actions cannot be initiated if specific input variables, such as email, are not verified.
Types of Variables in Agentforce
Agentforce has two types of variables:
- Context variables are system-generated variables that hold information about the user and the conversation sessions.
- Custom variables can be instantiated by the user. They hold any kind of information used for any of the three ways variables support determinism.
Variable types support the following capabilities:
| | Context Variables | Custom Variables |
|---|---|---|
| Can be instantiated by user | X | ✓ |
| Can be input of actions | ✓ | ✓ |
| Can be output of actions | X | ✓ |
| Can be updated by actions | X | ✓ |
| Can be used in filters of actions and topics | ✓ | ✓ |
Example Use Case: Troubleshooting Agent
Let’s dig into variables further with an example use case: a customer-facing troubleshooting agent. In this example, variables are used for all three purposes: persisted dynamic grounding, action inputs/outputs, and filtering.
In this example, the agent helps a customer troubleshoot a technical device issue. Troubleshooting typically involves going through a number of steps. The agent should offer a service experience that mimics the work of a human service agent. To do so, the agent shouldn’t provide all the troubleshooting steps at once to the customer. Instead, it should offer step-by-step instructions, along with the ability to navigate between steps (including going back to previously covered steps) based on how the customer responds.
One challenge with this is the agent’s capacity to retain all troubleshooting steps throughout the conversation. Given the agent’s limited memory due to the restricted number of interactions it can store, these steps can be dropped from the context for the reasoning engine if the conversation becomes lengthy.
The way to address this challenge is to use a variable to ground the reasoning engine dynamically across troubleshooting steps. By retrieving the information and storing it in a variable, it remains available, and can be updated, throughout the conversation. The reasoning engine uses the information stored in this variable for dynamic grounding.
Retrieving, Setting, and Using the Troubleshooting Steps
In this example, a topic includes two actions. These two actions are needed to maintain a consistent data flow. The first action is used to populate the variable that contains all of the troubleshooting steps. The second action uses that variable during the troubleshooting itself.
- Action 1: “Populate issue resolution steps”. This is the initial action, triggered by the agent’s first introduction to the issue. It uses Retrieval Augmented Generation to extract all the necessary resolution steps from an indexed knowledge base. The action stores the resulting output in a variable named ‘Resolution Steps’.
- Action 2: “Use in the middle of solving an issue”. This is an action based on a prompt that outputs the most likely next troubleshooting step to be used during the troubleshooting process. The agent is instructed to use this action while in the midst of solving an issue.
The original customer question is input to both actions. The second action has another input: the contents of the variable ‘Resolution Steps’. This variable was set by the first action. Note that the second action won’t retrieve the troubleshooting steps itself, but will instead get them as input from the first action via the variable. The following diagram depicts the data flow between those two actions.
The ‘Use in the Middle of Solving an Issue’ action will always refer to the original troubleshooting steps retrieved by the Issue Resolution Steps action. This data flow ensures that troubleshooting steps are maintained coherently and always present, independent of the conversation length.
Using Filters to Ensure the Order of Execution of Actions
To execute the actions defined in this example, specific instructions are needed, such as “Always execute the ‘Populate resolution steps’ first.” However, given the non-deterministic nature of LLMs used by agents, this can lead to a different order in certain cases. To ensure a deterministic order of execution, we introduce conditional filters on these variables to enforce the proper action sequence. The agent reads the value of the variable ‘Resolution Steps’ and defines two filters based on whether this variable has a value or not.
- Issue Resolution Steps action can only be executed if the ‘Resolution Steps’ variable is empty.
- ‘Use in the Middle of Solving an Issue Action’ can be executed only if the ‘Resolution Steps’ variable is filled.
These conditional filters now deterministically enforce the sequence of action execution: ‘Use in the Middle of Solving an Issue’ must wait until ‘Issue Resolution Steps’ completes its work, thereby guaranteeing that the ‘Resolution Steps’ variable always has a value.
To ensure correct action execution, a third action is needed to reset the ‘Resolution Steps’ variable if the issue is fully solved. As a result, the agent is reset into the required state to help with a possible new, different issue. This third action is called ‘Empty the Resolution Variable’. The full action diagram is depicted below.
Variables are crucial to enabling our troubleshooting agent to solve customer problems by allowing for:
- Persistent dynamic grounding: Variables store troubleshooting steps, ensuring that they are available throughout the conversation, regardless of length or number of interactions. This prevents the agent from forgetting context.
- Data flow: Variables facilitate data flow between actions. For example, the ‘Resolution Steps’ variable stores the retrieved troubleshooting steps from the Issue Resolution Steps action and provides them as input to the Use in the Middle of Solving an Issue action.
- Determinism: Variables can be used as filters to enforce a specific order of action execution. For example, the Use in the Middle of Solving an Issue action executes only if the Resolution Step variable is filled, ensuring the Issue Resolution Steps action runs first.
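The filter logic in this example can be sketched in Python. This is a conceptual illustration, not Agentforce configuration: the state dictionary, filter functions, and the sample step list are all hypothetical, though the names mirror the example above.

```python
# Sketch of variable-based filters enforcing action order in the
# troubleshooting example (hypothetical code, not product configuration).

state = {"resolution_steps": None}

def can_populate(s):
    # Filter: 'Populate issue resolution steps' runs only when the variable is empty.
    return not s["resolution_steps"]

def can_use_steps(s):
    # Filter: 'Use in the Middle of Solving an Issue' runs only when it is filled.
    return bool(s["resolution_steps"])

def populate_steps(s):
    # Action 1: RAG retrieval (stubbed) stores all steps in the variable.
    s["resolution_steps"] = ["restart device", "check cable", "reset settings"]

def next_step(s):
    # Action 2: surface the next troubleshooting step from the variable.
    return s["resolution_steps"].pop(0)

def reset(s):
    # Action 3: 'Empty the Resolution Variable' once the issue is solved.
    s["resolution_steps"] = None

assert can_populate(state) and not can_use_steps(state)
populate_steps(state)
assert can_use_steps(state) and not can_populate(state)
print(next_step(state))
reset(state)
assert can_populate(state)
```

Because the two filters are mutually exclusive on the same variable, the populate action necessarily runs before any step is surfaced, and the reset action returns the agent to its initial state for a new issue.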
Variables to Persist Predictive Model Output
In the era of generative AI, predictive AI remains critically important in that it forms the foundational intelligence that guides, enhances, and contextualizes generative capabilities. While generative AI focuses on creating new content — such as text, images or videos — predictive models make predictions about the future based on inputs from real-time business data. Example business outcomes include customer likelihood to churn, conversion likelihoods, case escalation probability, customer lifetime value, and case classification. Predictions can help anticipate user needs, personalize outputs, enact decisions, and optimize content relevance in real time, all by analyzing trends and numbers. For example, in applications like personalized learning, healthcare, or financial planning, predictive AI ensures generative outputs align with individual contexts and likely future scenarios. Together, predictive and generative AI create a powerful synergy, merging foresight with creativity to drive more intelligent, adaptive, and impactful technology solutions.
How to Integrate Predictive Model Output Into Agent Workflows
To incorporate predictive model outputs into agent workflows, simply add predictive model actions to the Agentforce assets. Model Builder provides the means to build or register (BYO) predictive models, and these models are then used by the agent to make predictions. The resulting predictions (as well as predictors) can be stored in custom variables. Agents can use these variable values as inputs to specific actions and topics, and as conditions on their execution.
Use Case Examples Integrated With Predictive Models
- Segmentation: Perform multi-class classification for customer segmentation and use the resulting segment to filter certain actions. For example, reserve premium actions for premium-tier customers.
- Escalation likelihood: Predict the likelihood of escalation for a service case. If this likelihood exceeds a certain threshold, allow execution of actions that resolve the case quicker or escalate to human agents faster.
- CPG: Plan a promotion only if the uplift in sales (score calculated by a predictive model) exceeds a certain threshold.
- Commerce: Propose a product only if the propensity to buy exceeds a certain threshold.
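The escalation-likelihood pattern can be sketched as follows. This is a hedged illustration: `predict_escalation()` is a stub standing in for a Model Builder prediction call, and the threshold and action names are invented for the example.

```python
# Sketch of gating actions on a predictive model score, as in the
# escalation-likelihood use case. predict_escalation() is a stub for a
# real Model Builder prediction; the threshold is illustrative.

ESCALATION_THRESHOLD = 0.7

def predict_escalation(case):
    # Stand-in for a real predictive model call; returns a probability.
    return 0.85 if case.get("priority") == "high" else 0.2

def allowed_actions(case, variables):
    # Persist the prediction in a custom variable, then filter on it.
    variables["escalation_score"] = predict_escalation(case)
    actions = ["standard_resolution"]
    if variables["escalation_score"] > ESCALATION_THRESHOLD:
        actions.append("fast_track_to_human")
    return actions

print(allowed_actions({"priority": "high"}, {}))
```

Storing the score in a variable rather than recomputing it means subsequent filters and actions all condition on the same persisted value.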
Level 5 Agentic Control: Deterministic Actions
Certain business processes need to be executed in a precise order and do not require user input during the execution. In this case, a predetermined flow of steps can be enforced via flows, APIs, or Apex. If you have an existing flow that’s relied on in production, it’s a good indication that it can be kept and used by the agent for the execution of that business process. All of the following examples include predetermined sequences of steps that the agent can execute without user input. The agentic behavior in this case consists of identifying which deterministic process to execute, how to gather the necessary inputs, and how to interpret and process the outputs.
Business processes with many sequential steps (more than three, as a rule of thumb) and many dependencies on variables become too complex and cumbersome to enforce with instructions. In this case, it is possible to simply hardcode them using the deterministic action types listed in this section. Finally, note that these implementations can include non-deterministic elements, such as calling LLMs with resolved prompt templates. Therefore, they are not necessarily completely deterministic, end-to-end, and they can still demonstrate the desired levels of fluidity that agents are known for.
For example, the sequence of steps in a marketing journey is conditioned by fixed rules and does not depend on any conversational user input. Therefore, such a flow can be used as an Agentforce action. An invocable action can be created to complete background or event-triggered tasks from a solution component that can call a flow or Apex class. Add an invocable action to a flow or Apex class and specify the task that the agent completes, as well as the conditions that trigger the agent. Invocable actions can also carry the context variables of the agent and pass along important information.
Flows
Salesforce flows can be used to automate routine tasks, like creating follow-up tasks, sending reminder emails, or updating records. Flows make work more efficient and productive. Agents can also execute flows using flow actions. Because of their determinism, flows are a great way to direct agentic behavior when a business process needs to be executed in a particular sequence. A good indication that a flow action is preferred is when the topic would otherwise contain instructions such as “First do this, then do this, and finally do this”. Enforcing sequences of more than three steps becomes cumbersome to manage via instructions and variables.
Flows can also include non-deterministic elements by calling prompts. A prompt node in flow invokes a prompt template and collects the response that can be passed on to other elements in the flow. These further elements can again be prompt nodes, for example summarizing the previous response, thereby creating a chain of prompts. This is particularly useful when the rules for prompt chaining are defined by fixed elements and don’t depend on user input. One example is agentic RAG in which a predefined sequence of retrievers or prompts in a flow can access specific data sources in a particular order, such as initially retrieving data from a user's country document before consulting other sources as needed. This chaining mechanism enforces a reliable and ordered extraction of relevant information.
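This prompt-chaining pattern can be sketched as a fixed sequence of template invocations, each consuming the previous response. This is a conceptual Python sketch, not flow configuration: `call_llm()` is a stub and the templates are illustrative.

```python
# Sketch of deterministic prompt chaining, as done by prompt nodes in a
# flow: a fixed sequence of template invocations, each consuming the
# previous response. call_llm() is a stub; the templates are illustrative.

def call_llm(prompt):
    # Stand-in for invoking a prompt template against an LLM.
    return f"<answer to: {prompt[:40]}...>"

def chained_flow(question, country):
    # Node 1: answer from the user's country document first.
    first = call_llm(f"Using the {country} policy document, answer: {question}")
    # Node 2: summarize the previous response (chain of prompts).
    summary = call_llm(f"Summarize for the customer: {first}")
    return summary

print(chained_flow("Can I return my watch?", "France"))
```

The chain itself is deterministic — node order and templates are fixed — even though each individual LLM call is not.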
Apex and API actions
Similar to flows, Apex and API actions are deterministic in that a predefined sequence of steps can be coded, and these steps can include non-deterministic elements such as invoking prompt templates or LLM calls. By definition, the actions execute their internal steps deterministically, which reduces agentic variability. Calling the action at the right time, collecting the necessary input, and processing the output still need to be governed by agentic instructions, so those responsibilities remain non-deterministic. Apex and API actions are the pro-code equivalent of flow actions.
Level 6 Agentic Control: Deterministic Control with Agent Script
From Probabilistic to Deterministic Reasoning
In determinism levels 1 through 5, we progressively added structure to the agent's environment. We defined what it could do (level 1, topics), guided how it should behave (level 2, instructions), grounded it in truth (level 3, data), managed its state (level 4, variables), and gave it rigid tools (level 5, deterministic actions). However, the central decision-making engine remained fundamentally probabilistic. The reasoning engine was still deciding by itself which tool to pick or which question to ask next, and for this decision we fully relied on the Large Language Model (LLM).
Level 6, Agent Script, fundamentally changes this architecture. It introduces the capability to hard-code the reasoning process itself.
With Agent Script, we move from prompting the model to scripting the agent. This is not a return to rigid, old-school chatbots. Instead, we refer to it as hybrid reasoning. It allows you to sandwich the creative, conversational power of the LLM between layers of immutable, deterministic logic. You explicitly define the critical path of execution (the "must-dos") while leaving specific pockets of freedom for the LLM to handle natural language understanding and response generation.
When designing workflows, it is crucial to avoid using LLM-based agents, even scripted ones, merely to replace deterministic logic where the next steps are already clear and fixed. If a process follows a predictable path with no need for complex reasoning to determine subsequent actions, introducing a generative model adds unnecessary latency, cost, and a margin for error. Traditional programmatic flows remain superior for processes that are purely deterministic and don't need reasoning. Using an LLM for simple routing or linear transitions is over-engineering that compromises the reliability of a system that could otherwise be handled by a standard procedural flow.
As a rule of thumb, agentic solutions should be considered when the system is dealing with unstructured input that must be synthesized from disparate, high-variance sources (possibly including conversational input) before a decision can be made.
But how do you achieve this level of control? There are two distinct paths, designed for both the business architect and the pro-code developer.
Two Ways to Get Agent Script to Work
Bringing level 6 determinism to your agent does not strictly require writing code. Salesforce provides two modalities to generate the underlying Agent Script, democratizing access to deterministic reasoning.
1. The Builder Path (Natural Language Compilation)
For business analysts, admins, and low-code practitioners, level 6 is accessible directly within the Agent Builder.
We have introduced a document-style canvas that serves as a text-to-script interface. Instead of writing code, you write the topic’s logic in structured, natural language. The builder then interprets your intent and compiles it into Agent Script in the background.
- You write: "First, check if the order is older than 30 days. If it is, tell the user we cannot accept returns and politely end the conversation. If it is not, ask for the condition of the item."
- The system compiles this natural language narrative automatically into the necessary if/else structures, variable checks, and end_conversation commands.
This allows you to write the logic in natural language, and have the platform convert it into code, ensuring that even non-programmers can enforce rigid guardrails and determinism.
2. The Code-First Path (Direct Scripting)
For developers seeking maximum precision, you can also write Agent Script directly within Agent Builder. The canvas with the natural language narrative can be flipped over to reveal the underlying script, so that the developer can code the script directly. This approach even allows for a hybrid authoring experience: some instructions are written in natural language on the canvas, while others are scripted (or existing ones modified) directly in code. Flipping back and forth between these two experiences, you will see that the two modalities are always kept perfectly aligned.
Both modalities unlock the full potential of level 6:
- Detailed tracing: you can step through the script execution to see exactly where variables changed or branches were taken.
- Complex loop handling: manage sophisticated retry logic or multi-variable state changes that are difficult to describe in natural language.
- Version control: treat agent behavior as code, compatible with git and CI/CD pipelines for agent versioning.
The Mechanics of Agent Script
Whether you generate Agent Script via the builder or write it by hand, it results in the same output: Agent Script is converted into an agent graph that is executed by the Atlas Reasoning Engine.
To master level 6, one must understand what happens under the hood. Agent Script controls behavior through specific programmatic structures within reasoning blocks. Unlike standard prompts, which are suggestions for the LLM to follow, these are commands that will be executed no matter what, either before, during, or after the reasoning process, and they come in several distinct types of determinism. We will first review some of these patterns generally, with some lightweight examples, and then illustrate them further with architectural blueprints of scripted agents.
1. Before- and After-Reasoning Determinism
In levels 1-5, we hoped the agent would do something (perform an action) before or after a certain step in the process. In level 6, we force it to. Whatever is written in before_reasoning and after_reasoning blocks will always be executed, respectively before and after invoking the LLM to reason based on instructions. This can include running other actions, transitioning to topics, setting variables, and so on.
For example, by using the run command inside a topic’s before_reasoning instructions, you can execute an action even before the LLM is invoked to generate a response. This guarantees that the data is available immediately, eliminating reasoning latency and the risk of the agent forgetting to call the tool.
The script structure:
reasoning:
    instructions: ->
        before_reasoning:
            # Deterministic: this runs automatically upon topic entry.
            # The LLM has no choice here. It simply receives the output.
            run @actions.check_vip_status   # illustrative action name
        instructions:
            # Now, the LLM is prompted with the result already in context.
            | You are speaking to a customer. Their VIP status is {!@variables.is_vip}.
            # Any further instructions (normal reasoning) go next.
            | Whatever instructions the agent needs for reasoning.
2. Conditional Determinism (if/else)
With conditional determinism, you can use standard programming logic to control the flow. This is critical for compliance workflows where steps cannot be skipped or reimagined.
The script structure:
reasoning:
    instructions: ->
        if @variables.is_vip == true:
            # Skip credit check for VIPs deterministically
            run @actions.apply_auto_approval
            | Inform the customer their loan is auto-approved due to VIP status.
        else:
            # Enforce credit check for everyone else
            run @actions.initiate_credit_check
            | Tell the customer we are checking their credit score now.
In this example, the LLM is never given the option to hallucinate an approval for a non-VIP user. The branch is taken deterministically by the engine.
3. Transition Determinism (@utils.transition)
Another powerful control is the ability to force the agent out of the current topic and into another. This prevents the agent from getting stuck or drifting into unrelated conversation.
The Script Structure:
if @variables.stock_level == 0:
    # Immediately hand off to the "Backorder" topic
    @utils.transition to @topic.handle_backorder
This transition isn't a suggestion. It's a hard redirect of the execution flow, dependent on the value of a variable. Note that while the redirect is hard and non-negotiable, a normal reasoning process takes place again within the topic that the agent is forced into.
Moreover, with Agent Script, you have the ability to explicitly force a transition from one action to the next immediately upon completion. This functionality ensures that the agent follows a rigid, deterministic flow rather than relying on probabilistic or autonomous decision-making at every step. By chaining these actions together in a predefined sequence, you can guarantee that specific tasks are executed in a precise order, providing total control over the agent's logic and behavior.
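As an illustrative sketch of such action chaining (the action and variable names here are hypothetical), two run statements placed back to back inside the reasoning block hard-wire the sequence: the second action always fires immediately after the first completes, with no LLM decision in between.

```
reasoning:
    instructions: ->
        # Step 1: always runs first
        run @actions.create_return_label with order_id=@variables.order_id
        # Step 2: hard-wired to follow, consuming step 1's output
        run @actions.email_return_label with label_url=@outputs.label_url
        | Confirm to the customer that the return label has been emailed.
```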
4. Variable State Management Determinism
Agent Script gives you direct read/write access to the agent's short-term memory (variables). You can explicitly set variables based on action outputs, preventing a telephone game where an LLM might misinterpret a tool’s JSON output.
The script structure:
# Explicitly binding an action's output to a variable
run @actions.check_inventory with sku=@variables.current_sku
set @variables.stock_level = @outputs.quantity_available
Architectural Blueprints: Examples of Agent Script in Practice
To truly understand the power of Agent Script, we must look beyond individual commands and observe them in concert. The following architectural patterns (derived from our Agent Script recipes collection) demonstrate how level 6 determinism solves complex enterprise challenges.
1. Dynamic Context: "Zero-Latency" Dynamic Knowledge Injection
The problem: standard agents often suffer from reasoning latency. They wait for the user to ask a question, then think about which tool to use, then may even ask the user for information that could already have been retrieved, and only then call the action. This creates a laggy, disjointed experience.
The Agent Script solution: pre-reasoning determinism.
We use the run command to inject data before the LLM even wakes up.
Example: the emergency triage agent. Imagine an agent handling a power outage report. Instead of asking the user for their address and waiting, the moment the session starts, the script automatically runs the get_current_location_by_IP and check_grid_status commands.
The result: the agent doesn't start with "How can I help you?" It starts with: "I see you're calling from the north sector where there is a confirmed transformer fire. I’ve already added your household to the priority restoration list. Do you have a backup generator running?"
The Logic:
reasoning:
    instructions: ->
        run @actions.get_incident_status with zip=@user.zip
        set @variables.is_outage = @outputs.active_incident
        | If {!@variables.is_outage}, acknowledge the specific incident immediately.
2. Conditional Grounding
The problem: lengthy prompts (giving the agent all the rules at once) increase the probability of hallucination in the reasoning process. The agent forgets rule A because it's looking at rule Z.
The Agent Script solution: just-in-time instruction injection with conditional grounding, through a combination of RAG and conditional logic. It only shows the agent the rules that apply to the exact moment of the conversation.
Example: providing rules for non-eligible offers. Why give the agent the rules about requesting credit increases, when the customer’s credit score doesn’t even allow for it to begin with?
The logic:
if @variables.credit_score < 600:
    # The agent is physically blinded to the "Increase Credit" instructions.
    # It only sees "Debt Counseling" instructions that are fetched through RAG.
    | Focus solely on explaining credit repair resources. Insert $Debt_Counseling_Retriever.results
else:
    | You are authorized to offer a limit increase up to $5k.
Conditional grounding removes the possibility of error by removing distracting information entirely when it isn’t needed.
3. Guided Conversation
The problem: in a complex agent conversation (like a mortgage application, job screening interview, or a technical troubleshooting session), the agent maintains a list of must-answer questions so that all the necessary information is captured from the user. However, users often go on tangents. A standard agent might follow the tangent and forget to come back to these must-have questions, leaving the application or conversation incomplete.
At the heart of this system is stateful navigation, which treats the conversation like a rigorous checklist that must be completely checked off before any transition can occur.
Through stateful navigation, the agent moves between topics based strictly on the current variable state or remains locked within a topic until specific conditions are met. This prevents the agent from following inadmissible paths, even when a user attempts to derail the conversation with tangents. For example, in a high-stakes mortgage application, if a user asks for branch opening hours, the agent doesn't just try to stay on track. Instead, the script detects the deviation and can trigger a forced transition to a compliance-reset topic. By locking the agent into a specific "logic room," it becomes mathematically impossible for it to discuss unapproved topics or exit a session until every must-have variable has been successfully captured.
Example: the maintenance inspector. An agent is guiding a technician through a 10-point safety check on an aircraft engine. The technician says, "Wait, I noticed a scratch on the fuselage, too."
The behavior:
- The agent acknowledges the scratch (Natural Language).
- It logs the scratch to a variable (State Management).
- It refuses to close the session or switch topics until it confirms: "I've noted the fuselage scratch, but we cannot move to 'Exterior' until you confirm the torque setting on the Intake Valve. What was that reading?"
The Logic:
if @variables.safety_check_complete == false:
    # Prevent the user from ending the topic
    | Acknowledge the user's side-note, then pivot back to the required field: {!@variables.missing_field}.
    @utils.stay_in_topic
A guided conversation should be more than just a sequential list of questions, though. Otherwise the agent is more like a “form in disguise.” Its primary value lies in intelligent triaging: using initial discovery questions to route the user to the correct form or workflow.
The transition from a simple script to a robust Agent Script becomes logical when the complexity increases. Instead of just asking, the agent starts doing. For example, it might extract troubleshooting steps from documentation, navigate internal policies, or execute API calls to external systems to resolve issues in real time.
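As a minimal sketch of such triaging (the topic and variable names are hypothetical), a discovery topic can route deterministically the moment the category is captured, while the LLM handles the discovery questions themselves:

```
reasoning:
    instructions: ->
        if @variables.issue_category == "hardware":
            @utils.transition to @topic.hardware_troubleshooting
        if @variables.issue_category == "access":
            @utils.transition to @topic.account_recovery
        # No category captured yet: let the LLM keep discovering
        | Ask whether the issue concerns hardware or account access, and capture the answer.
```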
Choosing Between Guided Autonomy and Scripted Precision
With the introduction of Agent Script as level 6 in the levels of determinism framework, you now have a complete spectrum of control, from the open-ended creativity of a level 1 topic to the rigid, code-driven logic of a level 6 Agent Script.
But having a hammer does not make every problem a nail.
The most common mistake is to think that "higher is better" and that agents should now be fully controlled with scripts for all matters. This is not true. The real art of agent architecture and design lies in right-sizing the determinism: applying exactly enough control to ensure safety, without sacrificing the conversational flexibility that makes AI valuable in the first place. Do not overscript your agents to the point that they become glorified chatbots.
This chapter provides a decision framework to help you determine when to rely on the guided autonomy of levels 1–5 and when to enforce the scripted precision of level 6. These aren’t strict laws but rules of thumb, meant to provide a contextual framework for how to think about the different options and levels of determinism.
To simplify the decision, we can divide the six levels into two distinct strategic zones:
Zone A: Guided Autonomy of Levels 1–5
- The philosophy: "trust, but verify." You give the agent the goal, the data, and the tools (which may be deterministic, see level 5), but you let the reasoning engine decide the best path to get there.
- The mechanism: probabilistic reasoning. The agent analyzes the user's intent and dynamically selects the best tool for the job.
- Best for: discovery, general Q&A, low-risk tasks, broad service scopes.
You should rely on the standard, probabilistic behaviors of Levels 1 through 5 when:
1. The Right Path Is Not Always the Same
In many conversational scenarios, a rigid, hard-coded path is actually a disadvantage because the correct conversational path is variable. For dynamic interactions like personal shopping or vacation planning, there is no single correct sequence; a user might prioritize price, location, or amenities in any order they choose. In these cases, forcing a stateful script creates a frustratingly robotic experience, so it is more effective to rely on Instructions to define the agent’s persona and goals while allowing the user to lead the flow.
This approach also significantly increases speed to market, as building complex level 6 scripts with nested variables and branches is often overkill for tasks like internal HR FAQ agents. By grounding the agent in Data and RAG, you can bypass the need for an exhaustive manual script and let the retrieval engine handle the conversation dynamically based on your existing knowledge base.
2. Scaling Through Modular Determinism: Avoiding the Maintenance Nightmare
When the scope of your agent reaches a massive scale, such as handling 500 different IT support queries with their own processes, the primary challenge isn't whether you can build a single deterministic agentic script, but whether you should. Attempting to map every possible permutation of 500 tasks into one giant Agent Script creates a maintenance nightmare. Every time a policy changes or a new troubleshooting step is added, you risk breaking a massive, interconnected web of logic.
The solution is to move away from a monolithic script and toward level-5 deterministic actions. Instead of scripting the entire conversation, you build robust, isolated flows for the top-tier, high-value actions, like password resets or account unlocking. You then allow the reasoning engine to act as a traffic controller, identifying the user's intent from their unique phrasing and routing them into the appropriate deterministic action. This approach gives you the best of both worlds: the reliability of a script for critical tasks and a flexible, scalable architecture that doesn't collapse under its own weight as your library of tasks grows.
Zone B: Scripted Precision with Level 6 Agent Script
- The philosophy: "Whatever you do and reason, in any case, do exactly this." You define the path. The agent is the interface for executing your logic. It sandwiches the agent's creativity in layers of must-do logic.
- The mechanism: deterministic reasoning. The "brain" follows a pre-compiled graph; the LLM is used only for reasoning, natural language understanding, and response generation where the script allows it to.
- Best for: compliance, financial transactions, diagnostic trees, high-stakes workflows, and highly regulated industries.
Note that within the deterministic tracks that level 6 lays out, all options of levels 1–5 are still available!
You should graduate to Agent Script when "mostly right" is not good enough.
1. The Must-Pass Gates (Security and Authentication)
If a user asks to transfer money, you cannot rely on the LLM to probably ask for authentication. You need a mathematical guarantee that the authentication flow runs before the transaction flow.
- The Agent Script solution: use run @actions.verify_identity inside a before_reasoning block or at the very top of your script to force compliance before the LLM generates a single token.
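A minimal sketch of such a gate (the action and variable names are hypothetical) could look like this: the verification action runs deterministically on topic entry, and a conditional branch decides what the LLM is allowed to say next.

```
reasoning:
    instructions: ->
        before_reasoning:
            # Always runs before the LLM generates a single token
            run @actions.verify_identity
        if @variables.is_authenticated == false:
            | Apologize and explain that the transfer cannot proceed without identity verification.
        else:
            | Proceed to collect the transfer amount and destination account.
```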
2. Regulatory Compliance
In industries like healthcare or finance, agents often must read a disclaimer verbatim or ask questions in a legally mandated order.
- The Agent Script solution: hard-code the disclosure.
# The LLM cannot summarize or "rewrite" this. It is forced to output it.
| "Disclaimer: I am an AI agent. I cannot provide financial advice."
3. Complex Multi-Step Dependencies and Mandatory Action Sequences
If step B requires the output of step A, and step C depends on a calculation from step B, relying on an LLM to pass these variables via prompt context in a telephone game is risky. Additionally, when execution of a certain action is strictly mandatory after another action, the sequence needs to be hard-wired.
- The Agent Script solutions: use set @variables.x = @outputs.y to explicitly bind data between steps, ensuring perfect fidelity. Use run and @utils.transition statements to code the sequence.
4. Preventing Topic Drift
In high-stakes troubleshooting (e.g., "My pacemaker is beeping"), you don't want the agent to be distracted if the user suddenly asks, "What's the weather?"
- The Agent Script solution: Use @utils.transition to lock the user into the Emergency Protocol topic until the issue is resolved, explicitly disabling the ability to drift.
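Sketched in Agent Script (the variable name is hypothetical), the lock can combine a conditional check with the stay-in-topic utility:

```
# Inside the Emergency Protocol topic
reasoning:
    instructions: ->
        if @variables.issue_resolved == false:
            # Acknowledge the tangent, but never leave the protocol
            | Briefly acknowledge any unrelated question, then return to the next troubleshooting step.
            @utils.stay_in_topic
```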
The Hybrid Architecture: Best of Both Worlds
The most mature agent architectures don't choose one or the other; they use level 6 as the skeleton and levels 1–5 as the muscle. The pattern that then emerges is one of a deterministic sandwich. You can use Agent Script to handle the critical beginning and end of a conversation, while leaving the middle open for flexible reasoning.
- Step 1 (Level 6): Agent Script forces triage and authentication.
- Result: the user is securely identified and intent is classified.
- Step 2 (Levels 1-5): Agent Script hands off to a standard topic.
- Result: the agent uses standard RAG and actions, instructions, and perhaps even variables to solve the user's problem flexibly.
- Step 3 (Level 6): the agent detects the problem is solved and transitions back to Agent Script for the closing actions.
- Result: Agent Script forces the collection of CSAT scores and compliant goodbye language.
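Conceptually, the sandwich can be sketched as three topics (all topic and action names are hypothetical, and the topic boundaries are indicated as comments):

```
# Topic: intake (the level 6 skeleton)
reasoning:
    instructions: ->
        before_reasoning:
            run @actions.verify_identity           # forced authentication gate
        @utils.transition to @topic.solve_problem  # hand off to flexible reasoning

# Topic: solve_problem (the levels 1-5 muscle)
reasoning:
    instructions: ->
        | Use the available knowledge, instructions, and actions to resolve the user's issue.
        if @variables.issue_resolved == true:
            @utils.transition to @topic.closing

# Topic: closing (the level 6 skeleton again)
reasoning:
    instructions: ->
        run @actions.collect_csat                  # forced CSAT collection
        | Thank the customer using the approved closing language.
```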
Summary Table: The Architect’s Cheat Sheet
| Feature | Levels 1–5 (guided autonomy) | Level 6 (Agent Script) |
|---|---|---|
| Primary driver | Probabilistic engine (LLM decides) | Deterministic graph (Code decides) |
| Logic source | Natural language prompts | if/else Logic, state management, transition logic |
| Action execution | "Agent, here is a tool. Use it if you want." | "Agent, run this tool. Now." |
| Context memory | Implicit through LLM context window (except when using level 4) | Explicit through mutable variables used all throughout the script |
| Use case examples | Knowledge search, shopping, creative writing | Authentication, payments, compliance, diagnostics |
| Build effort | Low (mainly prompting) | Medium/high (scripting/logic) |
| Risk tolerance | Medium | Low (zero-trust) |
Final recommendation: start with levels 1–5 for speed and discovery and monitor your logs. Where you see the agent struggling with consistency, failing to follow a sequence, or hallucinating parameters, selectively harden that specific workflow with level 6.
Conclusion
Agent Script is the final piece of the puzzle in bringing determinism to agents. It acknowledges that while AI is probabilistic, business is deterministic. By adopting Agent Script (whether through the canvas that supports agent building in natural language, or in direct code), you aren’t limiting the intelligence of your agent; you’re focusing it. You're creating a system where the art of conversation meets the science of process execution, ensuring that your most critical workflows run exactly as designed, every single time.
Level 6 is also the realization that autonomous does not mean uncontrolled.
For years, the industry has debated rules vs. AI when it comes to decision making and process optimization. The camp of strict rules argued for predictability. The AI camp argued for flexibility. Agent Script ends this debate by proving that the correct architecture is not "or" but "and".
By adopting Agent Script, you’re building hybrid agents: systems that possess the rigid backbone of code and the flexible mind of an LLM. The future of enterprise AI is not about bigger models. It's about better control. With Agent Script, that control is finally in your hands.
AI Determinism FAQs
What are the six levels of determinism in AI?
The six levels of determinism in AI are: instruction-free topic and action selection; agent instructions; data grounding; agent variables; deterministic actions using flows, Apex, and APIs; and agentic control with Agent Script.
Why is understanding AI determinism important?
Understanding AI determinism is crucial for building reliable agents that can perform critical business functions accurately and consistently, striking a balance between creative fluidity and enterprise control.
What does "deterministic" mean in AI?
In AI, "deterministic" refers to the ability of a system to produce the same output given the same input and conditions, imposing a rigidity and discipline essential for reliable agent behavior.
Why are AI systems non-deterministic?
Non-determinism in AI systems arises primarily from the use of large language models (LLMs), which are non-deterministic by nature, allowing agents to be flexible and adaptive.
How do the levels of determinism affect agent autonomy?
The levels of determinism progressively enhance the determinism of AI agents, thereby affecting their autonomy: as the levels progress, agents become less autonomous but more reliable and aligned with business processes.
What challenges do less deterministic AI systems present?
Less deterministic AI systems present challenges in terms of reliability and compliance with business requirements, as their inherent non-determinism can lead to unpredictable behavior.
How do businesses manage AI systems with varying levels of determinism?
Businesses manage AI systems with varying levels of determinism by applying a layered approach that includes thoughtful design, clear instructions, data grounding, state management through variables, and deterministic process automation using flows, Apex, and APIs.