Agentforce and RAG: Best Practices for Better Agents

A best practices guide for powering Agentforce with unstructured data and long free-text fields

 SELECT v.Hybrid_score__c AS Score, c.Chunk__c AS Chunk, c.SourceRecordId__c AS SourceRecordId, c.DataSource__c AS DataSource, c.DataSourceObject__c AS DataSourceObject
 FROM hybrid_search(TABLE(KA_Agentforce_Default_Library_index__dlm), '{!$_SEARCH_STRING}', 'Language__c=''{!$_LANGUAGE}'' AND KnowledgePublicationStatus__c=''Online'' AND DataSource__c in (''FAQ_Internal_Comments_c__c'',''AssignmentNote__c'')', 30) v
 INNER JOIN KA_Agentforce_Default_Library_chunk__dlm c ON c.RecordId__c = v.RecordId__c
 INNER JOIN ssot__KnowledgeArticleVersion__dlm kav ON c.SourceRecordId__c = kav.ssot__Id__c
 ORDER BY Score DESC
 LIMIT 10

PLAIN, QUESTION, and METADATA chunks

Chunk Type: PLAIN
Description: Contains the original chunk text: raw content taken directly from the original document.

Chunk Type: QUESTION
Description: Contains a set of LLM-generated questions that the associated plain chunk answers. All generated questions are concatenated into a single chunk before vectorization. This minimizes the possible semantic mismatch between the user intent from the conversation (phrased as a question) and the content stored in the plain chunks (phrased as answers). Question chunks improve retrieval recall and precision, especially in Q&A-related agent scenarios. Although the vectors belonging to the question chunks are what gets retrieved, the prompt is automatically augmented with the corresponding plain chunks; the questions themselves are never added to the prompt.

Chunk Type: METADATA
Description: Contains a set of LLM-generated metadata based on the plain chunk. The following metadata is generated during the indexing process:
- Keywords (up to 10)
- Entities (key entities that occur in the chunk content)
- Topics (up to five main topics)
- Sentiment (positive, negative, or neutral, as expressed in the chunk)
- Title (a concise, informative title)
- Summary (a brief summary, typically 100–250 words)
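
To make the three chunk types concrete, consider a hypothetical knowledge article passage (the text, questions, and metadata below are invented purely for illustration; actual output depends on the indexing LLM):

Plain chunk: "To reset your password, open Settings > Security and select Reset Password. The emailed reset link expires after 24 hours."
Question chunk (all generated questions concatenated): "How do I reset my password? Where do I find the reset option? How long is the password reset link valid?"
Metadata chunk: keywords such as "password, reset, security, link", the topic "account security", neutral sentiment, the title "Resetting Your Password", and a one-sentence summary of the plain chunk.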

 
Please answer this question:
{!$Input:question}

using this information:
{!$EinsteinSearch:ArticleRetriever_1Cx_Q8Qa1857028.results}
 

###
INSTRUCTIONS

 1. Analyze the query: Carefully read and understand the user’s question or issue from the QUESTION section.
 2. Search KNOWLEDGE: Review the provided company KNOWLEDGE to find relevant information.
 3. Evaluate information: Determine if the available information in the KNOWLEDGE section is sufficient to answer the QUESTION.
 4. Formulate response: To generate a reply <generated_response> to the user, you must follow these rules:
 a. Find the article chunk(s) most relevant to answering the user query and extract VERBATIM the ID of the article to set the <source_id> field in the response JSON. If you are unable to find a relevant article, set <source_id> to NONE.
 b. Use the relevant article chunk(s) to generate a response that exactly answers the user’s question and set the <generated_response> field.
 c. If the user request cannot be answered from the knowledge provided, set <source_id> to NONE and <generated_response> to “Sorry, I can't find an answer based on the available articles.”
 5. Refine and deliver: Ensure your response is polite, professional, concise and in {language} only.
 6. Review response: Make sure that you have followed all of the above instructions, respond in the desired output format, and strictly stick to the provided KNOWLEDGE only to formulate your answer.

 ###
KNOWLEDGE:
{!$EinsteinSearch:sfdc_ai__DynamicRetriever.results}

 ###
QUESTION:
{!$Input:Query}
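
Following these instructions, the model should return a small JSON object. An illustrative shape (the field names come from the prompt above; the values are placeholders):

{
  "source_id": "<Id of the most relevant article, or NONE>",
  "generated_response": "<answer grounded strictly in the provided KNOWLEDGE>"
}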

 
Clearly answer the user’s Query directly and logically, based only on well-reasoned deductions drawn from the Context below. 
Then respond to the user’s Query logically, methodically, thoughtfully, and thoroughly from multiple perspectives, emphasizing different viewpoints based on the Context, with details and careful reasoning. 
Provide details with organized structure in your response. Consider alternative perspectives or approaches that could challenge your current line of reasoning. 
If you don’t know how to answer the query, or if there is not sufficient context, please respond with ‘Sorry, I couldn't find sufficient information to answer your question.’
Evaluate the evidence or data supporting your reasoning, and identify any gaps or inconsistencies. 
Finally, ask questions to clarify the user’s intent while encouraging critical thinking and self-discovery about the user's Query. 
Clearly articulate, with details, what is fact versus what is opinion or belief. 
If you don't know the answer, ask questions to clarify the user’s intent. 
Pay attention to the entities mentioned in the user’s Query and make sure the context contains information about those entities. 

Context:
{!$EinsteinSearch:ArticleRetriever_1Cx_Q8Qa1857028.results}

Query:
{!$Input:question}

Format instructions: 
Format your response with Markdown structures as follows: 
Start with an overview of the topic.  
List the key points in a list and emphasize any critical terms using bold. 
For subsequent sections, create headings and subheadings that incorporate the subqueries implicitly. 
If there are any steps or sequential data, present them in an ordered list. 
End with a conclusion.
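
Applied to a concrete answer, these format instructions yield a response shaped roughly like this (headings and content are placeholders):

## Overview
One or two sentences framing the topic.

**Key points**
- **Critical term 1**: short explanation
- **Critical term 2**: short explanation

## <Heading derived from a subquery>
1. First step
2. Second step

## Conclusion
A brief wrap-up of the answer.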




public static List<Response> searchSimilarCases(List<Request> requests) {
    List<Response> responses = new List<Response>();
    Response response = new Response();

    String caseDescription = requests[0].RelatedEntity.Description;

    // Vector search over the case chunk index, joined back to the Case DMO
    ConnectApi.CdpQueryInput input = new ConnectApi.CdpQueryInput();
    input.sql = 'SELECT DISTINCT v.score__c Score__c, c.ssot__Id__c Id__c, c.ssot__Subject__c Subject__c ' +
        'FROM vector_search(\'case_chunk_vector__dlm\', \'' + caseDescription + '\', \'\', 200) v ' +
        'JOIN Case_Chunks__dlm cc ON v.chunk_id__c = cc.chunkid__c ' +
        'JOIN ssot__Case__dlm c ON cc.parentid__c = c.ssot__Id__c ' +
        'WHERE cc.column__c != \'ssot__Subject__c\' AND c.ssot__DataSourceId__c = \'CRM\' ' +
        'LIMIT 10';

    ConnectApi.CdpQueryOutput output = ConnectApi.CdpQuery.queryANSISql(input);

    List<Object> data = output.data;
    String scs = '';
    for (Object searchRecord : data) {
        Map<String, Object> myMap = (Map<String, Object>) JSON.deserializeUntyped(JSON.serialize(searchRecord));
        // Check that the current user has access to the case record
        if (SimilarCasesSearch.getUserRecordAccess((String) myMap.get('Id__c'))) {
            Map<String, String> sc = new Map<String, String>();
            sc.put('Id', (String) myMap.get('Id__c'));
            sc.put('Similar_Case__c', (String) myMap.get('Id__c'));
            sc.put('Name', (String) myMap.get('Subject__c'));
            sc.put('Score__c', String.valueOf(myMap.get('Score__c')));
            scs = scs + JSON.serialize(sc);
        }
    }
    response.Prompt = scs;
    responses.add(response);
    return responses;
}

Standard Flow actions

Flow Action: Detect Language
Description: Detects the language of a query, which can be passed as a filter value to a retriever node for dynamic filtering (by language).

Flow Action: Transform Query for {Case/Email/Conversation}
Description: Each of these three nodes invokes an LLM transformation that turns a case, email, or conversation into a query that is optimized for retrieval, improving the query that the retriever passes on to the search index. For example, the conversation-to-query action avoids querying the search index with non-relevant messages such as “How can I help you?” or “How are you today?”. Similarly, the case-to-query and email-to-query actions extract the relevant information from the text and drop greetings and other content that shouldn’t be used for search.
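
For example, the conversation-to-query transformation might turn a raw transcript into a focused search query (hypothetical exchange, shown only to illustrate the idea):

Conversation: “Hi!” / “How can I help you today?” / “My order arrived yesterday, but the charger is missing from the box.”
Transformed query: “order delivered with charger missing from box”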

global with sharing class RetrieverProcessor {

    @InvocableMethod
    public static List<String> GetWebProduct(List<Requests> queryResults) {
        List<String> resultsList = new List<String>();
        for (Requests queryResult : queryResults) {
            List<String> segments = new List<String>();
            // Collect the text of every 'Chunk' field returned by the retriever
            for (ConnectApi.MlRetrieverQueryResultDocumentRepresentation document : queryResult.queryResult.searchResults) {
                for (ConnectApi.MlRetrieverQueryResultDocumentContentRepresentation content : document.result) {
                    if (content.fieldName.equals('Chunk')) {
                        segments.add(content.value.toString());
                    }
                }
            }
            if (segments.size() == 0) {
                resultsList.add('No results');
            } else {
                resultsList.add(String.join(segments, ','));
            }
        }
        return resultsList;
    }

    global class Requests {
        @InvocableVariable
        global ConnectApi.MlRetrieverQueryResultRepresentation queryResult;
    }
}
  
SELECT 'INDEX' AS Location, COUNT(DISTINCT rc.SourceRecordId__c) AS ArticleCount, now() AS Timestamp
FROM <chunk DMO of the Search Index> rc
UNION
SELECT 'DMO' AS Location, COUNT(DISTINCT kav.Id__c) AS ArticleCount, now() AS Timestamp
FROM <DMO that was indexed, e.g. Knowledge Article Version> kav
ORDER BY Location;
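
For example, for the knowledge article index used in the hybrid_search query earlier, this check could be written as follows (the DMO and field names are taken from that example; adjust them to your own search index):

SELECT 'INDEX' AS Location, COUNT(DISTINCT rc.SourceRecordId__c) AS ArticleCount, now() AS Timestamp
FROM KA_Agentforce_Default_Library_chunk__dlm rc
UNION
SELECT 'DMO' AS Location, COUNT(DISTINCT kav.ssot__Id__c) AS ArticleCount, now() AS Timestamp
FROM ssot__KnowledgeArticleVersion__dlm kav
ORDER BY Location;

If the two article counts diverge noticeably, the index is likely stale or some records were excluded during indexing.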
  

Metrics

Metric: Context Relevance
Answers: How relevant is the retrieved content to the query?
Definition: LLM-based evaluation
What does it help with? Isolate retrieval problems

Metric: Faithfulness
Answers: How grounded is the response in the retrieved content?
Definition: LLM-based evaluation
What does it help with? Isolate LLM generation problems

Metric: Answer Relevance
Answers: How relevant is the answer to the query?
Definition: LLM-based evaluation
What does it help with? Overall response metric of the answer. Especially useful in combination with context relevance and faithfulness.
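
To illustrate how an LLM-based evaluation of this kind can be phrased, a faithfulness judge prompt might look like the following (the wording is purely illustrative, not a Salesforce-provided template):

You are an evaluator. Given the retrieved CONTEXT and the generated ANSWER, rate on a scale from 1 (not grounded) to 5 (fully grounded) how well every claim in the ANSWER is supported by the CONTEXT, and list any unsupported claims.
CONTEXT: {context}
ANSWER: {answer}
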
// 'topic' holds the text to classify against the intent training index (defined elsewhere)
ConnectApi.CdpQueryInput input = new ConnectApi.CdpQueryInput();
input.sql = 'SELECT r.Label_c__c Label, COUNT(r.Label_c__c) AS counter ' +
    'FROM vector_search(table(Intent_Training_index__dlm), \'' + topic + '\', \'\', 50) v ' +
    'JOIN Intent_Training_chunk__dlm c ON v.RecordId__c = c.RecordId__c ' +
    'JOIN Intent_Training__dlm r ON r.Id__c = c.SourceRecordId__c ' +
    'GROUP BY r.Label_c__c ORDER BY counter DESC LIMIT 1';

ConnectApi.CdpQueryOutput output = ConnectApi.CdpQuery.queryANSISql(input);