{"id":11153,"date":"2025-05-01T05:00:00","date_gmt":"2025-05-01T12:00:00","guid":{"rendered":"https:\/\/www.salesforce.com\/?p=11153"},"modified":"2025-05-01T15:00:45","modified_gmt":"2025-05-01T14:00:45","slug":"ai-research-agentic-advancements","status":"publish","type":"post","link":"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/","title":{"rendered":"Salesforce AI Research Delivers New Benchmarks, Guardrails, and Models to Make Future Agents More Intelligent, Trusted, and Versatile"},"content":{"rendered":"\n<p>One of today\u2019s most pressing AI challenges is the gap between a <a href=\"https:\/\/www.salesforce.com\/artificial-intelligence\/what-are-large-language-models\/\" target=\"_blank\" rel=\"noreferrer noopener\">Large Language Model\u2019s (LLM\u2019s)<\/a> raw intelligence and how that intelligence translates into consistent, real-world performance when powering autonomous <a href=\"https:\/\/www.salesforce.com\/in\/agentforce\/what-are-ai-agents\/\" target=\"_blank\" rel=\"noreferrer noopener\">AI agents<\/a>. This challenge is known as <a href=\"https:\/\/www.salesforce.com\/blog\/jagged-intelligence\/\" target=\"_blank\" rel=\"noreferrer noopener\">jagged intelligence<\/a>. While LLMs may excel at things like writing polished essays or poems and translating languages with impressive fluency, their brilliance often stumbles when it comes to reliably executing tasks amid the messy realities of enterprise environments.<\/p>\n\n\n\n<p>This inconsistency can lead to misaligned actions, flawed judgments, and deviations from critical business logic and established guidelines. 
Within high-stakes enterprise settings, this lack of predictability is a major liability; a single misstep can disrupt operations, erode customer trust, and inflict substantial financial or reputational damage.<\/p>\n\n\n\n<p>Salesforce customers demand trustworthy AI that delivers reliable performance at scale, adapting intelligently to complex scenarios and evolving business needs. They expect not just functionality from AI but dependable, enterprise-grade consistency. For businesses, AI isn&#8217;t a casual pastime. It&#8217;s a mission-critical tool that requires unwavering predictability.<\/p>\n\n\n\n<p>To address the critical challenges of jagged intelligence, where AI agents can perform brilliantly in some areas while failing unpredictably in others, <a href=\"https:\/\/www.salesforceairesearch.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">Salesforce AI Research<\/a> operates with three core pillars in mind:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Foundational research:<\/strong> Begin by pinpointing key industry challenges and driving research that directly addresses them. This includes creating novel benchmarks to measure and reduce jaggedness, building models with deeper contextual understanding, and publishing cutting-edge research to move the field forward.<\/li>\n\n\n\n<li><strong>Customer incubation:<\/strong> Then, pilot prototypes with customers in real-world simulation environments. By co-innovating directly with users, refine AI agents through continuous learning, real-world feedback, and stress-testing in complex workflows.<\/li>\n\n\n\n<li><strong>Product innovation:<\/strong> Finally, prove value and readiness by transforming prototypes and research pilots into enterprise-grade solutions.
This fuels innovation across Salesforce products and technologies, including Agentforce, the agentic layer of the Salesforce Platform, its Atlas Reasoning Engine, enhanced <a href=\"https:\/\/www.salesforce.com\/in\/agentforce\/what-is-rag\/\" target=\"_blank\" rel=\"noreferrer noopener\">Retrieval-Augmented Generation (RAG)<\/a> capabilities, and the Salesforce Trust Layer.<\/li>\n<\/ul>\n\n\n\n<p>Through this deliberate cycle of investigating, piloting, and proving, Salesforce AI Research is making AI agents more intelligent, trustworthy, versatile, and enterprise-ready. By refining models and continuously iterating with real-world customer feedback, Salesforce AI Research is making it possible to create agents that meet the demanding needs of enterprise environments, ensuring they can seamlessly integrate into workflows, adapt to deliver on complex tasks, and perform with greater reliability.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-dive-deeper-nbsp\"><strong>Dive deeper:<\/strong><\/h2>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"h-building-intelligent-agents-with-enhanced-reasoning-and-rag\"><strong>Building intelligent agents with enhanced reasoning and RAG<\/strong><\/h4>\n\n\n\n<p>Salesforce AI Research is focused on advancing <a href=\"https:\/\/www.salesforce.com\/agentforce\/intelligent-agents\/\" target=\"_blank\" rel=\"noreferrer noopener\">intelligent agents<\/a> with stronger reasoning and RAG capabilities, enabling them to access, synthesize, and apply information in real time \u2014 helping to reduce jaggedness and drive more consistent, contextually aware decisions across complex tasks.<\/p>\n\n\n\n<p><strong>Bringing intelligent agents to life:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>A public benchmark to quantify AI jaggedness:<\/strong> While AI struggles with consistent reasoning, Salesforce\u2019s public <a href=\"https:\/\/huggingface.co\/datasets\/Salesforce\/SIMPLE\" target=\"_blank\" rel=\"noreferrer noopener\">SIMPLE<\/a> dataset offers a clear benchmark to help.
Featuring 225 straightforward reasoning questions that are easy for humans but challenging for AI, SIMPLE helps quantify LLM jaggedness by tracking performance gaps to help guide the development of more reliable AI for enterprise applications.<\/li>\n\n\n\n<li><strong>Enhanced embedding model capabilities:<\/strong> As AI processes more unstructured data, understanding context is key. Salesforce AI Research is advancing text-embedding models like <a href=\"https:\/\/www.salesforce.com\/blog\/sfr-top-performing-text-embedding-model\/\" target=\"_blank\" rel=\"noreferrer noopener\">SFR-Embedding<\/a>, which convert text to meaningful structured data for better AI information retrieval. SFR-Embedding leads on the <a href=\"https:\/\/huggingface.co\/spaces\/mteb\/leaderboard\" target=\"_blank\" rel=\"noreferrer noopener\">MTEB benchmark<\/a> across 56 datasets, excelling in retrieval and clustering. Available soon in Salesforce <a href=\"https:\/\/www.salesforce.com\/data\/\" target=\"_blank\" rel=\"noreferrer noopener\">Data Cloud<\/a>, Salesforce&#8217;s hyperscale data engine, it will enhance RAG for more accurate AI responses, setting a new standard for reliable enterprise AI.<\/li>\n\n\n\n<li><strong>Specialized code embedding models for developers:<\/strong> Developers need efficient, accurate, and scalable AI for code retrieval and generation. To account for those needs, Salesforce AI Research launched <a href=\"https:\/\/www.salesforce.com\/blog\/sfr-embedding-code\/\" target=\"_blank\" rel=\"noreferrer noopener\">SFR-Embedding-Code<\/a>, a specialized code embedding model family based on SFR-Embedding. Mapping code and text to a shared space, it enables high-quality code search, streamlining development. 
The 7B model leads the <a href=\"https:\/\/archersama.github.io\/coir\/\" target=\"_blank\" rel=\"noreferrer noopener\">CoIR benchmark<\/a>, while the smaller models (400M, 2B) offer efficient, cost-effective alternatives without a significant loss in performance.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"h-strengthening-customer-trust-with-benchmarking-testing-and-guardrails\"><strong>Strengthening customer trust with benchmarking, testing, and guardrails<\/strong><\/h4>\n\n\n\n<p>To strengthen customer trust and tackle jaggedness head-on, Salesforce AI Research is applying rigorous benchmarking, continuous testing, and robust guardrails. By systematically evaluating agent behavior against real-world conditions and setting clear boundaries for performance and safety, Salesforce AI Research is ensuring agents behave consistently, predictably, and reliably in enterprise environments.<\/p>\n\n\n\n<p><strong>Engineering trust into every agent:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>A new framework designed to test and evaluate AI agents:<\/strong> Evaluating enterprise AI agents&#8217; ability to perform business-level tasks is a critical priority and a persistent challenge for CIOs and IT leaders. To directly address this, Salesforce AI Research introduced <a href=\"https:\/\/arxiv.org\/abs\/2411.02305\" target=\"_blank\" rel=\"noreferrer noopener\">CRMArena<\/a>: a novel benchmarking framework meticulously designed to simulate realistic, professionally grounded CRM scenarios.
This focused approach enables comprehensive testing and targeted improvement of AI agent performance, supporting agent safety and reliability and building enterprise trust.<\/li>\n\n\n\n<li><strong>New agent guardrail features enhance trust and security:<\/strong> Agentforce&#8217;s guardrails establish clear boundaries for agent behavior based on business needs, policies, and standards, ensuring agents act within predefined limits. Salesforce\u2019s Trust Layer provides an extra layer of protection for enterprise agents. Salesforce AI Research is continuously developing new models and frameworks to enhance Trust Layer tools such as toxicity detection and instruction adherence, and to better defend against prompt injection attacks. As part of that, Salesforce AI Research recently introduced <a href=\"https:\/\/www.salesforce.com\/blog\/sfr-guard-ensuring-llm-safety-and-integrity-in-crm-applications\/\" target=\"_blank\" rel=\"noreferrer noopener\">SFR-Guard<\/a>, a family of models trained on both publicly available data and CRM-specialized internal data, designed to further strengthen the trust and reliability of AI agents in business operations.<\/li>\n\n\n\n<li><strong>A new benchmark for assessing models in contextual settings:<\/strong> Ensuring AI generates accurate, contextual answers is crucial for enterprise trust, but traditional benchmarks often fall short. Because of that, Salesforce AI Research launched <a href=\"https:\/\/www.salesforce.com\/blog\/contextualjudgebench\/\" target=\"_blank\" rel=\"noreferrer noopener\">ContextualJudgeBench<\/a>, a novel benchmark evaluating LLM-based judge models in context.
Testing more than 2,000 challenging response pairs, it assesses accuracy, conciseness, faithfulness, and appropriate refusal to answer, all vital requirements for real-world enterprise AI.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"h-enhancing-model-versatility-by-moving-beyond-solely-relying-on-llms\"><strong>Enhancing model versatility by moving beyond solely relying on LLMs<\/strong><\/h4>\n\n\n\n<p>By introducing specialized models, structured knowledge sources, and retrieval-augmented techniques, Salesforce AI Research is building agents that reason more reliably, adapt to the unique demands of enterprise workflows, and deliver more consistent, versatile performance, helping businesses operate with greater efficiency, trust, and agility.<\/p>\n\n\n\n<p><strong>Driving the future of versatile enterprise agents:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>A major upgrade to action model capabilities:<\/strong> As AI models become <a href=\"https:\/\/www.salesforce.com\/news\/stories\/unified-platform-unlocks-ai\/\" target=\"_blank\" rel=\"noreferrer noopener\">increasingly commoditized<\/a>, there\u2019s a growing need for smaller, more efficient alternatives that execute tasks at lower cost and with fewer resources. As such, Salesforce AI Research upgraded its xLAM (<a href=\"https:\/\/www.salesforce.com\/agentforce\/large-action-models\/\" target=\"_blank\" rel=\"noreferrer noopener\">Large Action Model<\/a>) family with multi-turn conversation support and a wider range of smaller models for increased accessibility. Unlike models that just predict words, <a href=\"https:\/\/www.salesforce.com\/blog\/xlam-large-action-models\/\" target=\"_blank\" rel=\"noreferrer noopener\">Salesforce\u2019s xLAM family<\/a> predicts actions for faster real-world task execution, crucial for tool use and function calling.
While some models exceed 200B parameters, xLAM starts at 1B, offering a lightweight, integrable footprint with robust planning, reasoning, and function execution, outperforming even GPT-4o and GPT-4.5 previews on <a href=\"https:\/\/gorilla.cs.berkeley.edu\/leaderboard.html\" target=\"_blank\" rel=\"noreferrer noopener\">key benchmarks<\/a> for enterprise agents.<\/li>\n\n\n\n<li><strong>A multimodal action model family for multi-step problem-solving:<\/strong> Current multimodal models struggle with complex, multi-step problems, often lacking clear reasoning capabilities. To address that gap, Salesforce <a href=\"https:\/\/www.salesforce.com\/blog\/taco-multimodal-action-models\/\" target=\"_blank\" rel=\"noreferrer noopener\">launched TACO<\/a>, a multimodal action model family that tackles these tasks by generating chains of thought-and-action (CoTA). TACO breaks tasks down into simple steps while integrating real-time action, which improves the AI&#8217;s ability to interpret and respond to intricate queries. 
Salesforce testing showed this achieved average gains of up to 4% across eight benchmarks and up to 20% on the challenging <a href=\"https:\/\/arxiv.org\/pdf\/2412.05479\" target=\"_blank\" rel=\"noreferrer noopener\">MMVet benchmark<\/a>.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"h-shaping-the-future-of-enterprise-ai-with-intelligent-trusted-and-versatile-ai-agents\"><strong>Shaping the future of enterprise AI with intelligent, trusted, and versatile AI agents<\/strong><\/h4>\n\n\n\n<p>Salesforce AI Research continues to lay the groundwork for the next generation of enterprise AI agents, driven by intelligence, trust, and versatility, to help
businesses work smarter and serve customers more effectively.<\/p>\n\n\n\n<p>These innovations span everything from benchmarking jaggedness and advancing reasoning to embedding trust and building action-driven agents. Each research paper and new model release contributes to Salesforce\u2019s broader mission of moving beyond prototypes to deliver AI systems that reliably perform in complex business environments.<\/p>\n\n\n\n<p>As enterprise needs evolve, Salesforce remains committed to what matters most to customers: translating cutting-edge research into trusted products that deliver real results at scale.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"h-more-information\"><strong>More information:<\/strong><\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Read a byline about <a href=\"https:\/\/www.salesforce.com\/news\/stories\/enterprise-general-intelligence-testing\/\" target=\"_blank\" rel=\"noreferrer noopener\">Enterprise General Intelligence (EGI) from Salesforce\u2019s Head of AI Research<\/a><\/li>\n\n\n\n<li>Learn more about <a href=\"https:\/\/www.salesforceairesearch.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">Salesforce AI Research<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>One of today\u2019s most pressing AI challenges is the gap between a Large Language Model\u2019s (LLM\u2019s) raw intelligence and how that intelligence translates into consistent, real-world performance when powering autonomous AI agents. This challenge is known as jagged intelligence.
While LLMs may excel at things like writing polished essays or poems and translating languages with [&hellip;]<\/p>\n","protected":false},"author":0,"featured_media":11152,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"sf_subhead":"","sf_i18n_disclaimer":false,"_jetpack_memberships_contains_paid_content":false,"alternateThumbnailId":0,"sf_product_cta_id":0,"footnotes":""},"categories":[],"tags":[],"sf_content_type":[1099],"sf_theme":[1114],"sf_topic":[2802,1092,1093,1197],"sf_product":[],"sf_industry":[],"sf_role":[],"sf_multimedia_asset":[],"sf_location":[1094],"sf_collection":[],"sf_visibility":[],"coauthors":[],"class_list":["post-11153","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","sf_content_type-snapshots","sf_theme-artificial-intelligence","sf_topic-agents","sf_topic-artificial-intelligence","sf_topic-digital-transformation","sf_topic-trust","sf_location-global"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.2 (Yoast SEO v27.2) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Salesforce AI Research Details Agentic Advancements - Salesforce<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Salesforce AI Research Delivers New Benchmarks, Guardrails, and Models to Make Future Agents More Intelligent, Trusted, and Versatile\" \/>\n<meta property=\"og:description\" content=\"One of today\u2019s most pressing AI challenges is the gap between a Large Language Model\u2019s (LLM\u2019s) raw intelligence and how that intelligence translates into consistent, real-world performance when powering autonomous AI agents. 
This challenge is known as jagged intelligence. While LLMs may excel at things like writing polished essays or poems and translating languages with [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/\" \/>\n<meta property=\"og:site_name\" content=\"Salesforce\" \/>\n<meta property=\"article:published_time\" content=\"2025-05-01T12:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-05-01T14:00:45+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.salesforce.com\/in\/news\/wp-content\/uploads\/sites\/20\/2025\/05\/Salesforce-AI-Research-Unveils-New-Benchmarks-Guardrails-and-Models-to-Make-Future-Agents-More-Intelligent-Trusted-and-Versatile.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"675\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"7 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"NewsArticle\",\"@id\":\"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/\"},\"author\":{\"name\":\"\",\"@id\":\"\"},\"headline\":\"Salesforce AI Research Delivers New Benchmarks, Guardrails, and Models to Make Future Agents More Intelligent, Trusted, and Versatile\",\"datePublished\":\"2025-05-01T12:00:00+00:00\",\"dateModified\":\"2025-05-01T14:00:45+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/\"},\"wordCount\":1400,\"image\":{\"@id\":\"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.salesforce.com\/in\/news\/wp-content\/uploads\/sites\/20\/2025\/05\/Salesforce-AI-Research-Unveils-New-Benchmarks-Guardrails-and-Models-to-Make-Future-Agents-More-Intelligent-Trusted-and-Versatile.png\",\"inLanguage\":\"en-IN\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/\",\"url\":\"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/\",\"name\":\"Salesforce AI Research Details Agentic Advancements - 
Salesforce\",\"isPartOf\":{\"@id\":\"https:\/\/www.salesforce.com\/in\/news\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.salesforce.com\/in\/news\/wp-content\/uploads\/sites\/20\/2025\/05\/Salesforce-AI-Research-Unveils-New-Benchmarks-Guardrails-and-Models-to-Make-Future-Agents-More-Intelligent-Trusted-and-Versatile.png\",\"datePublished\":\"2025-05-01T12:00:00+00:00\",\"dateModified\":\"2025-05-01T14:00:45+00:00\",\"author\":{\"@id\":\"\"},\"breadcrumb\":{\"@id\":\"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/#breadcrumb\"},\"inLanguage\":\"en-IN\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-IN\",\"@id\":\"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/#primaryimage\",\"url\":\"https:\/\/www.salesforce.com\/in\/news\/wp-content\/uploads\/sites\/20\/2025\/05\/Salesforce-AI-Research-Unveils-New-Benchmarks-Guardrails-and-Models-to-Make-Future-Agents-More-Intelligent-Trusted-and-Versatile.png\",\"contentUrl\":\"https:\/\/www.salesforce.com\/in\/news\/wp-content\/uploads\/sites\/20\/2025\/05\/Salesforce-AI-Research-Unveils-New-Benchmarks-Guardrails-and-Models-to-Make-Future-Agents-More-Intelligent-Trusted-and-Versatile.png\",\"width\":1200,\"height\":675,\"caption\":\"Man sits at 
laptop.\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.salesforce.com\/in\/news\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Salesforce AI Research Delivers New Benchmarks, Guardrails, and Models to Make Future Agents More Intelligent, Trusted, and Versatile\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.salesforce.com\/in\/news\/#website\",\"url\":\"https:\/\/www.salesforce.com\/in\/news\/\",\"name\":\"Salesforce\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.salesforce.com\/in\/news\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-IN\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Salesforce AI Research Details Agentic Advancements - Salesforce","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/","og_type":"article","og_title":"Salesforce AI Research Delivers New Benchmarks, Guardrails, and Models to Make Future Agents More Intelligent, Trusted, and Versatile","og_description":"One of today\u2019s most pressing AI challenges is the gap between a Large Language Model\u2019s (LLM\u2019s) raw intelligence and how that intelligence translates into consistent, real-world performance when powering autonomous AI agents. This challenge is known as jagged intelligence. 
While LLMs may excel at things like writing polished essays or poems and translating languages with [&hellip;]","og_url":"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/","og_site_name":"Salesforce","article_published_time":"2025-05-01T12:00:00+00:00","article_modified_time":"2025-05-01T14:00:45+00:00","og_image":[{"width":1200,"height":675,"url":"https:\/\/www.salesforce.com\/in\/news\/wp-content\/uploads\/sites\/20\/2025\/05\/Salesforce-AI-Research-Unveils-New-Benchmarks-Guardrails-and-Models-to-Make-Future-Agents-More-Intelligent-Trusted-and-Versatile.png","type":"image\/png"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/#article","isPartOf":{"@id":"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/"},"author":{"name":"","@id":""},"headline":"Salesforce AI Research Delivers New Benchmarks, Guardrails, and Models to Make Future Agents More Intelligent, Trusted, and Versatile","datePublished":"2025-05-01T12:00:00+00:00","dateModified":"2025-05-01T14:00:45+00:00","mainEntityOfPage":{"@id":"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/"},"wordCount":1400,"image":{"@id":"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/#primaryimage"},"thumbnailUrl":"https:\/\/www.salesforce.com\/in\/news\/wp-content\/uploads\/sites\/20\/2025\/05\/Salesforce-AI-Research-Unveils-New-Benchmarks-Guardrails-and-Models-to-Make-Future-Agents-More-Intelligent-Trusted-and-Versatile.png","inLanguage":"en-IN"},{"@type":"WebPage","@id":"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/","url":"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/","name":"Salesforce AI Research Details 
Agentic Advancements - Salesforce","isPartOf":{"@id":"https:\/\/www.salesforce.com\/in\/news\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/#primaryimage"},"image":{"@id":"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/#primaryimage"},"thumbnailUrl":"https:\/\/www.salesforce.com\/in\/news\/wp-content\/uploads\/sites\/20\/2025\/05\/Salesforce-AI-Research-Unveils-New-Benchmarks-Guardrails-and-Models-to-Make-Future-Agents-More-Intelligent-Trusted-and-Versatile.png","datePublished":"2025-05-01T12:00:00+00:00","dateModified":"2025-05-01T14:00:45+00:00","author":{"@id":""},"breadcrumb":{"@id":"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/#breadcrumb"},"inLanguage":"en-IN","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/"]}]},{"@type":"ImageObject","inLanguage":"en-IN","@id":"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/#primaryimage","url":"https:\/\/www.salesforce.com\/in\/news\/wp-content\/uploads\/sites\/20\/2025\/05\/Salesforce-AI-Research-Unveils-New-Benchmarks-Guardrails-and-Models-to-Make-Future-Agents-More-Intelligent-Trusted-and-Versatile.png","contentUrl":"https:\/\/www.salesforce.com\/in\/news\/wp-content\/uploads\/sites\/20\/2025\/05\/Salesforce-AI-Research-Unveils-New-Benchmarks-Guardrails-and-Models-to-Make-Future-Agents-More-Intelligent-Trusted-and-Versatile.png","width":1200,"height":675,"caption":"Man sits at laptop."},{"@type":"BreadcrumbList","@id":"https:\/\/www.salesforce.com\/in\/news\/stories\/ai-research-agentic-advancements\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.salesforce.com\/in\/news\/"},{"@type":"ListItem","position":2,"name":"Salesforce AI Research Delivers New Benchmarks, Guardrails, and Models to Make 
Future Agents More Intelligent, Trusted, and Versatile"}]},{"@type":"WebSite","@id":"https:\/\/www.salesforce.com\/in\/news\/#website","url":"https:\/\/www.salesforce.com\/in\/news\/","name":"Salesforce","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.salesforce.com\/in\/news\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-IN"}]}},"jetpack_featured_media_url":"https:\/\/www.salesforce.com\/in\/news\/wp-content\/uploads\/sites\/20\/2025\/05\/Salesforce-AI-Research-Unveils-New-Benchmarks-Guardrails-and-Models-to-Make-Future-Agents-More-Intelligent-Trusted-and-Versatile.png","jetpack_sharing_enabled":true,"distributor_meta":false,"distributor_terms":false,"distributor_media":false,"distributor_original_site_name":"Salesforce","distributor_original_site_url":"https:\/\/www.salesforce.com\/in\/news","push-errors":false,"_links":{"self":[{"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/posts\/11153","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/comments?post=11153"}],"version-history":[{"count":2,"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/posts\/11153\/revisions"}],"predecessor-version":[{"id":11156,"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/posts\/11153\/revisions\/11156"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/media\/11152"}],"wp:attachment":[{"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/media?parent=11153"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.salesforce.co
m\/in\/news\/wp-json\/wp\/v2\/categories?post=11153"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/tags?post=11153"},{"taxonomy":"sf_content_type","embeddable":true,"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/sf_content_type?post=11153"},{"taxonomy":"sf_theme","embeddable":true,"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/sf_theme?post=11153"},{"taxonomy":"sf_topic","embeddable":true,"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/sf_topic?post=11153"},{"taxonomy":"sf_product","embeddable":true,"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/sf_product?post=11153"},{"taxonomy":"sf_industry","embeddable":true,"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/sf_industry?post=11153"},{"taxonomy":"sf_role","embeddable":true,"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/sf_role?post=11153"},{"taxonomy":"sf_multimedia_asset","embeddable":true,"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/sf_multimedia_asset?post=11153"},{"taxonomy":"sf_location","embeddable":true,"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/sf_location?post=11153"},{"taxonomy":"sf_collection","embeddable":true,"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/sf_collection?post=11153"},{"taxonomy":"sf_visibility","embeddable":true,"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/sf_visibility?post=11153"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.salesforce.com\/in\/news\/wp-json\/wp\/v2\/coauthors?post=11153"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}