How we build trusted AI

Trust in AI doesn’t come from a single feature or checkpoint. It’s built through coordinated practices across design, development, testing, and governance. The Office of Ethical and Humane Use brings together Salesforce’s Responsible AI and Technology, Ethical Use Policy, and Product Accessibility teams to guide the development and deployment of AI systems.

Reviews that bring clarity

Before AI systems are deployed, they go through a structured review process that translates complex trust questions into clear product decisions.

Define the use case

Intake

We understand what the AI will do, who it’s for, and what data it touches.

Triage

We identify the highest-impact use cases and set the appropriate level of review and testing.

Review

We confirm guardrails, human oversight points, and acceptable behaviors before building deeper.

Testing

We run evaluations to find failure modes and confirm proposed mitigations actually work.

Implementation

Mitigations get built into the product experience — and tracked after release.

Testing that reflects real-world use

We test beyond expected scenarios to understand how systems behave under real-world conditions. The goal is to identify risks early and improve system performance before release.

Adversarial testing

Manual and semi-automated red teaming identify vulnerabilities such as inaccuracy, jailbreaks, and unsafe outputs.

Content safety testing

Evaluate how systems respond to harmful or sensitive content, including edge cases that require careful handling or escalation.

Accessibility testing

Evaluate experiences with automated and manual accessibility testing frameworks, including assistive technologies and people with disabilities, to validate usability and compliance with accessibility standards.

Employee trust testing

Global employees across the company and various business functions simulate real-world scenarios to evaluate how systems perform across varying use cases.

Large-scale stress testing

Hackathons and bug bounty programs help uncover hidden issues and edge cases.

Trust guardrails

Explore our platform-level guardrails across key risk areas.

Tabs

System policies

Core safety rules that remain in place even if a user tries to override them. Keeps the agent aligned with approved behaviors.

Subagent controls

Detects when a conversation moves outside the agent’s intended purpose. Prevents probing and redirects back to the allowed scope.

System prompts

The foundational instruction set that defines the agent’s role, behavioral constraints, and policy-aligned boundaries.

Prompt injection detection

Identifies and blocks malicious or hidden instructions in user input before they can influence the agent’s behavior.

Zero data retention

Prevents third-party model providers from storing or training on customer data. Adds contractual and technical barriers to reduce exposure.

Secure data retrieval

Grounds responses using permissioned retrieval. The agent can only access data that the current user is authorized to see.

Data access policies

Restricts specific datasets or fields from being used in AI workflows. Helps admins enforce privacy boundaries by design.

Supervisory models

Checks responses before they are shown to users to improve accuracy and reliability.

Retrieval-Augmented Generation (RAG)

Grounds responses in trusted enterprise data instead of relying only on model knowledge.

Inline citations

Provides links to the data sources used to generate a response so users can verify outputs.

Testing center

Batch-tests agent responses against known datasets to identify issues before deployment.

Context refinement

Focuses the model on the most relevant information when generating responses.

Input toxicity detection

Scans user inputs for harmful or unsafe language before processing.

Safety instructions

Guides the model away from generating unsafe or policy-violating content.

Output toxicity scoring and flagging

Evaluates responses for harmful content and flags them for review or intervention.

Model accuracy and robustness evaluation (1P models)

Tests for accuracy, consistency, and robustness for first-party AI models. Scope varies by product and model architecture.

Deterministic controls

Applies fixed, rule-based logic for sensitive actions instead of relying on probabilistic outputs.

Human handoff

Transfers interactions to a human when confidence is low or risk is high.

Pause-for-review gates

Introduces checkpoints before high-impact actions are completed.

Override controls

Allows users to pause, edit, or cancel agent actions at any time.

Agentforce observability and analytics

Provides visibility into how agents operate, including their decision-making and taking actions.

Audit trail

Logs inputs, actions, and outputs for transparency and accountability.

Instruction adherence

Evaluates whether responses follow defined rules and topic instructions, with results tracked for review.

Task resolution

Measures whether the user’s task was correctly understood and completed.

Trust is built not just through guardrails, but through design — making AI systems accessible, usable, and understandable from the start.

Trust patterns for safer AI interactions

Trust patterns are reusable design approaches for common AI risks. They help make enterprise AI systems safer, more understandable, and more accountable.

Disclosure of AI-generated content for both internal users and external audiences across all Salesforce AI use cases.

Provide visibility into how a response was generated, including citations, data sources, and relevant context, so users can review and verify outputs.

Define how the system responds when it cannot complete a task, including clear error messages, alternative suggestions, or safe fallback outputs.

Design interactions that encourage review before high-impact actions. Avoid dark patterns and ensure users have a clear opportunity to validate AI-generated content.

Operational safeguards for trusted AI

Design systems that stay aligned with instructions, support responsible human oversight, and reduce misleading or risky behavior across AI experiences.

Evaluate how well agents follow topic instructions, with scoring and explanations to identify when outputs deviate or need refinement.

Introduce human checkpoints for high-impact tasks and provide clear escalation paths. Help users understand when and how to involve a human.

Design voice systems to be clear and not misleading. Ensure performance across languages and accents, and avoid human-like sounds or expressive behaviors that could confuse users.

Design AI to avoid implying emotions, intent, or identity. Systems should not attempt to form emotional bonds or present themselves as human, and should clearly communicate when users are interacting with AI.

Accessibility by design

Trusted AI is accessible AI. At Salesforce, accessibility is built in, not bolted on. Itʼs a core part of how our AI products are designed, developed, and tested. We embed accessibility throughout the software development lifecycle (SDLC) with accountability practices that continuously improve experiences. Explore the various ways in which we embed accessibility by design.

Accessibility throughout the SDLC

3D pink headset icon with a cursor pointer.

Accessible design

Designing with accessibility from the start

Accessibility is built in from the earliest stages of design, incorporating best practices and input from people with disabilities.

3D illustration of three purple cubes arranged together representing modular building blocks.

Design system

Salesforce Lightning Design System (SLDS)

Accessibility is built into SLDS through guidance and reusable components that support consistent, accessible experiences.

3D blue circle icon with a code bracket symbol.

AI & Automation

AI-assisted development and testing

Automated checks identify accessibility issues early and help teams resolve them during development.

3D shield icon with a checkmark symbolizing trust and protection.

Governance and accountability

Continuous monitoring and improvement

Accessibility is validated through conformance reports, audits, and customer-reported issues, measured against WCAG 2.2 AA, with findings resolved and fed back into development.

Transparency we share and trust signals we measure

Model transparency and evaluation

Model cards

Public summaries for Salesforce-owned models that describe what a model is designed to do, where it has limits, and what risks teams should plan for. They also include evaluation highlights.

Browse model cards

Accessibility documentation

Accessibility conformance reports

Standardized reports that document how products meet accessibility standards, including WCAG criteria. They help customers evaluate accessibility and support procurement requirements.

Browse product reports

Model transparency and evaluation

CRM LLM benchmark

A consistent way to evaluate foundation models before they are integrated, using trust-focused signals like Safety, Privacy, Truthfulness, and Fairness.

Browse the benchmark dashboard

Up next

How we govern
AI responsibly

See our governance approach

Building trusted AI FAQs

Trusted AI is built through coordinated practices across design, development, testing, and governance. This includes structured review before launch, real-world testing, built-in guardrails, and ongoing monitoring once systems are live.

AI systems go through multiple layers of testing, including adversarial testing, evaluation against known datasets, and, where applicable, employee-led trust testing. These processes help identify risks, validate performance, and improve system behavior before deployment.

Agents operate within defined constraints, including system policies, subagent controls, and guardrails that limit what tools they can use. These controls help prevent misuse and keep agents aligned with their intended purpose.

Human oversight is built into workflows through review steps, approval gates, and escalation paths. Users can pause, edit, or override actions at any time. See how this works in practice in A day empowered by agents.

Data protections include zero data retention, permission-based data access, and controls that limit how AI workflows use that data. Agents only retrieve data that a user is authorized to access.

Responses are grounded in trusted enterprise data and supported with citations so users can verify outputs. Evaluation systems, benchmarking, and testing frameworks help measure accuracy and identify issues.

Observability tools, audit trails, and evaluation signals provide visibility into system behavior. These systems help teams identify issues, measure performance, and continuously improve AI over time.

Accessibility is integrated across the product lifecycle, from design through testing and release. This includes accessible design systems, automated testing, and validation with people with disabilities to support real-world usability, aligned with standards such as WCAG 2.2 AA.

Explore the resource library for reports, policies, and practical guidance on building and governing trusted AI.

Agentforce

Sales

Service

Marketing

Commerce

Analytics

Slack

Small Business

Data

Agentforce 360 Platform

Net Zero

Customer Success

Partner Apps & Experts

Pricing

Discover the #1 AI CRM

Discover the #1 AI CRM

Automotive

Communications

Engineering, Construction & Real Estate

Consumer Goods

Education

Energy & Utilities

Financial Services

Healthcare

Life Sciences

Manufacturing

Media

Nonprofit

Professional Services

Public Sector

Retail

Technology

Travel, Transportation & Hospitality

Explore Salesforce for industries.

Explore Salesforce for industries.

Customer Stories

Salesforce on Salesforce Stories

Trailblazer Stories

Explore success stories.

Explore success stories.

Dreamforce

TDX

Connections

Tableau Conference

Informatica World

Agentforce World Tours

Salesforce+

More Salesforce Events

Salesforce Events

Salesforce Events

Learning on Trailhead

Try Salesforce for Free

New to Salesforce

Blogs

Resources

Become a Trailblazer.

Become a Trailblazer.

Help & Documentation

Communities

Services & Plans

Account Management

Questions? We can help.

Questions? We can help.

About Salesforce

Our Values

Our Impact

Careers

Newsroom

Legal

More Salesforce Brands

Hear our story.

Hear our story.

Contact Us

Change Region

Americas

Europe, Middle East, and Africa

Asia Pacific

Change Region

Americas

Europe, Middle East, and Africa