Making AI Smarter for Finance: The FINDAP Framework

Zixuan Ke

Xuan Phi Nguyen

Shafiq Joty

Caiming Xiong

November 3, 2025

What is FINDAP?

The FINDAP Framework is a systematic approach to fine-tuning large language models (LLMs) specifically for the finance industry. General-purpose LLMs like ChatGPT or Bard are capable generalists, but specialized domains like finance require models that understand complex concepts, regulatory constraints, and industry-specific language. FINDAP addresses this with a structured process for training LLMs to meet the demands of the financial sector.

FINDAP is Built Around 4 Core Components:

FinCap: Defines the key capabilities needed for success in finance, such as reasoning, financial knowledge recall, and task-specific skills.
FinRec: A training recipe that improves the model’s ability to learn from finance-specific data while ensuring it can still follow instructions effectively. It also introduces a new method, preference data distillation, which builds preference data for reasoning from the feedback of a generative reward model.
FinTrain: A carefully curated set of training datasets that ensures the model learns from high-quality, relevant financial data.
FinEval: A comprehensive evaluation system that tests the model on real-world financial tasks to ensure it’s performing as expected.

Using FINDAP, we developed Llama-Fin, a financial language model that achieves state-of-the-art results across a wide range of financial tasks.

Why is FINDAP Innovative?

Principled Approach: While other efforts have focused on sequential training or domain adaptation in isolation, FINDAP is the first framework to approach this challenge in a systematic and fine-grained way. It examines every stage of the process, from defining the right skills to training and testing the model, which reveals the challenges specific to each stage and points to effective solutions.

Forgetting Prevention: A well-known and important challenge when adapting an LLM to a specific domain is “catastrophic forgetting.”

Indeed, we observe serious forgetting of general capabilities, especially instruction following, when performing Continual Pretraining (CPT) and Instruction Tuning (IT) sequentially, starting from an instruction-tuned LLM.

🔀 Our trick: mix CPT and IT data and train on them jointly, downsampling the CPT data to match the size of the IT data.

🔍 It turns out that this not only helps prevent forgetting but also boosts knowledge transfer. Because concepts learned from CPT are shared across tasks, they tend to be more generalizable, so combining CPT and IT training improves generalization without requiring exposure to a diverse range of tasks.
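The mixing step can be illustrated with a small sketch. This is a minimal illustration under stated assumptions, not the released FinDAP training code: the function name `build_joint_mixture`, the `source` tag, and the use of uniform random downsampling are all hypothetical choices made for this example.

```python
import random
from typing import Dict, List

Example = Dict[str, str]


def build_joint_mixture(
    cpt_data: List[Example],
    it_data: List[Example],
    seed: int = 0,
) -> List[Example]:
    """Mix CPT and IT examples into one shuffled training set.

    Assumption: CPT data is downsampled uniformly at random so that it
    contributes at most as many examples as the IT data.
    """
    rng = random.Random(seed)

    # Downsample CPT to match the IT size (or keep all of it if smaller).
    n_cpt = min(len(cpt_data), len(it_data))
    cpt_sample = rng.sample(cpt_data, n_cpt)

    # Tag each example with its origin so the mixture can be inspected later.
    mixture = [dict(ex, source="cpt") for ex in cpt_sample]
    mixture += [dict(ex, source="it") for ex in it_data]

    # Shuffle so every batch sees both kinds of data during joint training.
    rng.shuffle(mixture)
    return mixture
```

With 1,000 CPT documents and 50 instruction examples, the function returns 100 examples, half from each source. The shuffled list can then be fed to a single training run instead of two sequential stages.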

Improving Reasoning Capability: Outcome-based rewards are sparse, while step-wise rewards are expensive to obtain. To balance the two, we use a generative reward model (GenRM) with two signals, Final Answer Preference (FAP) and Stepwise Corrective Preference (SCP):

🎯 FAP: prompt GenRM to give a holistic judgment for the entire solution using a single ‘Yes’ or ‘No’ token.

🎯 SCP: prompt GenRM to identify the first erroneous step and ask it to provide a correction for that step.

Using this correction, we construct preference data for reasoning.
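As a rough illustration of how FAP and SCP could feed into preference data, here is a minimal sketch. Everything in it is an assumption made for illustration: the judge callables stand in for prompts to GenRM, and the pair layout (the question plus the steps before the error as the prompt, the corrected step as the chosen response, the erroneous step as the rejected one) is one plausible format rather than the exact one used for Llama-Fin.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class PreferencePair:
    prompt: str    # question plus the steps that precede the error
    chosen: str    # corrected step proposed by the reward model
    rejected: str  # the original erroneous step


# Hypothetical judge interfaces; in practice each would wrap a GenRM prompt.
FapJudge = Callable[[str, List[str]], str]  # returns "Yes" or "No"
ScpJudge = Callable[[str, List[str]], Optional[Tuple[int, str]]]
# ScpJudge returns (index of the first wrong step, corrected step) or None.


def build_preference_pair(
    question: str,
    steps: List[str],
    fap_judge: FapJudge,
    scp_judge: ScpJudge,
) -> Optional[PreferencePair]:
    """Return a preference pair for a flawed solution, else None."""
    # FAP: one holistic Yes/No verdict on the whole solution.
    if fap_judge(question, steps).strip().lower() == "yes":
        return None  # judged correct, nothing to fix

    # SCP: locate the first erroneous step and get a correction for it.
    found = scp_judge(question, steps)
    if found is None:
        return None
    idx, correction = found
    if not 0 <= idx < len(steps):
        return None  # ignore out-of-range indices from the judge

    prompt = "\n".join([question] + steps[:idx])
    return PreferencePair(prompt=prompt, chosen=correction, rejected=steps[idx])
```

The cheap FAP check filters out solutions that need no correction, so the more expensive step-level SCP query only runs on solutions already judged wrong.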

Why Does This Matter?

The finance industry handles high-stakes tasks like investment analysis, regulatory compliance, and market predictions. Current AI models often struggle to adapt effectively to the nuanced and complex nature of financial tasks due to the lack of domain-specific training strategies. The FINDAP framework changes this by teaching models the exact skills and knowledge needed for finance, making them more reliable and capable in these critical applications.

For example, FINDAP allows the Llama-Fin model to:

Summarize financial events with precision.
Answer finance-related questions accurately.
Perform advanced reasoning tasks like CFA-level challenges.
Generalize to completely novel financial tasks it has never seen before.

This adaptability ensures that companies can trust the AI to deliver consistent, high-quality performance across a wide range of use cases.

What Are the Results?

Llama-Fin, developed using FINDAP, has set a new benchmark for financial AI. Here’s what it achieved:

Top-Tier Performance: Llama-Fin outperformed all other models in its size category by 10-25% on tasks similar to its training data. It also beat larger models like GPT-4o and Palmyra-Fin-32K.
Generalization to New Tasks: Even on completely novel financial tasks, Llama-Fin remained competitive, performing better than its base model in 13 out of 17 benchmarks.
Reasoning Excellence: Llama-Fin excelled in reasoning-heavy tasks, achieving up to a 20% improvement in benchmarks like CFA-Challenge.
Preservation of General Knowledge: Despite its specialization, Llama-Fin retained its general knowledge and conversational abilities, ensuring it remains a versatile tool.

These results demonstrate that FINDAP’s systematic approach not only adapts LLMs for finance but also ensures they generalize effectively, making them robust and reliable tools.

Why Does This Matter for Businesses?

For Salesforce and other enterprises, the FINDAP framework offers a way to:

Deliver Better Insights: With finance-specific AI models like Llama-Fin, companies can provide more accurate insights for investment, compliance, and decision-making.
Enhance Customer Support: AI-powered customer service tools can handle finance-related queries with greater accuracy and depth.
Drive Innovation: By adopting FINDAP’s approach, businesses can create their own domain-specific models tailored to their unique industries.
Increase Trust in AI: Models that perform reliably in high-stakes domains build confidence among customers, regulators, and stakeholders.

Looking Ahead

The success of FINDAP and Llama-Fin is just the beginning. The framework’s systematic approach can be applied to other specialized domains, from healthcare to legal services. By refining and extending FINDAP, we aim to unlock the potential of AI to solve complex, industry-specific challenges and redefine how businesses leverage technology in their operations.

For companies, researchers, and AI enthusiasts, FINDAP represents a leap forward in creating smarter, more capable AI systems that can handle the complexities of the real world with precision and reliability.

Resources

Paper (EMNLP 2025 Oral, with ARR Best Paper nomination): https://arxiv.org/abs/2501.04961
Project: https://vincent950129.github.io/adapt-llm/
Model: https://huggingface.co/Salesforce/Llama-Fin-8b
Training Data: https://huggingface.co/datasets/Salesforce/FinTrain
Evaluation Data: https://huggingface.co/datasets/Salesforce/FinEval
Code: https://github.com/SalesforceAIResearch/FinDAP

Zixuan Ke Research Scientist

Xuan Phi Nguyen

Shafiq Joty Senior Director, Research

Shafiq (raihanjoty.github.io) directs the NLP group's work on large language modeling (LLM) and generative AI. Some of his group's recent projects include SFR-RAG, SFR-Judge, SFR-RAG-Agent and xGen. He is also a tenured Associate Professor (currently on leave) in the School of Computer Science and …

Caiming Xiong SVP Salesforce Research
