What is FINDAP?
The FINDAP Framework is a cutting-edge approach to fine-tune large language models (LLMs) specifically for the finance industry. While LLMs like ChatGPT or Bard are excellent general-purpose tools, specialized domains like finance require models that can understand complex concepts, regulatory constraints, and industry-specific language. FINDAP addresses this by providing a structured, systematic process for training LLMs to meet the unique demands of the financial sector.

FINDAP Is Built Around Four Core Components:
- FinCap: Defines the key capabilities needed for success in finance, such as reasoning, financial knowledge recall, and task-specific skills.
- FinRec: A training recipe that improves the model’s ability to learn from finance-specific data while ensuring it can still follow instructions effectively. It also introduces a new method, preference data distillation, which uses a generative reward model to build preference data for reasoning.
- FinTrain: A carefully curated set of training datasets that ensures the model learns from high-quality, relevant financial data.
- FinEval: A comprehensive evaluation system that tests the model on real-world financial tasks to ensure it’s performing as expected.
Under FINDAP, we developed Llama-Fin, a state-of-the-art financial language model that has achieved groundbreaking results across a wide range of financial tasks.
Why is FINDAP Innovative?
Principled Approach: While other efforts have focused on sequential training or domain adaptation in isolation, FINDAP is the first framework to approach this challenge in a systematic and fine-grained way. It looks at every stage of the process, from defining the right skills to training and testing the model, revealing unique challenges and uncovering effective solutions. This step-by-step methodology sets FINDAP apart, enabling LLMs to reach new levels of specialization and performance.
Forgetting Prevention: A well-known and important challenge when adapting an LLM to a specific domain is “catastrophic forgetting”.
Indeed, we observe serious forgetting of general capabilities, especially instruction following, when performing Continual Pretraining (CPT) and Instruction Tuning (IT) sequentially on top of an instruction-tuned LLM.
🔀 Our trick: mix CPT and IT data and train on them jointly, downsampling the CPT data to match the size of the IT data.
🔍 It turns out that this not only helps prevent forgetting but also boosts knowledge transfer. In addition, because concepts learned from CPT are shared across tasks and therefore tend to be more generalizable, combining CPT and IT training improves generalization without requiring exposure to a diverse range of tasks.
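The mixing step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the FINDAP training code: the function name `mix_cpt_and_it`, the `source` and `text` fields, and the use of uniform random sampling for the downsampling are all hypothetical choices made for this example.

```python
import random


def mix_cpt_and_it(cpt_examples, it_examples, seed=0):
    """Downsample CPT data to the size of the IT data, then mix and shuffle.

    Hypothetical helper: the sampling strategy (uniform, without replacement)
    is an assumption of this sketch, not something the source specifies.
    """
    rng = random.Random(seed)

    # Keep at most as many CPT examples as there are IT examples.
    n_cpt = min(len(cpt_examples), len(it_examples))
    cpt_sample = rng.sample(cpt_examples, n_cpt)

    # Tag each example with its origin so it can be traced after shuffling.
    mixed = [{"source": "cpt", "text": t} for t in cpt_sample]
    mixed += [{"source": "it", "text": t} for t in it_examples]

    # Shuffle so both data types are interleaved in one joint training stream.
    rng.shuffle(mixed)
    return mixed
```

Calling `mix_cpt_and_it(cpt_docs, it_examples)` with 100 CPT documents and 10 IT examples yields 20 items, half from each source, ready to be fed to a single joint training run.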
Improving Reasoning Capability: Outcome-based rewards are sparse, while step-wise rewards are expensive to obtain. To balance the two, we use a generative reward model (GenRM) with Final Answer Preference (FAP) and Stepwise Corrective Preference (SCP) as follows:
🎯 FAP: prompt the GenRM to give a holistic judgment of the entire solution using a single ‘Yes’ or ‘No’ token.
🎯 SCP: prompt the GenRM to identify the first erroneous step and to provide a correction for that step.
Using this correction, we construct preference data for reasoning.
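To make the two signals concrete, here is a minimal sketch of how FAP and SCP outputs could be turned into preference pairs. Everything in it is an assumption for illustration: `build_fap_pair`, `build_scp_pair`, and `StepCritique` are hypothetical names, the `judge` and `critic` callables stand in for GenRM prompts, and the pairing rule (corrected step as chosen, erroneous step as rejected) is one plausible reading of the description above rather than the exact construction used in the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StepCritique:
    """Hypothetical container for a GenRM step-level critique."""
    first_error_index: Optional[int]  # None when no erroneous step is found
    corrected_step: Optional[str]


def build_fap_pair(prompt: str, solutions: List[str],
                   judge: Callable[[str, str], str]) -> Optional[dict]:
    """FAP sketch: judge(prompt, solution) returns 'Yes' or 'No'."""
    accepted = [s for s in solutions if judge(prompt, s) == "Yes"]
    rejected = [s for s in solutions if judge(prompt, s) == "No"]
    if not accepted or not rejected:
        return None  # a pair needs one solution of each kind
    return {"prompt": prompt, "chosen": accepted[0], "rejected": rejected[0]}


def build_scp_pair(prompt: str, steps: List[str],
                   critic: Callable[[str, List[str]], StepCritique]) -> Optional[dict]:
    """SCP sketch: prefer the corrected step over the first erroneous step."""
    critique = critic(prompt, steps)
    if critique.first_error_index is None:
        return None  # nothing to correct
    i = critique.first_error_index
    return {
        "prompt": prompt,
        "prefix": steps[:i],  # steps before the first error, kept as context
        "chosen": critique.corrected_step,
        "rejected": steps[i],
    }
```

In practice `judge` and `critic` would wrap calls to the GenRM; in this sketch they can be any callables with the same shape, which keeps the pairing logic easy to test in isolation.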
Why Does This Matter?
The finance industry handles high-stakes tasks like investment analysis, regulatory compliance, and market predictions. Current AI models often struggle to adapt effectively to the nuanced and complex nature of financial tasks due to the lack of domain-specific training strategies. The FINDAP framework changes this by teaching models the exact skills and knowledge needed for finance, making them more reliable and capable in these critical applications.
For example, FINDAP allows the Llama-Fin model to:
- Summarize financial events with precision.
- Answer finance-related questions accurately.
- Perform advanced reasoning tasks like CFA-level challenges.
- Generalize to completely novel financial tasks it has never seen before.
This adaptability ensures that companies can trust the AI to deliver consistent, high-quality performance across a wide range of use cases.
What Are the Results?
Llama-Fin, developed using FINDAP, has set a new benchmark for financial AI. Here’s what it achieved:
- Top-Tier Performance: Llama-Fin outperformed all other models in its size category by 10-25% on tasks similar to its training data. It also beat larger models like GPT-4o and Palmyra-Fin-32K.
- Generalization to New Tasks: Even on completely novel financial tasks, Llama-Fin remained competitive, performing better than its base model in 13 out of 17 benchmarks.
- Reasoning Excellence: Llama-Fin excelled in reasoning-heavy tasks, achieving up to a 20% improvement in benchmarks like CFA-Challenge.
- Preservation of General Knowledge: Despite its specialization, Llama-Fin retained its general knowledge and conversational abilities, ensuring it remains a versatile tool.
These results demonstrate that FINDAP’s systematic approach not only adapts LLMs for finance but also ensures they generalize effectively, making them robust and reliable tools.
Why Does This Matter for Businesses?
For companies like Salesforce and other enterprises, the FINDAP framework offers a way to:
- Deliver Better Insights: With finance-specific AI models like Llama-Fin, companies can provide more accurate insights for investment, compliance, and decision-making.
- Enhance Customer Support: AI-powered customer service tools can handle finance-related queries with greater accuracy and depth.
- Drive Innovation: By adopting FINDAP’s approach, businesses can create their own domain-specific models tailored to their unique industries.
- Increase Trust in AI: Models that perform reliably in high-stakes domains build confidence among customers, regulators, and stakeholders.
Looking Ahead
The success of FINDAP and Llama-Fin is just the beginning. The framework’s systematic approach can be applied to other specialized domains, from healthcare to legal services. By refining and extending FINDAP, we aim to unlock the potential of AI to solve complex, industry-specific challenges and redefine how businesses leverage technology in their operations.
For companies, researchers, and AI enthusiasts, FINDAP represents a leap forward in creating smarter, more capable AI systems that can handle the complexities of the real world with precision and reliability.
Resources
- Paper (EMNLP 2025 Oral, with ARR Best Paper nomination): https://arxiv.org/abs/2501.04961
- Project: https://vincent950129.github.io/adapt-llm/
- Model: https://huggingface.co/Salesforce/Llama-Fin-8b
- Training Data: https://huggingface.co/datasets/Salesforce/FinTrain
- Evaluation Data: https://huggingface.co/datasets/Salesforce/FinEval
- Code: https://github.com/SalesforceAIResearch/FinDAP
- Salesforce AI Research Website
Follow us on X: @SFResearch














