Why Forecasting Matters—and How to Unlock More from Foundation Models
Forecasting is critical to how many large organizations, including Salesforce, manage their global cloud infrastructure. Reliable projections of compute, storage, usage, and cost help keep services available while controlling infrastructure spend. At Salesforce, these patterns shift constantly as new regions launch, customers upgrade to Hyperforce, and usage evolves—making accurate forecasting both essential and difficult.
The rise of generative AI showed the power of large language models trained on vast volumes of text. Time-series foundation models use similar architectures but learn patterns that occur over time instead of language—trends, seasonality, bursts, long-range dependencies, and multivariate structure. The Moirai family of models, built by Salesforce AI Research, brought this concept to forecasting by offering “universal” capabilities across diverse time-series domains.
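To ground the idea, here is a minimal zero-shot forecasting sketch using a publicly released Moirai checkpoint through Salesforce AI Research's open-source uni2ts library. The series below is synthetic stand-in data, and the call signatures follow the library's documented usage at the time of writing, so treat this as an illustrative sketch rather than production code.

```python
# Zero-shot forecast with a public Moirai checkpoint via uni2ts (sketch).
# The time series here is synthetic stand-in data, not real telemetry.
import numpy as np
import pandas as pd
from gluonts.dataset.pandas import PandasDataset
from gluonts.dataset.split import split
from uni2ts.model.moirai import MoiraiForecast, MoiraiModule

# Synthetic daily metric with weekly seasonality.
idx = pd.date_range("2024-01-01", periods=400, freq="D")
y = 100 + 10 * np.sin(2 * np.pi * np.arange(400) / 7) + np.random.randn(400)
ds = PandasDataset(pd.DataFrame({"target": y}, index=idx))

# Hold out the last 30 days for evaluation.
train, test_template = split(ds, offset=-30)
test_data = test_template.generate_instances(prediction_length=30)

model = MoiraiForecast(
    module=MoiraiModule.from_pretrained("Salesforce/moirai-1.0-R-base"),
    prediction_length=30,
    context_length=200,
    patch_size="auto",
    num_samples=100,  # draws from the predictive distribution
    target_dim=1,
    feat_dynamic_real_dim=0,
    past_feat_dynamic_real_dim=0,
)
predictor = model.create_predictor(batch_size=32)
forecasts = list(predictor.predict(test_data.input))
print(forecasts[0].mean)  # point forecast for the held-out window
```

Because the model emits samples rather than a single trajectory, the same call yields both point forecasts and the quantile bands used for uncertainty-aware planning.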
But just as LLMs often require domain-specific fine-tuning and contextual grounding, we found the same applies to time-series models. Moirai’s broad pre-training provides a strong foundation, but operational-grade accuracy requires adapting the model to the distinctive patterns of Salesforce’s CloudOps environment. Foundation models are excellent generalists—but enterprise forecasting requires specialization.
Why Fine-Tuning Matters: Business Data Makes the Difference
To achieve that specialization, the Infra Data Science team partnered with AI Research to fine-tune Moirai on Salesforce’s internal telemetry. We combined Moirai’s general capabilities with our own CloudOps signals—so the model could learn how Salesforce infrastructure behaves in practice.
This scale is precisely what makes fine-tuning both possible and meaningful. Salesforce operates across hundreds of thousands of customers, millions of daily users, thousands of infrastructure services, and dozens of global regions, each emitting metrics ranging from DAU and request volumes to CPU, storage, I/O, and cost. Multiply these dimensions together and the result is millions of time series with rich, diverse temporal patterns—exactly the kind of environment in which foundation models can learn enterprise-specific behavior at scale.

Salesforce’s systems have specific rhythms that generic time series datasets don’t capture: release cycles, holiday patterns, cross-cloud nuances, regional growth curves, workload migration waves, and occasional anomalies. By fine-tuning Moirai on these patterns, we observed more accurate forecasts, better-calibrated uncertainty estimates, and more stable performance across shifting workloads. And given the scale of our cloud footprint, even a 0.5 percent improvement in accuracy can translate into millions of dollars in operational impact. The fine-tuned model remained general enough to support many services and regions, but with the specificity needed for our operational context.
Implementation Overview
We first curated a comprehensive internal CloudOps training dataset covering compute and storage usage, customer behavior, infrastructure spending trends, and regional patterns. The dataset spanned 80+ metrics across 2M+ entities, totaling 1.3B+ time steps. This became the foundation for fine-tuning Moirai 1.0.
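To make the shape of that dataset concrete, the sketch below shows one common way to slice raw telemetry into fixed-length context/horizon training windows. The column names, window sizes, and stride are illustrative assumptions, not our production pipeline.

```python
import numpy as np
import pandas as pd

def make_training_windows(series: pd.Series, context: int = 512,
                          horizon: int = 64, stride: int = 32):
    """Slide a fixed-length window over one metric's history, yielding
    (context, horizon) pairs a forecasting model can train on."""
    values = series.to_numpy(dtype=np.float32)
    windows = []
    for start in range(0, len(values) - context - horizon + 1, stride):
        ctx = values[start : start + context]
        tgt = values[start + context : start + context + horizon]
        windows.append((ctx, tgt))
    return windows

# Illustrative telemetry frame: one row per (entity, timestamp, metric).
df = pd.DataFrame({
    "entity": ["svc-a"] * 1000,
    "ts": pd.date_range("2024-01-01", periods=1000, freq="h"),
    "cpu_util": np.random.rand(1000),
})
examples = []
for entity, grp in df.groupby("entity"):
    series = grp.set_index("ts")["cpu_util"].sort_index()
    examples.extend(make_training_windows(series))
print(f"{len(examples)} training windows")
```

Repeated across millions of entity/metric pairs, this kind of slicing is what turns raw telemetry into the billion-plus training time steps described above.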
Training included holdout segments representing upcoming migrations and new services to evaluate real-world generalization. We benchmarked the fine-tuned model against out-of-the-box Moirai 1.0, classical statistical models, machine-learning baselines, and our existing forecasting implementations, comparing performance across short-, medium-, and long-term forecasting horizons.

After validating the improvements, we integrated the fine-tuned Moirai into our configuration-driven forecasting platform. Because onboarding is YAML-based, adoption required no new pipelines—new services could immediately use the model, with drift monitoring handled by the platform's existing infrastructure.
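For illustration, a hypothetical onboarding entry might look like the snippet below; every field name here is invented for this post and does not reflect the platform's actual schema.

```yaml
# Hypothetical onboarding config (illustrative field names only).
forecast:
  name: storage-growth-emea
  metric: storage_used_gb
  entity_grain: [service, region]
  model: moirai-finetuned       # swap models without new pipelines
  horizons: [14d, 90d, 365d]    # short-, medium-, and long-term
  quantiles: [0.1, 0.5, 0.9]
  monitoring:
    drift_checks: enabled
```

The point of this design is that switching a service to the fine-tuned model is a one-line config change, not a new deployment.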
Outcomes & Impact
While this work is still early, our initial evaluations show encouraging results. In Table 1, the fine-tuned Moirai models achieved better performance than their public-release counterparts in both point and quantile forecasting scenarios. This improvement translated into more reliable forecasts across many of our test workloads, especially those with rapidly shifting or noisy behavior. Moreover, the fine-tuned models also delivered better probabilistic forecasts (lower MSIS and CRPS), an important step toward more reliable planning and cost analysis.
| Model | Fine-tuned | MASE | MAPE | MSIS | CRPS |
| --- | --- | --- | --- | --- | --- |
| Moirai 1.1 Large | No | 0.853 | 0.864 | 0.541 | 0.336 |
| Moirai 1.1 Large | Yes | **0.731** | **0.725** | **0.419** | **0.291** |
| Moirai 1.1 Base | No | 0.878 | 0.894 | 0.483 | 0.352 |
| Moirai 1.1 Base | Yes | **0.770** | **0.769** | **0.462** | **0.302** |
Table 1: Performance comparison of Moirai models with and without fine-tuning on our internal benchmark. Lower is better; improved results are highlighted in bold.
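For context on Table 1's columns, the sketch below gives textbook formulations of MASE, MAPE, MSIS, and a sample-based CRPS estimate. Our internal benchmark harness is more involved; this is purely a reference implementation of the standard definitions.

```python
import numpy as np

def mase(y, yhat, y_train, m=1):
    """Mean Absolute Scaled Error: forecast MAE scaled by the
    in-sample seasonal-naive MAE (m = seasonal period)."""
    scale = np.mean(np.abs(y_train[m:] - y_train[:-m]))
    return np.mean(np.abs(y - yhat)) / scale

def mape(y, yhat):
    """Mean Absolute Percentage Error (assumes y is nonzero)."""
    return np.mean(np.abs(y - yhat) / np.abs(y))

def msis(y, lower, upper, y_train, m=1, alpha=0.05):
    """Mean Scaled Interval Score for a (1 - alpha) prediction interval."""
    scale = np.mean(np.abs(y_train[m:] - y_train[:-m]))
    score = (
        (upper - lower)
        + (2 / alpha) * (lower - y) * (y < lower)
        + (2 / alpha) * (y - upper) * (y > upper)
    )
    return np.mean(score) / scale

def crps_samples(y, samples):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|, where
    X, X' are independent draws from the forecast distribution."""
    samples = np.asarray(samples)  # shape: (num_samples, horizon)
    term1 = np.mean(np.abs(samples - y), axis=0)
    term2 = 0.5 * np.mean(
        np.abs(samples[:, None, :] - samples[None, :, :]), axis=(0, 1)
    )
    return float(np.mean(term1 - term2))

# Toy usage: score 100 forecast samples over a 30-step horizon.
rng = np.random.default_rng(0)
y_train = rng.normal(100, 5, size=200)
y_true = rng.normal(100, 5, size=30)
draws = rng.normal(100, 5, size=(100, 30))
print(mase(y_true, draws.mean(axis=0), y_train, m=7))
print(crps_samples(y_true, draws))
```

MASE and MAPE grade the point forecast, while MSIS and CRPS grade the full predictive distribution, which is why we report both families in Table 1.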
Because Moirai is integrated into our configuration-driven forecasting platform, the fine-tuned model can serve new internal forecasting use cases while reducing long-term model maintenance. As we continue testing and expand adoption, we expect to learn more about how fine-tuned time-series foundation models translate into day-to-day operational value.
Why This Matters — and What Comes Next
Our work shows that foundation models are powerful, but their full value emerges when adapted to business-specific time-series patterns. Forecasting depends on temporal structure, seasonal effects, and operational context that generic models can’t fully capture. Fine-tuning embeds a company’s real rhythms into the model, turning a universal architecture into a scalable forecasting backbone.
As Salesforce’s infrastructure evolves, our modeling approach will evolve with it. We’re exploring multi-modal forecasting with logs and deployment metadata, and building explainability tools to clarify why forecasts shift.
From Moirai 1.0 to a fine-tuned, Salesforce-specific engine, the pattern is clear: generic models help you start, but enterprise-grade accuracy comes from training on your own data. As foundation models become a core part of forecasting workflows, organizations across every domain will find that competitive advantage comes not from the architecture alone, but from how effectively those models learn the unique rhythms of their own business. We look forward to continuing this work and sharing what we learn as the field advances.