Time Series Forecasting Finally Achieves a Scaling Law: Datadog Open-Sources Toto 2, with the Largest Model at 2.5B Parameters
Datadog has released Toto 2, an open-source family of time series forecasting foundation models, available in five sizes ranging from 4 million to 2.5 billion parameters. Toto 2 is described as the first model family to empirically validate a scaling law in the time series domain: predictive performance improves consistently as model size increases, with no sign of saturation even at 2.5B parameters. This addresses a long-standing gap in time series research, where simply scaling up model size had previously failed to deliver the reliable gains seen in large language models.
The Toto 2 family includes five variants (4M, 22M, 313M, 1B, and 2.5B), all released under the Apache 2.0 license. Toto 2 achieves state-of-the-art results on three major benchmarks: BOOM, GIFT-Eval, and TIME. Beyond accuracy improvements, the model introduces a continuous patch masking mechanism that replaces traditional autoregressive generation with a single forward pass, significantly accelerating inference. As a result, the 313M variant achieves latency comparable to Chronos-2, a 120M-parameter model.
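To make the inference speedup concrete, the sketch below counts forward passes under the two decoding styles. It is an illustration only, not Toto 2's actual implementation: the function names, the horizon, and the patch size are all hypothetical, and it assumes an autoregressive decoder emits one patch per pass while a masked decoder fills in every future patch in one pass.

```python
import math


def autoregressive_passes(horizon: int, patch_size: int) -> int:
    """Forward passes needed when each pass predicts one patch
    and that patch is fed back in to predict the next one."""
    return math.ceil(horizon / patch_size)


def masked_decoding_plan(horizon: int, patch_size: int) -> tuple[int, int]:
    """Return (forward_passes, masked_patches) when all future patches
    are appended as masked placeholders and predicted together."""
    masked_patches = math.ceil(horizon / patch_size)
    return 1, masked_patches


# Hypothetical settings chosen only for illustration.
horizon, patch_size = 336, 32

ar = autoregressive_passes(horizon, patch_size)
passes, masked = masked_decoding_plan(horizon, patch_size)
print(f"autoregressive: {ar} passes")
print(f"masked: {passes} pass over {masked} placeholder patches")
```

Under these assumptions the number of autoregressive passes grows linearly with the forecast horizon, while the masked approach stays at one pass over a longer input, which is consistent with a larger model matching the latency of a smaller one.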
Cross-domain generalization is another key highlight. Although it was pretrained exclusively on system monitoring metrics and synthetic data, with no public general-purpose time series data, Toto 2 still tops leaderboards covering diverse real-world forecasting tasks. The model is also more parameter-efficient: the 22M variant outperforms the original Toto 1.0 across all core tests while using only one-seventh of the parameters.