Solar Forecast Accuracy in Europe: EPT-2 Leads All Horizons

Written by: Olivier Lam, Physical AI Team, Jua.ai AG

Key Takeaways for European Solar Desks

  • EPT-2 sets a new state of the art for surface solar radiation downwards (SSRD) accuracy across all forecast horizons from 0 to 240 hours in Europe.
  • Normalized mean absolute error (nMAE) improvements of four percentage points translate into approximately €3 million in annual savings for a 1 GW solar portfolio.
  • EPT-2 outperforms ECMWF HRES, ECMWF AIFS, and Solcast on SSRD, and no competing model closes the accuracy gap in the 2025–2026 validation cycle.
  • Jua for Energy delivers up to 24 daily refreshes, 15-minute actual-generation updates, and an AI agent (Athena) for live cross-model benchmarking.
  • Benchmark EPT-2 on your own portfolio and region to see how it compares with your current solar forecast provider.

How nMAE Works and What the 2026 Horizon Table Shows

Normalized mean absolute error (nMAE) expresses the mean absolute forecast error as a fraction of the observed mean, which allows like-for-like comparison across sites and capacities. For SSRD benchmarking, nMAE is computed against ground-truth observations, and lower values indicate higher accuracy. ECMWF publishes SSRD (parameter ID 169) across four daily cycles, providing the reference dataset against which all models below are evaluated.
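As a minimal sketch of the definition above, the Python function below divides the mean absolute error by the mean of the observations. The function name `nmae` and the sample irradiance values are illustrative assumptions; they are not taken from the validation study or from StationBench.

```python
from statistics import fmean


def nmae(forecast, observed):
    """Mean absolute error divided by the mean of the observations."""
    if len(forecast) != len(observed) or len(observed) == 0:
        raise ValueError("forecast and observed must be non-empty and equal in length")
    observed_mean = fmean(observed)
    if observed_mean == 0:
        raise ValueError("observed mean is zero, so nMAE is undefined")
    mean_abs_error = fmean([abs(f - o) for f, o in zip(forecast, observed)])
    return mean_abs_error / observed_mean


# Hypothetical SSRD values for one site, three time steps.
observed = [200.0, 400.0, 600.0]
forecast = [220.0, 380.0, 630.0]
print(f"nMAE = {nmae(forecast, observed):.4f}")  # prints "nMAE = 0.0583"
```

Because the error is divided by the observed mean, a small plant and a large plant with the same relative error produce the same nMAE, which is what makes the metric comparable across sites.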

The table below demonstrates EPT-2’s consistent accuracy advantage. It achieves state-of-the-art performance across every forecast horizon from 0 to 240 hours, while all competing models remain below that level at every lead time. All values are sourced from arXiv:2507.09703. Aurora is excluded from the SSRD comparison because it produces no SSRD output; its row is marked accordingly.

Model            | 0–6 h nMAE     | 6–24 h nMAE    | 24–72 h nMAE   | 72–240 h nMAE
-----------------|----------------|----------------|----------------|----------------
EPT-2            | New SOTA       | New SOTA       | New SOTA       | New SOTA
ECMWF HRES       | Below SOTA     | Below SOTA     | Below SOTA     | Below SOTA
ECMWF AIFS       | Below SOTA     | Below SOTA     | Below SOTA     | Below SOTA
Microsoft Aurora | No SSRD output | No SSRD output | No SSRD output | No SSRD output
Solcast          | Below SOTA     | Below SOTA     | Below SOTA     | Below SOTA

EPT-2 is benchmarked against more than 10,000 real ground stations using open-source StationBench, with no post-processing or station fine-tuning applied. The methodology is fully described in arXiv:2507.09703.
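A per-horizon comparison of this kind can be sketched as below. This is not the StationBench implementation. The record layout `(lead_hours, forecast, observed)`, the function names, and the half-open bucket boundaries are all assumptions made for illustration, and the sample values are invented.

```python
from collections import defaultdict

# Horizon buckets in hours, matching the table columns. Treating each bucket
# as half-open [start, end) is an assumption; the source does not state the rule.
BUCKETS = [(0, 6), (6, 24), (24, 72), (72, 240)]


def bucket_label(lead_hours):
    """Return the label of the bucket containing lead_hours, or None."""
    for start, end in BUCKETS:
        if start <= lead_hours < end:
            return f"{start}-{end} h"
    return None


def nmae_by_horizon(records):
    """Compute nMAE per horizon bucket from (lead_hours, forecast, observed) tuples."""
    abs_error_sum = defaultdict(float)
    observed_sum = defaultdict(float)
    for lead_hours, forecast, observed in records:
        label = bucket_label(lead_hours)
        if label is None:
            continue  # lead time falls outside every bucket
        abs_error_sum[label] += abs(forecast - observed)
        observed_sum[label] += observed
    # Sum of errors over sum of observations equals MAE over the observed mean,
    # because both sums cover the same records.
    return {
        label: abs_error_sum[label] / observed_sum[label]
        for label in abs_error_sum
        if observed_sum[label] > 0
    }


# Hypothetical station records: (lead time in hours, forecast, observation).
records = [
    (1, 110.0, 100.0),
    (3, 90.0, 100.0),
    (12, 260.0, 200.0),
    (48, 300.0, 400.0),
    (100, 150.0, 300.0),
]
for label, value in nmae_by_horizon(records).items():
    print(f"{label}: nMAE = {value:.2f}")
```

Running the same function over each model's forecasts for the same stations and time steps gives one row of the horizon table per model.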

GFS vs ECMWF HRES vs EPT-2 on Solar Accuracy

European energy desks often compare GFS and ECMWF HRES when they evaluate solar radiation accuracy. ECMWF HRES operates at 9 km resolution with four daily cycles and has held the deterministic NWP benchmark for decades. NOAA GFS runs at coarser resolution and is freely available, so it becomes the default fallback for teams without an ECMWF subscription.

ECMWF HRES has historically shown stronger performance than GFS on solar variables, yet both models are outperformed by EPT-2 across the 0 to 240 hour range, according to the validation study. EPT-2 produces SSRD forecasts at high resolution over Europe, while HRES runs at 9 km and GFS at coarser scales. Trading desks that rely on GFS as a primary solar signal carry a compounding accuracy deficit relative to both ECMWF HRES and EPT-2. Desks that run ECMWF HRES sit closer to the frontier but still remain above EPT-2’s error levels at every lead time.

Solar Forecast Accuracy Landscape in Europe, 2026

Beyond the GFS-versus-ECMWF comparison, the broader 2026 validation picture confirms EPT-2’s lead over all competing models. The 2025–2026 validation cycle documented in arXiv:2507.09703 shows EPT-2 maintaining leading performance on SSRD across all evaluated horizons. Competing models, including ECMWF HRES, ECMWF AIFS, and Solcast, show no material improvement relative to the prior validation period, and the gap between EPT-2 and the next-best model persists.

After the IFS Cycle 50r1 update of May 2026, ECMWF issues four daily forecast cycles, which provide a consistent reference for ongoing validation. EPT-2 is evaluated against this same reference dataset, so comparisons remain like-for-like.

The 2026 picture for solar forecast accuracy in Europe is clear. EPT-2 is the verified leader on SSRD, the gap to incumbents is real and measurable, and no competing model has closed it in the current validation cycle.

nMAE, Utility-Scale Solar, and Portfolio Economics in Europe

For utility-scale solar portfolios, nMAE improvements translate directly into imbalance cost reductions and stronger day-ahead positioning. A 1 GW solar portfolio gaining four percentage points of forecast accuracy saves approximately €3 million annually through lower balancing costs, better market revenues, and more efficient operations.

The table below illustrates how these savings scale linearly with portfolio size. Doubling capacity roughly doubles the annual benefit, which makes the economic case for EPT-2 adoption even stronger for larger operators.

Portfolio Size | Accuracy Gain (nMAE pp) | Annual Saving (€)
---------------|-------------------------|------------------
1 GW solar     | 4 pp                    | ~€3,000,000
2 GW solar     | 4 pp                    | ~€6,000,000
5 GW solar     | 4 pp                    | ~€15,000,000

Customers operating multi-GW solar portfolios, including regulated utilities and physical trading houses, can apply the same per-GW economics to their full installed base. The four-percentage-point advantage reflects the documented difference between running EPT-2 and running the next-best available model on SSRD in the published validation.
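The per-GW arithmetic behind the savings table can be reproduced with a short sketch. It takes the approximate €3 million per GW figure quoted above at a four-percentage-point gain and assumes strict linearity in capacity, as the text states. The constant and function names are illustrative.

```python
# Approximate annual saving for 1 GW at a 4 pp nMAE gain, as quoted in the text.
SAVING_PER_GW_EUR = 3_000_000


def annual_saving_eur(capacity_gw):
    """Scale the quoted per-GW saving linearly with portfolio capacity."""
    if capacity_gw < 0:
        raise ValueError("capacity_gw must be non-negative")
    return SAVING_PER_GW_EUR * capacity_gw


for capacity_gw in (1, 2, 5):
    print(f"{capacity_gw} GW solar: ~€{annual_saving_eur(capacity_gw):,.0f}")
# prints:
# 1 GW solar: ~€3,000,000
# 2 GW solar: ~€6,000,000
# 5 GW solar: ~€15,000,000
```

The sketch holds the accuracy gain fixed at four percentage points, because the text gives no scaling rule for other gains.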

These economics apply specifically to utility-scale installations. Distributed solar forecasting carries higher baseline nMAE than utility-scale forecasting. Distributed installations require probabilistic upscaling from sample measurements, satellite-derived irradiance, and smart meter data analytics, because forecasting every rooftop individually is impractical. Utility-scale sites, with known plant locations, capacity data, and direct metering, form the primary domain where EPT-2’s SSRD advantage is most directly applicable and most directly monetizable.

See EPT-2’s SSRD performance on your region and compare it against your current provider’s accuracy.

Why Jua for Energy Delivers a Different Solar Forecasting Stack

The accuracy and economic advantages above are delivered through a platform architecture that no competing provider matches. Jua is a foundation model and agent company, which means it builds horizontal AI infrastructure that can support multiple domains. Jua for Energy is the first vertical application of that infrastructure, built on EPT, the Earth Physics Transformer general physics foundation model, and Athena, an AI agent. This setup mirrors the relationship between Anthropic’s Claude platform and its vertical products such as Claude Code, where a horizontal foundation supports flagship vertical applications. Because Jua for Energy inherits capabilities from this foundation-model architecture, it delivers differentiators that competing platforms, which are built as standalone products, do not match.

24× daily EPT-2 RR refresh. EPT-2 RR, the rapid refresh configuration, updates up to 24 times per day. ECMWF HRES runs four cycles per day, and most AI weather peers also run four cycles per day. Between traditional runs, traders on competing platforms see stale numbers, while EPT-2 RR keeps the signal current. EPT-2e updates four times per day.

15-minute actual-generation refresh. Power forecasts for solar, wind onshore, wind offshore, total wind, total renewables, load, and residual load refresh every 15 minutes on the actual-generation model. Coverage includes Germany, Great Britain, France, the Netherlands, and Belgium.

Athena natural-language benchmarking. Athena is Jua’s AI agent, instrumented with the Jua for Energy tool surface. A trader types a request in natural language, such as “benchmark EPT-2 against ECMWF HRES on SSRD for Iberia over the last 90 days,” and Athena returns the result in approximately 90 seconds. Backtests resolve in about 5 minutes. No competing platform offers an equivalent agent layer for live benchmarking.

Unified API and SDK across 25+ models. The Jua platform exposes more than 25 models, including 10 proprietary AI models from the EPT family and 15 third-party NWP and AI models such as ECMWF HRES, ECMWF AIFS, NOAA GFS, Microsoft Aurora, and DWD ICON. All models are available through a single REST API with Apache Arrow support and a Python SDK installable via pip install jua. Every model runs under a unified schema, so switching or comparing models requires no pipeline re-engineering.

High resolution over Europe. EPT-2 produces SSRD and wind forecasts at high resolution over Europe, while ECMWF HRES runs at 9 km. The Jua for Energy product surface supports up to 1 km resolution for customers that require site-level granularity.

EPT-2 delivers hourly global weather updates and outperforms leading AI weather models and traditional numerical baselines on RMSE across all forecast horizons. Customers who run the live benchmark have compressed Jua for Energy sales cycles to as little as two weeks.

Frequently Asked Questions

Which is more accurate, GFS or the European model?

On surface solar radiation (SSRD) across European domains, ECMWF HRES has shown stronger performance than NOAA GFS. HRES operates at 9 km resolution with four daily cycles and has held the deterministic NWP benchmark for decades, while GFS runs at coarser resolution and is freely available, so it often serves as a fallback for teams without an ECMWF subscription. As detailed in the GFS-versus-ECMWF section above, HRES outperforms GFS on solar variables but both trail EPT-2 across all lead times. For trading desks, the choice between the two incumbents matters less than the larger accuracy gap between both NWP models and EPT-2.

What is the state of solar forecast accuracy in Europe in 2026?

The 2025–2026 validation cycle shows EPT-2 as the verified leader on SSRD across forecast horizons in Europe. Competing models, including ECMWF HRES, ECMWF AIFS, and Solcast, show no material improvement relative to the prior validation period, and the gap between EPT-2 and the next-best model remains. Microsoft Aurora produces no SSRD output and cannot be compared on this variable. The broader market context is that AI-powered forecasting systems achieve accuracy improvements of 20–30 percent compared to traditional NWP methods, particularly for short-term horizons of 0–6 hours and ramp event prediction. EPT-2 captures that improvement and extends it across the full 0–240 hour range. For European energy desks, 2026 is the first year in which a single platform, Jua for Energy, delivers the leading SSRD forecast alongside 24 daily refreshes and a live cross-model benchmarking surface, all through a unified API.

What does nMAE mean for utility-scale solar forecasting in Europe, and what are the economic stakes?

Normalized mean absolute error (nMAE) expresses forecast error as a fraction of observed mean generation, which enables comparison across sites and capacities regardless of installed scale. For utility-scale solar in Europe, nMAE serves as the primary accuracy metric used by balancing-responsible parties, trading desks, and meteorology teams to evaluate forecast providers. Improvements in nMAE reduce imbalance costs, improve day-ahead market positioning, and lower the cost of hedging generation uncertainty.

The €3 million annual savings figure referenced earlier reflects typical European hedging and imbalance penalty structures applied to a 1 GW solar portfolio. Because these economics scale linearly with installed capacity, a 5 GW solar portfolio gaining the same four-percentage-point improvement reaches about €15 million in annual benefit. EPT-2’s four-percentage-point edge on SSRD therefore turns the choice of forecast provider into a direct P&L decision for any utility or trading house with material solar exposure in Europe.

Conclusion: Turning SSRD Accuracy into P&L Impact

The 2025–2026 validation data show that EPT-2 sets a new state of the art on surface solar radiation across the 0 to 240 hour horizon in Europe, outperforming ECMWF HRES and other leading models where comparisons are available. The accuracy gap holds from the 0–6 hour nowcast range through the 72–240 hour extended horizon, and no competing model has closed it in the current validation cycle.

For a 1 GW solar portfolio, that gap is worth approximately €3 million per year, and for multi-GW operators the economics scale linearly. Because the live benchmark at athena.jua.ai runs the comparison on any region and variable in under five minutes, procurement teams can validate EPT-2’s advantage on their own data before committing, which has compressed Jua for Energy sales cycles to as little as two weeks.

Jua is a foundation model and agent company, and Jua for Energy is its first applied product. EPT functions as a general physics foundation model, and Athena operates as an AI agent. The atmosphere is the first physical system EPT has been fine-tuned for, and energy trading is the first market where Athena has been instrumented. The validation numbers and portfolio economics together show how that stack translates into measurable trading impact.

Run a head-to-head comparison of EPT-2 against your current solar forecast provider on your own region before the next trade window opens.
