Written by: Olivier Lam, Physical AI Team, Jua.ai AG
Key Takeaways
- EPT-2 outperforms ECMWF HRES on 10 m wind, 100 m wind, 2 m temperature, and surface solar radiation, and Microsoft Aurora on the three of those variables that Aurora outputs (it produces no surface solar radiation), across all 0–240 hour lead times on open-source StationBench.
- Legacy NWP systems are limited to 2–4 daily updates and cost €1,000–€20,000 per run; EPT-2 delivers up to 24 updates per day at a fraction of the cost on a single GPU.
- Jua for Energy replaces brittle in-house pipelines and manual benchmarking with a unified 25-model platform, 15-minute power-forecast refresh, and the Athena AI agent for natural-language briefings and backtests.
- Customers such as Axpo, TotalEnergies, and Statkraft already use Jua for Energy to compress morning prep into a single workspace and capture multi-million-euro accuracy gains on GW-scale portfolios.
- Book a demo with Jua to run a live head-to-head benchmark on your region and variables in under five minutes.
The Cost of Legacy NWP in 2026 Energy Markets
Numerical weather prediction (NWP) has governed energy forecasting for forty years. The physics work. The economics do not scale. A single NWP simulation consumes approximately 8,400 kWh and costs €1,000–€20,000 to run on high-performance computing infrastructure. That compute ceiling limits ECMWF HRES and equivalent systems to two to four global updates per 24 hours. Between runs, traders operate on stale numbers.
The workflow built on top of those runs compounds the problem. Energy desks start the day at 6 a.m. downloading raw GRIB files and pushing them through brittle in-house pipelines. Teams cross-reference internal meteorology groups or paid consultancies, then stitch together spreadsheets and terminal screens. By the time a coherent view of the day exists, the market has already moved.
Research-grade AI weather models such as Microsoft Aurora, Google DeepMind GraphCast, and ECMWF AIFS remove the compute ceiling but introduce a different gap. They arrive as raw model outputs without productised ensembles, operational refresh schedules, hindcast archives, or workflow automation. Quant teams that subscribe to these outputs must build the ingestion pipeline, the ensemble logic, the benchmarking harness, and the hindcast access themselves. Enterprise renewable-energy operations require quarter-hourly or finer update intervals, a cadence no research-grade AI model delivers as a productised service. Point-solution SaaS vendors resell processed NWP without ensembles, benchmarking, or agent tooling. Meteorology consultancies deliver analyst reports the morning after the trade window has closed.
The result is a fragmented, lagging stack assembled from a dozen contracts, in an industry where the unit of profit is the gigawatt-hour and the unit of risk is the forecast miss.
The Solution: Jua for Energy as Unified Forecasting Stack
Jua for Energy addresses this fragmentation by unifying the entire workflow, from raw forecasts to trading decisions, in a single platform. Jua is a foundation model and agent company. The relationship to Jua for Energy mirrors Anthropic and Claude Code: a horizontal AI platform with a flagship vertical product. The Earth Physics Transformer (EPT) family is a general spatiotemporal transformer foundation model that learns the governing physics of complex systems such as conservation of mass, momentum, and energy directly from observational data. It encodes these laws in a latent representation integrated forward in time. Athena is Jua’s AI agent. It plans, calls tools, and turns natural-language objectives into briefings, benchmarks, backtests, and custom widgets. The architecture is domain-agnostic. The atmosphere is the first physical system EPT has been fine-tuned for. Energy trading is the first market Athena has been instrumented for.
Jua for Energy does not replace ECMWF. It displaces the plumbing around it. Serious customers keep their ECMWF subscription. Jua for Energy replaces the in-house GRIB pipeline, the manual benchmarking, the morning-briefing analyst, and the dashboard stitching. ECMWF AIFS runs on the Jua platform alongside EPT-2 under a unified schema and a single API. The 7–9 a.m. manual prep routine becomes a single workspace that is open before the market.
Customers include Axpo, TotalEnergies, Statkraft, EnBW, EDF, and Hydro-Québec. Regulated utilities, physical trading houses, and quant funds across five continents execute daily trading decisions on the platform.
Head-to-Head Accuracy: EPT-2 vs ECMWF HRES and Aurora
EPT-2 outperforms ECMWF HRES on every lead time across the full 0–240 hour range on all four variables that drive an energy P&L: 10 m wind speed, 100 m wind speed, 2 m temperature, and surface solar radiation (SSRD). The evaluation methodology is open-source StationBench, measured against more than 10,000 real ground stations with no post-processing or station fine-tuning. This is the same methodology documented in arXiv:2410.15076 for EPT-1.5.
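To make the scoring idea concrete, here is a minimal sketch of a station-based score: the RMSE of raw forecasts against ground-station observations, grouped by lead time. It is not the StationBench implementation. The record layout and the function name `rmse_by_lead_time` are assumptions chosen for illustration, and the values are toy numbers, not benchmark data.

```python
import math
from collections import defaultdict


def rmse_by_lead_time(records):
    """Return {lead_time_hours: RMSE} for (lead_time_hours, forecast, observation) records."""
    squared_errors = defaultdict(list)
    for lead_h, forecast, observed in records:
        squared_errors[lead_h].append((forecast - observed) ** 2)
    # Sort by lead time so the result reads like a skill curve.
    return {
        lead_h: math.sqrt(sum(errs) / len(errs))
        for lead_h, errs in sorted(squared_errors.items())
    }


# Toy 10 m wind speeds in m/s (illustrative values only).
toy_records = [
    (24, 5.0, 4.0),   # forecast 1.0 m/s too high
    (24, 7.0, 8.0),   # forecast 1.0 m/s too low
    (48, 6.0, 3.0),   # forecast 3.0 m/s too high
    (48, 9.0, 13.0),  # forecast 4.0 m/s too low
]
scores = rmse_by_lead_time(toy_records)
```

Running the same function over two models' forecasts for the same stations and lead times gives the kind of head-to-head curve the comparison describes. A lower RMSE at a given lead time means a smaller typical error.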
Against Microsoft Aurora, EPT-2 beats Aurora on 10 m wind, 100 m wind, and 2 m temperature across the full 0–240 hour range. On surface solar radiation, EPT-2 wins by default because Aurora produces no SSRD output. EPT-2 inference runs approximately 25% faster than Aurora. EPT-2 forecasts at arbitrary lead times natively, while Aurora rolls forward in fixed 6-hour increments, which compounds error at longer horizons.
On probabilistic skill, EPT-2e, the 30-member ensemble variant, beats the 50-member ECMWF ENS mean on both RMSE and CRPS at virtually every lead time. No AI weather peer ships a comparable productised ensemble. EPT-2e updates four times per day.
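For readers less familiar with CRPS, the sketch below computes it for one ensemble forecast and one observation, using the standard empirical estimator CRPS = E|X − y| − ½·E|X − X′|. It illustrates the metric only; the function name `ensemble_crps` and the toy member values are assumptions, and this is not Jua's or ECMWF's scoring code.

```python
def ensemble_crps(members, observation):
    """Empirical CRPS for one ensemble forecast and one observation (lower is better)."""
    n = len(members)
    if n == 0:
        raise ValueError("ensemble must have at least one member")
    # Mean absolute error of the members against the observation.
    accuracy = sum(abs(x - observation) for x in members) / n
    # Mean absolute difference between all ordered pairs of members.
    spread = sum(abs(a - b) for a in members for b in members) / (n * n)
    return accuracy - 0.5 * spread


# Toy three-member ensemble of 2 m temperature in deg C (illustrative values only).
crps_value = ensemble_crps([1.0, 2.0, 3.0], observation=2.0)
single_member = ensemble_crps([5.0], observation=3.0)
```

With a single member the spread term vanishes and CRPS reduces to the absolute error, which is why CRPS lets ensembles of different sizes be scored against the same observations.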
EPT-1.5, the previous-generation model documented in arXiv:2410.15076, outperforms GraphCast, FuXi, Pangu-Weather, and ECMWF HRES on European wind and temperature. EPT-2 extends this benchmark trajectory.
Operational Economics and Faster Update Cadence
A single EPT-2 inference runs on a single GPU in minutes. This cost sits roughly four orders of magnitude below the €1,000–€20,000 and ~8,400 kWh consumed by a traditional NWP run. EPT-2 was trained on 8 × H100 GPUs over 10 days. Microsoft Aurora required 32 × A100 GPUs over 18 days. The cost asymmetry at training and inference enables a refresh cadence that NWP economics cannot match.
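The figures above can be put through a quick back-of-the-envelope calculation. The sketch below only restates numbers quoted in the text; the variable names are illustrative, and GPU-days is a rough measure because H100 and A100 GPUs are not equivalent hardware.

```python
NWP_COST_EUR = (1_000, 20_000)   # per-run cost range quoted in the text
ORDERS_OF_MAGNITUDE = 4          # "roughly four orders of magnitude"

# Implied per-inference cost range if the gap is exactly 10**4.
implied_inference_cost_eur = tuple(
    cost / 10**ORDERS_OF_MAGNITUDE for cost in NWP_COST_EUR
)


def gpu_days(num_gpus, days):
    """Training budget expressed as GPUs multiplied by days."""
    return num_gpus * days


ept2_gpu_days = gpu_days(8, 10)      # 8 x H100 over 10 days
aurora_gpu_days = gpu_days(32, 18)   # 32 x A100 over 18 days
training_ratio = aurora_gpu_days / ept2_gpu_days
```

Under these assumptions the implied inference cost is on the order of €0.10 to €2.00 per run, and the quoted training budgets are 80 versus 576 GPU-days, a ratio of about 7.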
EPT-2 RR updates up to 24 times per day, which gives traders fresh data at hourly intervals instead of waiting for the 6-hour NWP cycle. For higher-resolution regional needs, EPT-2 HRRR delivers the same hourly cadence at up to 5 km spatial resolution over Europe. On top of these weather forecasts, actual-generation power forecasts refresh every 15 minutes so traders always work with current grid conditions. This speed advantage is structural. A typical Jua run completes approximately 2.5 hours ahead of competing operational runs at the same cycle, so customers see the next forecast before the next traditional run lands.
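The cadence figures translate directly into refresh intervals. A small sketch of the arithmetic, with illustrative function names and ignoring production latency:

```python
MINUTES_PER_DAY = 24 * 60


def refresh_interval_minutes(updates_per_day):
    """Time between consecutive updates for a given daily update count."""
    return MINUTES_PER_DAY / updates_per_day


def updates_per_day(interval_minutes):
    """Daily update count for a given refresh interval."""
    return MINUTES_PER_DAY / interval_minutes


hourly_model_interval = refresh_interval_minutes(24)   # EPT-2 RR at up to 24 updates/day
six_hour_cycle_interval = refresh_interval_minutes(4)  # a 4x/day NWP cycle
power_forecast_updates = updates_per_day(15)           # 15-minute power-forecast refresh
```

The interval is also the longest a trader can be looking at the previous run: 60 minutes for an hourly model against 360 minutes for a 6-hour cycle, while a 15-minute refresh amounts to 96 updates per day.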
The market-sizing economics are direct. A 1 GW wind portfolio that gains four percentage points of forecast accuracy saves approximately €1.5 M per year under typical hedging and imbalance-penalty structures. A 1 GW solar portfolio at the same accuracy gain saves approximately €3 M per year. Multi-GW portfolios scale these figures linearly.
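As a worked example of the linear scaling, the sketch below applies the per-GW figures quoted above to a hypothetical mixed portfolio. It assumes the same four-percentage-point accuracy gain and does not extrapolate to other gains; the portfolio sizes and names are illustrative.

```python
# Annual savings per GW at a four-percentage-point accuracy gain, as quoted in the text.
SAVINGS_EUR_PER_GW_YEAR = {"wind": 1_500_000, "solar": 3_000_000}


def annual_savings_eur(capacity_gw):
    """Scale the per-GW savings linearly by installed capacity per technology."""
    return sum(
        SAVINGS_EUR_PER_GW_YEAR[technology] * gigawatts
        for technology, gigawatts in capacity_gw.items()
    )


# Hypothetical portfolio: 2.5 GW of wind and 1 GW of solar.
example_savings = annual_savings_eur({"wind": 2.5, "solar": 1.0})
```

For this hypothetical portfolio the estimate is €3.75 M from wind plus €3 M from solar, or €6.75 M per year.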
Book a demo to quantify the accuracy gain on your portfolio in under five minutes.
Power-Forecast Coverage and Athena Agent Workflow
Jua for Energy delivers live power forecasts for solar, wind onshore, wind offshore, total wind, total renewables, load, and residual load across Germany, Great Britain, France, the Netherlands, and Belgium. Two models run on the same surface. A Fundamental Model combines EPT weather forecasts with installed-capacity data and runs out to 20 days. An Actual Generation Model refreshes every 15 minutes with a 48-hour horizon. Coverage expands on a weekly basis.
Athena, Jua’s AI agent instrumented with the Jua for Energy tool surface, turns a natural-language request into a briefing, a benchmark, a backtest, or a custom widget. A typical query resolves in approximately 90 seconds. A backtest runs in approximately 5 minutes. Trading houses and quant desks describe Athena as another headcount at no incremental cost. Divergence alerts fire the moment two models disagree on a key variable. Correction alerts fire the moment a model revises its own output. Both alert types surface trade windows as they open, before the market reprices.
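The two alert types can be pictured as simple threshold checks. The sketch below is an illustration of the idea, not Athena's implementation: the `Alert` class, the function names, the variable label, and the thresholds are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    kind: str         # "divergence" or "correction"
    variable: str
    detail: str
    magnitude: float


def divergence_alert(variable, model_a, value_a, model_b, value_b, threshold):
    """Alert when two models disagree on a variable by more than the threshold."""
    gap = abs(value_a - value_b)
    if gap > threshold:
        return Alert("divergence", variable, f"{model_a} vs {model_b}", gap)
    return None


def correction_alert(variable, model, previous_value, latest_value, threshold):
    """Alert when a model revises its own earlier forecast by more than the threshold."""
    revision = abs(latest_value - previous_value)
    if revision > threshold:
        return Alert("correction", variable, f"{model} revised its forecast", revision)
    return None


# Toy 100 m wind speeds in m/s for the same valid time (illustrative values only).
diverged = divergence_alert("wind_speed_100m", "EPT-2", 9.5, "HRES", 7.0, threshold=2.0)
unchanged = correction_alert("wind_speed_100m", "EPT-2", 8.0, 8.5, threshold=1.0)
```

In this toy case the 2.5 m/s gap between the two models exceeds the 2.0 m/s threshold and raises a divergence alert, while the 0.5 m/s self-revision stays below its threshold and raises nothing.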
The industry shift toward sub-hourly forecast updates and probabilistic outputs is already underway. Jua for Energy delivers both natively, without requiring the customer to build the pipeline.
Integration and Developer Experience for Quant Teams
Jua exposes more than 25 models through a REST API, including POST /v1/forecast/data and related endpoints, with Apache Arrow support for large payloads. The official Python SDK installs via pip install jua from PyPI and provides forecast access, hindcast and backtesting, and weather-parameter standardisation across all models. ENTSO-E grid-data integration is available for European power-market data. API documentation is at query.jua.ai/docs. The developer dashboard lives at developer.jua.ai.
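As a rough sketch of what a call to the forecast endpoint might look like, the example below builds (but does not send) a POST request using only the Python standard library. Only the path `/v1/forecast/data` comes from the text above. The host is a placeholder, and the payload field names, model identifier, and bearer-token header are assumptions for illustration; consult the API documentation for the real schema.

```python
import json
import urllib.request

BASE_URL = "https://api.example.invalid"   # placeholder host, not a real Jua endpoint
FORECAST_PATH = "/v1/forecast/data"        # endpoint path named in the text


def build_forecast_request(api_key, model, variables, latitude, longitude, max_lead_time_hours):
    """Build a POST request object; every payload field name here is hypothetical."""
    payload = {
        "model": model,
        "variables": variables,
        "latitude": latitude,
        "longitude": longitude,
        "max_lead_time_hours": max_lead_time_hours,
    }
    return urllib.request.Request(
        BASE_URL + FORECAST_PATH,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )


request = build_forecast_request(
    api_key="YOUR_API_KEY",
    model="ept2",                       # hypothetical model identifier
    variables=["wind_speed_100m"],      # hypothetical variable name
    latitude=52.52,
    longitude=13.40,
    max_lead_time_hours=240,
)
```

Passing the request to `urllib.request.urlopen` would send it. For production use, the official SDK is likely the more convenient route because it standardises weather parameters across models.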
Hindcast data is available across multiple Jua and third-party models for backtesting. Integration that takes a quant team a quarter to build elsewhere stands up in days. The 25-model platform, which includes 10 proprietary EPT-family models plus 15 third-party NWP and AI models such as ECMWF HRES, ENS, AIFS, NOAA GFS, DWD ICON, Aurora, and GraphCast, runs under a unified schema. Swapping or comparing models requires no pipeline re-engineering.
Comparison: Jua for Energy vs Aurora, SAS, DNV, ECMWF HRES/ENS
| Capability | Jua for Energy (EPT-2 / EPT-2e / Athena) | Microsoft Aurora | SAS Energy Forecasting | DNV Forecasting | ECMWF HRES / ENS |
|---|---|---|---|---|---|
| Deterministic skill vs HRES (10 m wind, 100 m wind, 2 m temp, SSRD, 0–240 h) | EPT-2 beats HRES across all lead times and variables | Loses to EPT-2 on 10 m wind, 100 m wind, 2 m temp across 0–240 h; no SSRD output | Resells processed NWP; no published StationBench result | Resells processed NWP; no published StationBench result | The 40-year benchmark; trails EPT-2 on all four variables |
| Ensemble / probabilistic skill | EPT-2e (30 members) beats 50-member ECMWF ENS mean on RMSE and CRPS at virtually every lead time | No productised ensemble | No published ensemble benchmark | No published ensemble benchmark | ENS: 50-member gold standard for probabilistic NWP |
| Update cadence | Up to 24×/day (EPT-2 RR); 15-min actual-generation refresh; EPT-2e 4×/day | Typically 4×/day; no productised operational schedule | Dependent on underlying NWP; typically 2–4×/day | Dependent on underlying NWP; typically 2–4×/day | 2–4×/day |
| Inference cost per simulation | Single GPU, minutes | Similar order of magnitude to Jua for inference | Passes through NWP cost; no published figure | Passes through NWP cost; no published figure | ~8,400 kWh, €1,000–€20,000, HPC, 1–2 hours |
| Power-forecast surface | Solar, wind on/offshore, load, residual load; DE, GB, FR, NL, BE; 15-min refresh; 20-day horizon | Not a native product | Load forecasting; limited renewables coverage | Wind and solar asset-level; no unified multi-country platform | Not a native product |
| Agent / natural-language workflow | Athena: briefings, benchmarks, backtests, widget generation; ~90 s per query | None | None | None | None |
Frequently Asked Questions
Trusting Physics-Constrained Outputs for Trading Decisions
EPT is a spatiotemporal transformer foundation model trained on observational physics. It learns the governing conservation laws of mass, momentum, and energy directly from data in a latent representation that is constrained by those laws at the architectural level. This structure creates a clear difference between EPT and a generic transformer applied naively to atmospheric data. A generic model can produce outputs that violate physical conservation laws. EPT avoids these violations by design.
Validation is external and transparent. EPT-2 is benchmarked against more than 10,000 real ground stations on open-source StationBench with no post-processing or station fine-tuning, and the results are published in technical reports on arXiv (EPT-2: arXiv:2507.09703; EPT-1.5: arXiv:2410.15076). As the benchmark data shows, EPT-2 consistently outperforms HRES. The benchmark is reproducible by any evaluator on the Jua platform in under five minutes. Jua for Energy provides forecasts and analysis. Trading and dispatch decisions remain with the customer.
Running Jua for Energy Alongside ECMWF
Jua for Energy is designed to run alongside an ECMWF subscription, not replace it. Most serious customers keep their ECMWF feed. ECMWF AIFS, ECMWF’s own AI model, runs natively on the Jua platform in the same workspace as EPT-2 under a unified schema. Jua for Energy replaces everything around the ECMWF feed. It removes the in-house GRIB pipeline, the spreadsheet stitching, the manual benchmarking, and the morning-briefing analyst.
The 7–9 a.m. manual prep routine compresses into a single workspace, refreshed up to 24 times a day, where every model, including ECMWF HRES, ENS, AIFS, GFS, Aurora, and EPT, appears on the same screen with one API. The customer retains the raw ECMWF signal and gains the accuracy, cadence, and agent layer that the raw signal alone cannot provide.
How Jua for Energy Differs from Aurora or GraphCast Subscriptions
Aurora and GraphCast are research outputs from large companies’ AI labs. Jua is a foundation model and agent company, and Jua for Energy is a productised platform built on EPT and Athena where Aurora and GraphCast run as guests on the comparison surface. Five concrete differences follow from that distinction.
- Lead times: EPT-2 forecasts at arbitrary lead times natively, while Aurora rolls forward in fixed 6-hour increments that compound error.
- Ensemble: EPT-2e is a productised 30-member ensemble that beats the 50-member ECMWF ENS mean on RMSE and CRPS at virtually every lead time, and no AI peer ships an equivalent.
- Update cadence: EPT-2 RR updates up to 24 times per day, while AI peers are typically updated four times per day.
- Agent: Athena is an AI agent that turns natural-language questions into briefings, benchmarks, backtests, and custom widgets in approximately 90 seconds, and no AI weather peer has an equivalent.
- Benchmarking: the Jua platform is a 25-model benchmarking surface where Aurora, GraphCast, and AIFS all run, so the comparison is built in and reproducible by the customer at any time.
Integration Steps for Quant Teams
Integration starts with pip install jua. The Python SDK on PyPI provides forecast access, hindcast and backtesting, and weather-parameter standardisation across all 25+ models on the platform. The REST API exposes the same model surface with Apache Arrow support for large payloads. Hindcast data is available across multiple Jua and third-party models, which enables multi-year backtests. A backtest runs in approximately 5 minutes via Athena or directly through the SDK for teams that prefer programmatic access.
ENTSO-E grid-data integration is available for European power-market data. Documentation is at docs.jua.ai. The developer dashboard lives at developer.jua.ai. Integration that takes a quant team a quarter to build on a raw research subscription stands up in days on Jua for Energy.
Timeline for Running a Proof of Value
The live benchmark is the standard proof of value and takes under five minutes. A prospect selects a region and a variable that matters to their book, typically a wind-rich region of their home market or a solar-heavy portfolio, then selects their current provider alongside EPT-2. The platform returns a head-to-head accuracy comparison on the spot against more than 10,000 ground stations. The benchmark is available at athena.jua.ai without a sales call.
For teams that want a guided session, a demo covers the benchmark, the power-forecast surface, Athena’s agent capabilities, and the SDK in a single session. After the session, the question typically shifts from “is this real?” to “how fast can we sign?” Physical trading houses and quant funds usually move within weeks.
Conclusion: Live Benchmarks and Portfolio Impact
The live benchmark at athena.jua.ai acts as the deal trigger. A prospect picks a region and a variable, runs the head-to-head comparison against their current provider, and the numbers speak. As the benchmark data shows, EPT-2 consistently outperforms HRES. EPT-2e beats the 50-member ECMWF ENS mean on RMSE and CRPS at virtually every lead time. EPT-2 RR updates up to 24 times per day at a cost roughly four orders of magnitude lower than a traditional NWP run. A 1 GW wind portfolio that gains four percentage points of forecast accuracy saves approximately €1.5 M per year. A 1 GW solar portfolio saves approximately €3 M per year.
Jua is a foundation model and agent company. Jua for Energy is the first applied product. The architecture learns physics. The domain is a variable. Energy is the first market and will not be the last.
Book a demo to see EPT-2 head-to-head against your current forecast provider.
Run benchmarks on your own region and variables on the Jua platform at athena.jua.ai. See your forecasts in less than 5 minutes, head-to-head against 25+ models.
Pipe Jua forecasts into your own models: run pip install jua to start, or read the API documentation at docs.jua.ai.