On January 30, 2026, MetaQuotes shipped MetaTrader 5 Build 5572 — the first stable release that exposes CUDA-accelerated ONNX inference directly inside MQL5. For developers running PyTorch or Keras models against tick data in an Expert Advisor, this is the most consequential platform update since native ONNX support landed in 2023.
But the release notes leave a lot unsaid. Which GPUs actually work. Which flags were renamed (or quietly removed). Where the profiling JSON lands. What happens when OnnxRun silently falls back to CPU. This article walks through every change in Build 5572 that touches the ONNX subsystem, with the gotchas the notes don't mention.
The headline change, in one paragraph
Before Build 5572, every OnnxRun call was a CPU operation. The runtime existed inside the MT5 process, but it had no path to the GPU. As of January 30, 2026, MQL5 ships with a CUDA-enabled ONNX Runtime binary, picks up your NVIDIA driver, and routes the compatible nodes of your graph to the GPU automatically. For a transformer-sized model or a long-context LSTM, the speedup is roughly an order of magnitude. For a small gradient-boosted tree, the speedup is approximately nothing — the data-transfer overhead eats the gain. The point isn't that GPU is always faster; the point is that you finally have the choice.
Flags that were renamed, added, removed
The ENUM_ONNX_FLAGS enum was reshuffled in Build 5572. If you copy a snippet from an article written before January 2026, it may not compile. Here's the full delta:
Renamed
- Force-CPU flag. The old ONNX_CUDA_DISABLE is gone. The replacement is ONNX_USE_CPU_ONLY. Same semantics (pass it to OnnxCreate, OnnxCreateFromBuffer or OnnxRun to skip the GPU), but the name finally reflects the intent.
- Debug logging. ONNX_DEBUG_LOGS is deprecated. Replace it with one of three new levels: ONNX_LOGLEVEL_VERBOSE (every line), ONNX_LOGLEVEL_WARNING (warnings + errors), or ONNX_LOGLEVEL_ERROR (errors only). The verbose level is what you want during a fresh integration; the warning level is what you want on a live EA.
Added
- GPU device selection. Eight new flags: ONNX_GPU_DEVICE_0 through ONNX_GPU_DEVICE_7. On a multi-GPU rig, you tell MQL5 exactly which card to load the session on. The behavior when multiple device flags are passed at once: the runtime uses the lowest index. If you pass an index that doesn't exist on the system (you asked for _3 on a single-GPU machine), the runtime selects a device for you instead of failing.
- Profiling. ONNX_ENABLE_PROFILING is the single most useful new flag for debugging. It dumps a JSON trace of the session to /MQL5/Files/OnnxProfileReports/. We cover what to read in that JSON in a dedicated guide.
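The device-flag rules above are easier to reason about as a small function. The following is a Python model of the behavior as this article describes it, not MetaQuotes code. The name resolve_gpu_device and the use of None for "the runtime picks" are assumptions made for illustration.

```python
from typing import Iterable, Optional

MAX_DEVICE_FLAGS = 8  # ONNX_GPU_DEVICE_0 .. ONNX_GPU_DEVICE_7


def resolve_gpu_device(requested: Iterable[int], gpu_count: int) -> Optional[int]:
    """Model the device-flag rules described in this article.

    Returns the index the session would be pinned to, or None when the
    choice is left to the runtime (no flag given, or the index is absent).
    """
    indices = sorted(set(requested))
    for index in indices:
        if not 0 <= index < MAX_DEVICE_FLAGS:
            raise ValueError(f"no such flag: ONNX_GPU_DEVICE_{index}")
    if not indices:
        return None  # no device flag passed: the runtime auto-picks
    lowest = indices[0]  # several flags at once: the lowest index wins
    if lowest >= gpu_count:
        return None  # index not present on this machine: the runtime picks
    return lowest


print(resolve_gpu_device([2, 0, 5], gpu_count=4))  # 0
print(resolve_gpu_device([3], gpu_count=1))        # None
```

The second call is the case to watch: asking for a card that isn't there does not fail, so a mistyped index can go unnoticed unless you check the profiling report.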
Unchanged but worth re-reading
- ONNX_NO_CONVERSION still skips MQL5's automatic input-type conversion. If your model expects float32 and you pass an MQL5 matrixf, the conversion is a no-op anyway, but with the flag set, the runtime won't even check, which is measurably faster on hot paths.
- ONNX_DEFAULT remains "use whatever's available, prefer GPU." This is what you want 90% of the time.
Which GPUs work — and which throw on load
This is the section that catches people. Build 5572's CUDA backend is compiled against a specific minimum compute capability, and any card below that floor will fail to load the ONNX session with a hard error.
The floor is the Turing architecture. That means:
- Works: GTX 1660 (Turing), the entire RTX 20-series, RTX 30-series, RTX 40-series, RTX 50-series, Quadro RTX, Tesla T4, plus the data-centre A100/H100/H200.
- Doesn't work: any Pascal card (GTX 1070, 1080, Titan X-Pascal, Tesla P40, Quadro P-series), any Maxwell card (GTX 970/980, Quadro M-series, Tesla M40), any Kepler card, and obviously anything older than that.
If your card is below Turing, the session creation throws a very specific error: CUBLAS failure 8: CUBLAS_STATUS_ARCH_MISMATCH. The error message itself reads cryptically, but it always means the same thing: your GPU's compute capability is below 7.5 (Turing) and the cuBLAS shipped with MT5 refuses to initialize. We documented the full diagnosis and fix in a dedicated error page.
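You can check a machine against the 7.5 floor before MT5 ever loads a session. The sketch below parses rows in the shape that recent NVIDIA drivers print for nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader. It runs on a hand-written sample rather than calling nvidia-smi, and the helper names are hypothetical.

```python
from typing import List, Tuple

TURING_FLOOR = (7, 5)  # compute capability 7.5, the floor named in this article


def parse_compute_caps(csv_text: str) -> List[Tuple[str, Tuple[int, int]]]:
    """Parse 'name, compute_cap' rows into (name, (major, minor)) pairs."""
    cards = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma only, in case a card name contains one.
        name, cap = (field.strip() for field in line.rsplit(",", 1))
        major, minor = cap.split(".")
        cards.append((name, (int(major), int(minor))))
    return cards


def meets_floor(cap: Tuple[int, int], floor: Tuple[int, int] = TURING_FLOOR) -> bool:
    """Compare (major, minor) tuples instead of floats or strings."""
    return cap >= floor


# Hand-written sample rows, not captured output.
SAMPLE = """\
NVIDIA GeForce GTX 1080, 6.1
NVIDIA GeForce GTX 1660, 7.5
NVIDIA GeForce RTX 3090, 8.6
"""

for name, cap in parse_compute_caps(SAMPLE):
    verdict = "ok" if meets_floor(cap) else "below the Turing floor"
    print(f"{name}: {cap[0]}.{cap[1]} -> {verdict}")
```

Comparing tuples rather than strings avoids the trap where "12.0" sorts before "7.5".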
Heads up
"Has an NVIDIA driver installed" is not enough. The driver can be current, the card can show up in nvidia-smi, and the ONNX session can still fail with the architecture error. The constraint is on the silicon, not the driver.
The new profiling JSON
Adding ONNX_ENABLE_PROFILING to the flags you pass to OnnxCreate turns on session-level profiling. When the EA stops (or you call OnnxRelease), a JSON file is written to /MQL5/Files/OnnxProfileReports/. The filename pattern is <EA name>_<date>_<time>.json — one report per session.
The file is structured as an array of events, in the same format used by Chrome's chrome://tracing. Each node in your graph gets a row showing the execution provider it ran on (CUDAExecutionProvider or CPUExecutionProvider), the time it took, and where it sits in the timeline. Two practical uses:
- Confirming the GPU is actually being used. If you wanted CUDA but every row says CPUExecutionProvider, the runtime fell back. Common cause: an unsupported op in the model graph.
- Finding the bottleneck. Long rows are slow nodes. If a single op is eating 60% of inference time, it's usually a candidate for replacement (e.g. an LSTM cell that could be a GRU, or a layer that could be folded into its neighbor at export time).
We walk through reading the JSON line-by-line in the profiling guide.
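Both checks can be scripted. The sketch below assumes the report follows the usual ONNX Runtime profile layout: a JSON array of events, each with a dur in microseconds and an args object carrying provider and op_name. Those field names are an assumption about MT5's output, so check them against one of your own reports. The sample data is hand-written.

```python
import json
from collections import defaultdict
from typing import Dict, Optional


def time_by_provider(profile_json: str) -> Dict[str, int]:
    """Sum event durations (microseconds) per execution provider."""
    totals: Dict[str, int] = defaultdict(int)
    for event in json.loads(profile_json):
        provider = event.get("args", {}).get("provider")
        if provider:  # skip session-level events that carry no provider
            totals[provider] += event.get("dur", 0)
    return dict(totals)


def slowest_node(profile_json: str) -> Optional[dict]:
    """Return the longest event that ran on a provider, or None if there is none."""
    nodes = [e for e in json.loads(profile_json)
             if e.get("args", {}).get("provider")]
    return max(nodes, key=lambda e: e.get("dur", 0), default=None)


# Hand-written sample, not real MT5 output.
SAMPLE = json.dumps([
    {"cat": "Session", "name": "model_run", "dur": 950, "args": {}},
    {"cat": "Node", "name": "lstm_kernel_time", "dur": 600,
     "args": {"op_name": "LSTM", "provider": "CUDAExecutionProvider"}},
    {"cat": "Node", "name": "gemm_kernel_time", "dur": 250,
     "args": {"op_name": "Gemm", "provider": "CUDAExecutionProvider"}},
    {"cat": "Node", "name": "topk_kernel_time", "dur": 100,
     "args": {"op_name": "TopK", "provider": "CPUExecutionProvider"}},
])

totals = time_by_provider(SAMPLE)
print(totals)
if set(totals) == {"CPUExecutionProvider"}:
    print("every node ran on CPU: the runtime fell back")
print("slowest op:", slowest_node(SAMPLE)["args"]["op_name"])
```

In the sample, one op runs on the CPU provider while the rest run on CUDA. That is a partial fallback, which is easy to miss if you only look at the first few rows.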
The 1 GB embedded-model limit
MQL5 lets you embed an .onnx file as a resource of the compiled EA, using the #resource directive. In Build 5572, the per-resource size cap rose to 1 GB. For practical purposes, that ceiling is gone — a 1 GB model is enormous for any trading workload, large enough to embed BERT-scale transformers without external file management.
The implication is workflow rather than capability: you can now ship a self-contained .ex5 compiled binary that includes the entire model, with no separate file to copy to /MQL5/Files/ on the production VPS. This matters for prop-firm setups where you may not have full filesystem access on the broker's terminal.
One catch
If you use #resource, you have to recompile the EA every time the model changes. For active development, loading the model from /MQL5/Files/ at runtime is more iterative. For production, embed it.
Why your first run is slow now
Up to Build 5571, the ONNX Runtime DLL shipped inside the MT5 installer. Starting with Build 5572, the runtime is fetched on demand — the first time an EA on the machine calls any ONNX function, the platform downloads the appropriate runtime binary. The advantage is that MetaQuotes can push runtime updates (security fixes, new operator support) without releasing a full platform update.
The practical consequence: your first OnnxCreate on a fresh MT5 install may stall for 5–20 seconds while the runtime downloads. After that, the binary is cached and subsequent calls skip the download. If you're deploying to a new VPS, run a smoke-test EA once before the live one starts; otherwise the live EA's OnInit() may time out while it waits for that first download.
What to change in your code today
If you have an existing ONNX-using EA from 2024 or 2025, here's the minimum migration:
- Replace ONNX_CUDA_DISABLE with ONNX_USE_CPU_ONLY if you were using it. Code that still references the old name will not compile in 5572.
- Replace ONNX_DEBUG_LOGS with ONNX_LOGLEVEL_* at the desired verbosity. Production EAs should use ONNX_LOGLEVEL_WARNING to keep the Experts log clean.
- Decide your default: GPU or CPU. If your model is small (gradient-boosted trees, small MLPs), the CPU is faster after warm-up: add ONNX_USE_CPU_ONLY and stop worrying about GPU compatibility. If your model has any sequence layers (LSTM, GRU, transformer), benchmark both with ONNX_ENABLE_PROFILING before deciding.
- On a multi-GPU box, pin the device explicitly. ONNX_GPU_DEVICE_0 beats relying on the runtime's auto-pick because it makes the choice reproducible across restarts.
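If you have more than a couple of source files, the two renames are mechanical enough to script. The following is a minimal Python sketch using the flag names from this article. It maps ONNX_DEBUG_LOGS to ONNX_LOGLEVEL_WARNING as one possible choice (the production recommendation above), and the function name migrate_flags is hypothetical.

```python
import re

# Old flag -> replacement, per the rename list in this article.
# ONNX_LOGLEVEL_WARNING is only one of the three possible targets for
# ONNX_DEBUG_LOGS; change it if you want a different verbosity.
RENAMES = {
    "ONNX_CUDA_DISABLE": "ONNX_USE_CPU_ONLY",
    "ONNX_DEBUG_LOGS": "ONNX_LOGLEVEL_WARNING",
}

# \b leaves longer identifiers such as MY_ONNX_DEBUG_LOGS_COPY untouched,
# because "_" counts as a word character.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def migrate_flags(source: str) -> str:
    """Return MQL5 source text with the pre-5572 flag names replaced."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


old_line = 'long h = OnnxCreate("model.onnx", ONNX_CUDA_DISABLE | ONNX_DEBUG_LOGS);'
print(migrate_flags(old_line))
```

The substitution is purely textual, so it also rewrites the old names inside comments and string literals. Review the diff before compiling.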
For the full working pattern — OnnxCreateFromBuffer, OnnxSetInputShape, OnnxRun, OnnxRelease — refer to the API cheat-sheet, updated for Build 5572.
You can't add a GPU to a forex VPS.
Forex VPS providers (QuantVPS, ForexVPS, NYCServers) sell network proximity to broker matching engines — they're CPU-only by design. To run a CUDA-enabled ONNX session, you need a local NVIDIA workstation or a GPU cloud instance. We compare four cloud providers honestly:
Affiliate disclosure: some links above are affiliate links. We earn a commission if you sign up — recommendations are independent.
The bottom line
Build 5572 doesn't change what you can do in MQL5 with ONNX — you could load and run models before. It changes how fast and at what model size. Architectures that were impractical on CPU (a few hundred milliseconds per inference) are now sub-10 ms on a Turing-or-newer GPU. The cost is the new constraint: your hardware has to be 2018-or-newer NVIDIA, your code has to use the new flag names, and your first run on a fresh machine will stall while the runtime downloads.
For anyone shipping a serious ML-backed Expert Advisor in 2026, this build is the new baseline.