You created your ONNX session with ONNX_DEFAULT — meaning "use GPU if available." The EA loads without errors. Inference runs. The model produces predictions. Everything looks fine.

But how do you know the GPU is actually being used?

The default behavior of ONNX Runtime is to silently fall back to CPU when the CUDA execution provider can't handle something — an unsupported operator, a memory constraint, a missing kernel. The EA keeps running, just at CPU speed. Unless you check, you might be paying for GPU hardware that does nothing.

Three methods, from fastest to most thorough.

Method 1: nvidia-smi while EA runs (10 seconds)

The crudest but fastest check: while your EA is running, open a command prompt and run:

Command Prompt
> nvidia-smi
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|=============================================================================|
|    0   N/A  N/A      4892    C+G   terminal64.exe                   340 MiB |
+-----------------------------------------------------------------------------+

If terminal64.exe appears in the process list with non-zero GPU memory, MT5 has loaded something onto the GPU. That's a strong signal — but not definitive. The session is initialized on GPU; individual OnnxRun calls could still fall back.

Better: watch in real time

Run nvidia-smi -l 1 — updates every second. Then trigger an inference and watch the GPU utilization spike. If utilization stays at 0% while inferences happen, CPU fallback is occurring.
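The same real-time check can be scripted so you don't have to watch the table by eye. What follows is a minimal Python sketch meant to run outside MT5. It assumes nvidia-smi is on your PATH and relies on its CSV query mode (--query-gpu=utilization.gpu). The helper names and the 5% threshold are illustrative choices, not part of any API. Keep in mind that utilization is reported for the whole GPU, so activity from other applications counts too.

```python
import subprocess
import time


def parse_utilization(csv_text):
    """Parse the output of
    `nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits`
    into one integer percentage per GPU."""
    values = []
    for line in csv_text.splitlines():
        line = line.strip()
        if line.isdigit():  # skips blank lines and "[N/A]"-style entries
            values.append(int(line))
    return values


def sample_utilization(seconds=10, interval=1.0):
    """Poll nvidia-smi and return utilization samples (percent) for GPU 0."""
    samples = []
    for _ in range(int(seconds / interval)):
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=utilization.gpu",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
        per_gpu = parse_utilization(out)
        if per_gpu:
            samples.append(per_gpu[0])
        time.sleep(interval)
    return samples


def gpu_was_used(samples, threshold=5):
    """True if any sample exceeds the (arbitrary) utilization threshold."""
    return any(s > threshold for s in samples)
```

To use it, call sample_utilization(seconds=10) while the EA is producing inferences and pass the result to gpu_was_used. A very small model may finish each inference between two samples, so a negative result here is weaker evidence than a positive one.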

Method 2: verbose logging (one line of code)

Add ONNX_LOGLEVEL_VERBOSE to the flags you pass to OnnxCreate or OnnxCreateFromBuffer:

force ONNX to log its decisions
ExtHandle = OnnxCreateFromBuffer( ExtModel, ONNX_DEFAULT | ONNX_LOGLEVEL_VERBOSE );

On session creation, the Experts log will fill with the runtime's internal decisions. Two lines to look for:

good outcome
[INFO] Adding execution provider: CUDAExecutionProvider
[INFO] CUDA device 0 selected: NVIDIA GeForce RTX 4070
bad outcome
[WARNING] CUDAExecutionProvider failed to initialize
[INFO] Adding default CPU execution provider

If you see the second pattern, GPU initialization failed. Check the lines above it for the underlying reason (usually a GPU compute capability that is too low, or a driver issue).
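With hundreds of verbose lines per session, scanning by eye is error-prone. Here is a small Python sketch that classifies a log excerpt using the two patterns shown above. It assumes you have copied the relevant Experts log lines into a list of strings, and that the wording in your build matches these examples; the exact messages may differ, so treat the matched substrings as assumptions. The function name is hypothetical. It keys on the failure warning rather than on the CPU provider line, since the warning is the less ambiguous of the two.

```python
def classify_provider_log(lines):
    """Return 'cuda', 'cpu_fallback', or 'unknown' for a verbose log excerpt.

    The substrings matched here mirror the sample log lines in this article
    and are assumptions about the exact wording of your build's output.
    """
    cuda_added = False
    for line in lines:
        text = line.lower()
        # A CUDA initialization failure decides the outcome immediately.
        if "cudaexecutionprovider" in text and "failed" in text:
            return "cpu_fallback"
        if "adding execution provider" in text and "cudaexecutionprovider" in text:
            cuda_added = True
    return "cuda" if cuda_added else "unknown"
```

An 'unknown' result means neither pattern was found, which usually indicates the excerpt does not cover session creation.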

Important: remove ONNX_LOGLEVEL_VERBOSE for production. It writes hundreds of lines per session. Keep it only during integration.

Method 3: profiling JSON (definitive)

The verbose log tells you which execution provider was added. The profiling JSON tells you which execution provider actually ran each node, so it is the only one of the three methods that confirms GPU execution per node rather than just at session level.

enable profiling
ExtHandle = OnnxCreateFromBuffer( ExtModel, ONNX_DEFAULT | ONNX_ENABLE_PROFILING );

Run the EA for a minute, then stop it. The runtime writes a JSON to MQL5\Files\OnnxProfileReports\<EA name>_<date>_<time>.json. Open it in any text editor and search for "args":{"provider":. Each occurrence is a node execution — tagged with either "CUDAExecutionProvider" or "CPUExecutionProvider".

If all nodes are CUDAExecutionProvider: clean GPU execution. Optimal.

If most are CUDA, some are CPU: partial fallback, with the Memcpy overhead we covered in the Memcpy nodes article. Often acceptable.

If all are CPU: total fallback. The GPU isn't being used. Diagnose below.
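Counting occurrences by hand gets tedious on larger models, so the three-way reading above can be automated. The Python sketch below assumes the profile is a JSON array of event objects in which node executions carry an "args" object with a "provider" field, as in the search string above. The function names and the UTF-8 encoding are assumptions for illustration; adjust them if your file differs.

```python
import json
from collections import Counter


def count_providers(events):
    """Count profiling events per execution provider."""
    counts = Counter()
    for event in events:
        args = event.get("args") if isinstance(event, dict) else None
        if isinstance(args, dict) and "provider" in args:
            counts[args["provider"]] += 1
    return counts


def verdict(counts):
    """Map provider counts onto the three outcomes described in the article."""
    cuda = counts.get("CUDAExecutionProvider", 0)
    cpu = counts.get("CPUExecutionProvider", 0)
    if cuda and not cpu:
        return "clean GPU execution"
    if cuda and cpu:
        return "partial fallback"
    if cpu:
        return "total fallback"
    return "no provider-tagged events found"


def analyze_profile(path):
    """Load a profiling report from disk and return (counts, verdict)."""
    with open(path, "r", encoding="utf-8") as fh:  # encoding is an assumption
        events = json.load(fh)
    counts = count_providers(events)
    return counts, verdict(counts)
```

Call analyze_profile with the path to a report from MQL5\Files\OnnxProfileReports. The counts are per execution event, not per unique node, so a node that ran 500 times contributes 500 to its provider's total.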

If CPU fallback is happening, why?

Three causes, in order of frequency:

  1. Your GPU is below Turing (compute capability 7.5). The most common cause. Diagnosis: nvidia-smi --query-gpu=name,compute_cap --format=csv. Fix: see the CUBLAS error guide.
  2. You set ONNX_USE_CPU_ONLY explicitly somewhere. Maybe in older code you copied. Search your .mq5 for the flag — remove it from OnnxCreate and OnnxRun if present.
  3. The model has operators the CUDA execution provider doesn't support. The runtime falls back gracefully but completely if it can't get a clean GPU graph. Verbose logging will show the unsupported op. Fix: re-export with simpler ops, or use onnx-simplifier.
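For the first cause, the compute capability check can be turned into a quick pass/fail test. The Python sketch below parses the CSV printed by the diagnosis command from item 1 and compares each GPU against the 7.5 threshold stated there. It assumes the output has a header row followed by one "name, compute_cap" row per GPU; the function names are hypothetical.

```python
import csv
import io


def parse_compute_caps(csv_text):
    """Parse `nvidia-smi --query-gpu=name,compute_cap --format=csv` output
    into a list of (name, (major, minor)) tuples."""
    rows = list(csv.reader(io.StringIO(csv_text), skipinitialspace=True))
    gpus = []
    for row in rows[1:]:  # the first row is the CSV header
        if len(row) < 2:
            continue
        name, cap = row[0].strip(), row[1].strip()
        major, _, minor = cap.partition(".")
        if major.isdigit() and minor.isdigit():
            gpus.append((name, (int(major), int(minor))))
    return gpus


def below_turing(cap, minimum=(7, 5)):
    """True if the (major, minor) capability is below the 7.5 threshold."""
    return cap < minimum
```

Comparing (major, minor) tuples instead of floats avoids misordering values such as 7.10 and 7.5 if a two-digit minor version ever appears.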