If your workstation or cloud instance has two or more NVIDIA GPUs, Build 5572 lets you pick exactly which one runs each ONNX session. Eight new flags — ONNX_GPU_DEVICE_0 through ONNX_GPU_DEVICE_7 — cover up to an 8-GPU rig. This article covers the two practical use cases for them: pinning for reproducibility, and load-distributing across multiple EAs.
The default behavior (no flag)
If you pass ONNX_DEFAULT on a multi-GPU system, the runtime picks a device for you — usually device 0, but not guaranteed. On a single-GPU box this is fine. On a multi-GPU box, two problems appear:
- Reproducibility: different EA instances may pick different devices, making debugging confusing.
- Resource contention: if everything piles on device 0, devices 1–7 sit idle while device 0 thermal-throttles.
Pinning a session to a specific device
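A minimal sketch of what pinning might look like, assuming the device flag is passed in the flags argument of OnnxCreate alongside the log-level flag. The file name model.onnx and the variable g_session are placeholders for illustration, not names from this article.

```mql5
// Sketch only: pin one ONNX session to CUDA device 1.
// "model.onnx" and g_session are hypothetical names.
long g_session = INVALID_HANDLE;

int OnInit()
  {
   // Device pin plus verbose logging, so the selected device is reported in the Experts log.
   g_session = OnnxCreate("model.onnx", ONNX_GPU_DEVICE_1 | ONNX_LOGLEVEL_VERBOSE);
   if(g_session == INVALID_HANDLE)
     {
      PrintFormat("OnnxCreate failed, error %d", GetLastError());
      return(INIT_FAILED);
     }
   return(INIT_SUCCEEDED);
  }

void OnDeinit(const int reason)
  {
   // Release the session so its GPU memory is freed when the EA is removed.
   if(g_session != INVALID_HANDLE)
      OnnxRelease(g_session);
  }
```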
With ONNX_GPU_DEVICE_1 passed at session creation, verify the pin in the Experts log by adding ONNX_LOGLEVEL_VERBOSE: you'll see CUDA device 1 selected: <card name>. Alternatively, run nvidia-smi while the EA is running; the terminal64.exe process should show GPU memory usage on device 1.
Use case 1: reproducibility on a dual-GPU dev box
You have one workstation with two RTX 4070s. When you're debugging an EA's inference behavior, you want the session to land on the same physical card every time, so timing and memory profiles are comparable across runs.
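One way to get that, sketched under assumptions: define the device flag once as a DEV_GPU macro and use it wherever a session is created. The file name model.onnx and the variable g_session are hypothetical, and the rest of the EA is omitted.

```mql5
// Sketch only: DEV_GPU is the single place where the dev box's card is chosen.
#define DEV_GPU ONNX_GPU_DEVICE_0

long g_session = INVALID_HANDLE; // hypothetical session handle

int OnInit()
  {
   // Every session in this EA uses the same macro, so every run lands on the same card.
   g_session = OnnxCreate("model.onnx", DEV_GPU);
   if(g_session == INVALID_HANDLE)
     {
      PrintFormat("OnnxCreate failed, error %d", GetLastError());
      return(INIT_FAILED);
     }
   return(INIT_SUCCEEDED);
  }

void OnDeinit(const int reason)
  {
   if(g_session != INVALID_HANDLE)
      OnnxRelease(g_session);
  }
```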
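With the device flag defined once as DEV_GPU and used for every session, every run uses the same card. If you upgrade one card to an RTX 4090 and it enumerates as device 1, you can change DEV_GPU to ONNX_GPU_DEVICE_1 and use the faster card.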
Use case 2: load-distribute across multiple EAs on one host
You run four EAs simultaneously on a dual-GPU server. Two should land on device 0, two on device 1, to avoid memory contention and to use both cards.
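A possible shape for this, assuming InpGpuDevice is an integer input holding the GPU index. The helper DeviceFlag, the file name model.onnx, and the fallback to ONNX_DEFAULT for out-of-range values are illustrative choices, not part of the documented API.

```mql5
// Sketch only: choose the GPU per EA instance through an input parameter.
input int InpGpuDevice = 0; // GPU index as listed by nvidia-smi (0-7)

long g_session = INVALID_HANDLE; // hypothetical session handle

// Hypothetical helper: map an index to its device flag.
// A switch avoids any assumption about the flags' numeric values.
uint DeviceFlag(const int index)
  {
   switch(index)
     {
      case 0: return(ONNX_GPU_DEVICE_0);
      case 1: return(ONNX_GPU_DEVICE_1);
      case 2: return(ONNX_GPU_DEVICE_2);
      case 3: return(ONNX_GPU_DEVICE_3);
      case 4: return(ONNX_GPU_DEVICE_4);
      case 5: return(ONNX_GPU_DEVICE_5);
      case 6: return(ONNX_GPU_DEVICE_6);
      case 7: return(ONNX_GPU_DEVICE_7);
     }
   // Out-of-range index: fall back to letting the runtime pick (a choice made for this sketch).
   return(ONNX_DEFAULT);
  }

int OnInit()
  {
   g_session = OnnxCreate("model.onnx", DeviceFlag(InpGpuDevice));
   if(g_session == INVALID_HANDLE)
     {
      PrintFormat("OnnxCreate failed on GPU index %d, error %d", InpGpuDevice, GetLastError());
      return(INIT_FAILED);
     }
   return(INIT_SUCCEEDED);
  }

void OnDeinit(const int reason)
  {
   if(g_session != INVALID_HANDLE)
      OnnxRelease(g_session);
  }
```

Keeping the index as a plain integer means the same compiled EA can be attached four times with different values, and each instance passes exactly one device flag.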
Set InpGpuDevice in each EA's input parameters at attach time. EAs 1–2 use device 0, EAs 3–4 use device 1.
Edge cases and rules
- Non-existent device: ONNX_GPU_DEVICE_5 on a single-GPU box doesn't error. The runtime auto-picks a device. Safe to leave in code.
- Multiple device flags passed: ONNX_GPU_DEVICE_0 | ONNX_GPU_DEVICE_3 resolves to the lowest, device 0. Don't OR them deliberately.
- Combined with ONNX_USE_CPU_ONLY: CPU wins. The device pin is ignored.
- Device numbering matches nvidia-smi. If nvidia-smi lists card "RTX 4090" as device 1, ONNX_GPU_DEVICE_1 in MQL5 selects that exact card.
One EA, one device — not two
The flags pin a session to a device. They do not split a single model across multiple GPUs (that's "model parallelism," and ONNX Runtime doesn't expose it through MQL5). Each EA gets one GPU. Use multiple GPUs to run multiple EAs in parallel, not to accelerate a single one.
If you genuinely need a model that's too big for one GPU, the answer is a bigger GPU — rent an A100 or H100 hourly. See the GPU cloud guide.