This is the page you bookmark. Every working ONNX session in MQL5 follows the same four-step lifecycle: load → configure → execute → release. The functions, their flags, and their failure modes are documented here as one coherent reference, with code that compiles against Build 5572.

The four-step lifecycle

The MQL5 ONNX API mirrors the underlying ONNX Runtime session pattern. You create a session from a model file (or embedded buffer), tell it the exact shape of inputs and outputs, run it against real data, then release the resources. Every working EA does these four steps in order:

the lifecycle
    // 1. Load the model into a session
    handle = OnnxCreate(...) or OnnxCreateFromBuffer(...)

    // 2. Tell the session your input/output shapes (mandatory if any dim is dynamic)
    OnnxSetInputShape(handle, index, shape)
    OnnxSetOutputShape(handle, index, shape)

    // 3. Run inference
    OnnxRun(handle, flags, input, output)

    // 4. Release
    OnnxRelease(handle)

Skip step 2 on a model with dynamic dimensions and OnnxRun fails with an error about an undefined input shape. Skip step 4 and the session's memory (GPU or CPU) stays allocated until MT5 restarts. Every step matters.

Step 1 — OnnxCreate / OnnxCreateFromBuffer

You have two ways to load a model: from a file on disk, or from a byte buffer embedded in the EA itself. Both return a session handle of type long, or INVALID_HANDLE on failure.

From a file

OnnxCreate signature
    long OnnxCreate(
       string filename,   // path relative to \MQL5\Files\
       ulong  flags       // ONNX_DEFAULT, ONNX_USE_CPU_ONLY, ONNX_GPU_DEVICE_N, etc.
    );
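As a minimal sketch of the file-based path, the snippet below loads a model in OnInit and aborts initialization if no handle comes back. The file name is only an example borrowed from later on this page; it assumes the model sits at MQL5\Files\models\eurusd.H1.120.onnx, so substitute your own path.

```mql5
// Sketch: load an ONNX model from disk at EA start-up.
// Assumption: the file is stored at MQL5\Files\models\eurusd.H1.120.onnx.
long ExtHandle = INVALID_HANDLE;

int OnInit()
{
   // Backslashes must be doubled inside an MQL5 string literal.
   ExtHandle = OnnxCreate("models\\eurusd.H1.120.onnx", ONNX_DEFAULT);
   if(ExtHandle == INVALID_HANDLE)
   {
      Print("OnnxCreate failed, error ", GetLastError());
      return(INIT_FAILED);
   }
   return(INIT_SUCCEEDED);
}
```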

Two things about the filename to know:

From a buffer (embedded as a resource)

For production EAs, embed the model inside the compiled .ex5 file using the #resource directive. This produces a self-contained binary that doesn't depend on filesystem state on the deploy target:

embedding the model
    // At the top of your EA, alongside other includes.
    // A leading "\\" makes the path relative to the MQL5 data folder;
    // backslashes are doubled inside the string literal.
    #resource "\\Files\\models\\eurusd.H1.120.onnx" as uchar ExtModel[]

    // Then in OnInit():
    long ExtHandle = INVALID_HANDLE;

    int OnInit()
    {
       ExtHandle = OnnxCreateFromBuffer(ExtModel, ONNX_DEFAULT);
       if(ExtHandle == INVALID_HANDLE)
       {
          Print("OnnxCreateFromBuffer failed, error ", GetLastError());
          return(INIT_FAILED);
       }
       return(INIT_SUCCEEDED);
    }

Build 5572 raised the resource size limit to 1 GB, so you can embed transformers, large LSTMs, or anything else that fits in a gigabyte. The trade-off is iteration speed: any change to the .onnx file requires recompiling the EA. For active development, load from disk. For production, embed.
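One way to get both behaviors from a single source file is a compile-time switch. The sketch below is an illustration only: USE_EMBEDDED_MODEL and LoadModel are hypothetical names, and it assumes the model is stored under MQL5\Files\models\ so that both branches read the same file.

```mql5
// Sketch: switch between an embedded model and a model loaded from disk.
// Comment out the next line during development to skip re-embedding on every change.
#define USE_EMBEDDED_MODEL

#ifdef USE_EMBEDDED_MODEL
// Leading "\\" = path relative to the MQL5 data folder.
#resource "\\Files\\models\\eurusd.H1.120.onnx" as uchar ExtModel[]
#endif

// Returns a session handle, or INVALID_HANDLE on failure.
long LoadModel()
{
#ifdef USE_EMBEDDED_MODEL
   return(OnnxCreateFromBuffer(ExtModel, ONNX_DEFAULT));
#else
   // Path is relative to MQL5\Files\.
   return(OnnxCreate("models\\eurusd.H1.120.onnx", ONNX_DEFAULT));
#endif
}
```

With the define commented out, a retrained model is picked up the next time the EA initializes, with no recompile.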

The flags argument

The second parameter to both functions is a bitmask of flags. The common combinations:

Combination                                  What it does
ONNX_DEFAULT                                 Use GPU if available; fall back to CPU. The right default.
ONNX_USE_CPU_ONLY                            Force CPU. Use when your GPU is below Turing or when CPU is faster for your model.
ONNX_GPU_DEVICE_0                            Pin to GPU 0 (multi-GPU systems). Use ONNX_GPU_DEVICE_1 through ONNX_GPU_DEVICE_7 for other devices.
ONNX_USE_CPU_ONLY | ONNX_LOGLEVEL_VERBOSE    CPU only, with full debug logging. Use during integration to see what the runtime is doing.
ONNX_ENABLE_PROFILING                        Dumps execution-provider trace to JSON. Combine with any of the above.
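Because the flags are a bitmask, they can be assembled at run time with bitwise OR. The sketch below uses the flag names exactly as listed in the table; the two EA inputs and the BuildSessionFlags helper are hypothetical, and the ulong type simply follows the signature shown earlier.

```mql5
// Sketch: build the session flags from EA inputs.
input bool InpForceCpu = false;   // hypothetical input: force CPU inference
input bool InpVerbose  = false;   // hypothetical input: verbose runtime logging

ulong BuildSessionFlags()
{
   ulong flags = ONNX_DEFAULT;            // start from the default behavior
   if(InpForceCpu)
      flags |= ONNX_USE_CPU_ONLY;         // add the CPU-only bit
   if(InpVerbose)
      flags |= ONNX_LOGLEVEL_VERBOSE;     // add verbose logging
   return(flags);
}
```

The result can then be passed as the second argument of OnnxCreate or OnnxCreateFromBuffer in place of a fixed constant.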

For the complete flag reference and what changed in Build 5572, see the Build 5572 explainer.

Step 2 — OnnxSetInputShape / OnnxSetOutputShape

When you exported the model from Python, you almost certainly declared some dimensions as dynamic — typically the batch dimension. ONNX stores this as a symbolic name ("batch") rather than a number. MQL5's runtime needs that number resolved before OnnxRun can execute. This is the step everyone forgets.

setting shapes after OnnxCreate
    // Input: batch=1, sequence=120, features=1
    const long input_shape[] = {1, 120, 1};
    if(!OnnxSetInputShape(ExtHandle, 0, input_shape))
    {
       Print("OnnxSetInputShape failed, error ", GetLastError());
       return(INIT_FAILED);
    }

    // Output: batch=1, predictions=1
    const long output_shape[] = {1, 1};
    if(!OnnxSetOutputShape(ExtHandle, 0, output_shape))
    {
       Print("OnnxSetOutputShape failed, error ", GetLastError());
       return(INIT_FAILED);
    }

The signature for both is the same:
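As the snippet above shows, both calls take the session handle, a zero-based input or output index, and an array of dimensions, and both return a bool. A minimal sketch of a helper built on that shared argument list follows; SetModelShapes is a hypothetical name, and it assumes a model with exactly one input and one output.

```mql5
// Sketch: apply one input shape and one output shape to a session.
// Assumes a single-input, single-output model (index 0 for both).
bool SetModelShapes(const long handle, const long &in_shape[], const long &out_shape[])
{
   if(handle == INVALID_HANDLE)
      return(false);

   if(!OnnxSetInputShape(handle, 0, in_shape))
   {
      Print("OnnxSetInputShape failed, error ", GetLastError());
      return(false);
   }
   if(!OnnxSetOutputShape(handle, 0, out_shape))
   {
      Print("OnnxSetOutputShape failed, error ", GetLastError());
      return(false);
   }
   return(true);
}
```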

If you don't know your model's exact shape, open the .onnx file in Netron — the visualization tool shows every input and output with its declared shape. Dynamic dimensions appear with their symbolic name; concrete dimensions appear as numbers.

Common mistake

If your model was trained with batch_first=True on a PyTorch LSTM, the input shape is (batch, seq, features). If you set it to (seq, batch, features) in MQL5, the runtime accepts it but the values are misaligned — the model predicts garbage. Match the Python order exactly.
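To make the ordering concrete, here is a hedged sketch for a model whose input is (batch=1, seq, features). MQL5 matrices are stored row by row, so one time step per row and one feature per column gives the same memory order as a (1, seq, features) tensor. The two features (close and tick volume) and the constants SEQ_LEN and N_FEATURES are illustrative assumptions, and normalization is left out.

```mql5
// Sketch: fill a (seq, features) matrix that matches an input shape of
// {1, SEQ_LEN, N_FEATURES}, i.e. PyTorch batch_first=True with batch = 1.
#define SEQ_LEN     120
#define N_FEATURES  2

bool BuildInput(matrixf &x)
{
   MqlRates rates[];
   // Start at bar 1 so only closed bars are used; index 0 is the oldest bar.
   if(CopyRates(_Symbol, PERIOD_CURRENT, 1, SEQ_LEN, rates) != SEQ_LEN)
      return(false);

   if(!x.Resize(SEQ_LEN, N_FEATURES))
      return(false);

   for(int t = 0; t < SEQ_LEN; t++)
   {
      x[t][0] = (float)rates[t].close;         // feature 0 at time step t
      x[t][1] = (float)rates[t].tick_volume;   // feature 1 at time step t
   }
   return(true);
}
```

The matching shape array would be {1, SEQ_LEN, N_FEATURES}. Filling the matrix as (features, seq) instead has the same element count, so it would pass the shape check while feeding the model scrambled data.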

Step 3 — OnnxRun

The actual inference. You pass the session handle, runtime flags, input data, and an output container for the result:

OnnxRun in action
    void OnTick()
    {
       // Build the input data: same features and normalization as training
       matrixf input_data(1, 120);   // float32 matrix, shape (1, 120)
       vectorf output_data(1);       // float32 vector, length 1

       // ... fill input_data with the model's expected features ...

       if(!OnnxRun(ExtHandle, ONNX_NO_CONVERSION, input_data, output_data))
       {
          Print("OnnxRun failed, error ", GetLastError());
          return;
       }

       double prediction = output_data[0];
       // ... act on the prediction ...
    }

Type matching matters

Your model was exported with a specific input dtype, usually float32. MQL5 matches that with matrixf (float32 matrix) or vectorf (float32 vector). If you pass a double-based matrix to a float32 model, MQL5 silently converts the data, unless you set the ONNX_NO_CONVERSION flag.

The flag tells the runtime "the types already match; pass the data through unconverted." When they do match, it's a few percent faster on the hot path. When they don't, OnnxRun fails immediately instead of converting behind your back, which is usually what you want.
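A minimal sketch of the explicit route: cast double-precision features into a float32 container yourself, then run with ONNX_NO_CONVERSION so any remaining mismatch surfaces as a failed call. RunFloat32 is a hypothetical helper; it assumes a float32 model with one input and one scalar output, and shapes already set on the session.

```mql5
// Sketch: run a float32 model from double-precision features.
bool RunFloat32(const long handle, const double &features[], float &prediction)
{
   const int n = ArraySize(features);
   if(handle == INVALID_HANDLE || n == 0)
      return(false);

   // Cast explicitly so the container dtype matches the model's float32 input.
   vectorf x(n);
   for(int i = 0; i < n; i++)
      x[i] = (float)features[i];

   vectorf y(1);
   // ONNX_NO_CONVERSION: data is passed as-is, with no automatic conversion.
   if(!OnnxRun(handle, ONNX_NO_CONVERSION, x, y))
   {
      Print("OnnxRun failed, error ", GetLastError());
      return(false);
   }
   prediction = y[0];
   return(true);
}
```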

OnnxRun runtime flags

The flags passed to OnnxRun are per-call and override session-level flags from OnnxCreate. Common uses:

Step 4 — OnnxRelease

Every session you create must be released. Call OnnxRelease in OnDeinit, or any time the EA is shutting down a model session:

cleanup
    void OnDeinit(const int reason)
    {
       if(ExtHandle != INVALID_HANDLE)
       {
          OnnxRelease(ExtHandle);
          ExtHandle = INVALID_HANDLE;
       }
    }

If you forget this, the model stays loaded in GPU (or CPU) memory until MT5 itself shuts down. For a single session per EA, the leak is bounded. For an EA that creates sessions in a loop — don't ask why anyone would, but it happens — you'll exhaust GPU memory in minutes.
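If an EA really does need to open and close sessions repeatedly, one defensive option is to tie the handle to an object's lifetime so that the release cannot be skipped. The class below is a sketch under that assumption; COnnxSession is a hypothetical name, not part of the MQL5 standard library.

```mql5
// Sketch: a small wrapper that releases its session when it is destroyed
// or when a new model is opened on the same object.
class COnnxSession
{
private:
   long m_handle;

public:
   COnnxSession(void) : m_handle(INVALID_HANDLE) {}
   ~COnnxSession(void) { Close(); }

   // Load a model from MQL5\Files\; any previous session is released first.
   bool Open(const string filename)
   {
      Close();
      m_handle = OnnxCreate(filename, ONNX_DEFAULT);
      return(m_handle != INVALID_HANDLE);
   }

   void Close(void)
   {
      if(m_handle != INVALID_HANDLE)
      {
         OnnxRelease(m_handle);
         m_handle = INVALID_HANDLE;
      }
   }

   long Handle(void) const { return(m_handle); }
};
```

Calling Open twice on the same object then swaps models without leaving the first session allocated.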

Full working EA template

Putting it all together — an EA that loads an embedded model, sets shapes, runs inference on each new bar, and cleanly releases on shutdown:

EURUSD_onnx_ea.mq5 — complete template
    #include <Trade\Trade.mqh>

    // Leading "\\" = path relative to the MQL5 data folder (MQL5\Files\models\...)
    #resource "\\Files\\models\\eurusd.H1.120.onnx" as uchar ExtModel[]

    #define SAMPLE_SIZE 120

    long   ExtHandle = INVALID_HANDLE;
    CTrade trade;

    int OnInit()
    {
       // 1. Load embedded model. ONNX_DEFAULT = use GPU if available.
       ExtHandle = OnnxCreateFromBuffer(ExtModel, ONNX_DEFAULT);
       if(ExtHandle == INVALID_HANDLE)
       {
          Print("OnnxCreateFromBuffer failed, error ", GetLastError());
          return(INIT_FAILED);
       }

       // 2. Set input shape (batch=1, seq=120, features=1)
       const long in_shape[] = {1, SAMPLE_SIZE, 1};
       if(!OnnxSetInputShape(ExtHandle, 0, in_shape))
       {
          Print("OnnxSetInputShape failed, ", GetLastError());
          return(INIT_FAILED);
       }

       //    ...and output shape (batch=1, predictions=1)
       const long out_shape[] = {1, 1};
       if(!OnnxSetOutputShape(ExtHandle, 0, out_shape))
       {
          Print("OnnxSetOutputShape failed, ", GetLastError());
          return(INIT_FAILED);
       }
       return(INIT_SUCCEEDED);
    }

    void OnTick()
    {
       // Only run on a new bar (skip ticks within the same bar)
       static datetime last_bar = 0;
       datetime current_bar = iTime(NULL, PERIOD_CURRENT, 0);
       if(current_bar == last_bar)
          return;
       last_bar = current_bar;

       // 3. Build input: last 120 closes, normalized.
       //    "input" is a reserved word in MQL5, hence input_data / output_data.
       matrixf input_data(1, SAMPLE_SIZE);
       vectorf output_data(1);
       double  closes[SAMPLE_SIZE];
       if(CopyClose(NULL, PERIOD_CURRENT, 1, SAMPLE_SIZE, closes) != SAMPLE_SIZE)
          return;

       // ... apply same normalization used in training ...
       for(int i = 0; i < SAMPLE_SIZE; i++)
          input_data[0][i] = (float)closes[i];

       // 4. Run inference
       if(!OnnxRun(ExtHandle, ONNX_NO_CONVERSION, input_data, output_data))
       {
          Print("OnnxRun failed, ", GetLastError());
          return;
       }

       double prediction = output_data[0];
       // ... act on prediction (buy/sell/skip) ...
    }

    void OnDeinit(const int reason)
    {
       // Release the session and clear the handle
       if(ExtHandle != INVALID_HANDLE)
       {
          OnnxRelease(ExtHandle);
          ExtHandle = INVALID_HANDLE;
       }
    }

This template compiles against Build 5572. Put your .onnx file in MQL5\Files\models\ before compiling (the #resource directive embeds it at build time), set SAMPLE_SIZE to match your model's sequence length, and you have a working ONNX-backed EA.
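The normalization placeholder in the template is where most silent errors creep in. As one possible illustration, the sketch below applies z-score scaling with statistics carried over from training. The scheme itself, the helper name FillNormalized, and the two default values are all assumptions; use whatever transform and constants your training script actually used.

```mql5
// Sketch: z-score normalization with statistics saved from training.
// The default values below are placeholders, not real training statistics.
input double InpTrainMean = 1.0850;   // hypothetical mean of the training closes
input double InpTrainStd  = 0.0125;   // hypothetical standard deviation

bool FillNormalized(const double &closes[], matrixf &dst)
{
   const int n = ArraySize(closes);
   if(n == 0 || InpTrainStd <= 0.0)
      return(false);

   if(!dst.Resize(1, n))
      return(false);

   // Scale each close with the training statistics, then cast to float32.
   for(int i = 0; i < n; i++)
      dst[0][i] = (float)((closes[i] - InpTrainMean) / InpTrainStd);
   return(true);
}
```

In the template, a call to this helper would replace the plain cast loop that copies closes into the input matrix.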

When something goes wrong

Common failures by step:

Symptom | Cause | Fix
OnnxCreate returns INVALID_HANDLE, error 5800 | File not found or invalid ONNX | ERROR 5800 guide
CUBLAS_STATUS_ARCH_MISMATCH in log | GPU below Turing | CUBLAS error fix
OnnxSetInputShape returns false | Shape array wrong length, or session is invalid | Verify shape with Netron; check handle is valid
OnnxRun returns false silently | Type mismatch, or NaN in input | Check input dtype matches model; check for NaN/Inf in features
Inference returns wrong values | Normalization mismatch between training and runtime | Normalization fix
Runs on CPU even with ONNX_DEFAULT | Silent CUDA fallback | Verify GPU is being used
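Two of the rows above (a silent OnnxRun failure and NaN in the input) can be caught before they cost you a debugging session. The sketch below checks the features for NaN/Inf and clears the last error before the call so the printed code belongs to OnnxRun. AllFinite and CheckedRun are hypothetical helper names, and a float32 model with shapes already set is assumed.

```mql5
// Sketch: validate the input and report the error code when inference fails.
bool AllFinite(matrixf &m)
{
   for(ulong r = 0; r < m.Rows(); r++)
      for(ulong c = 0; c < m.Cols(); c++)
         if(!MathIsValidNumber(m[r][c]))   // false for NaN and +/-Inf
            return(false);
   return(true);
}

bool CheckedRun(const long handle, matrixf &x, vectorf &y)
{
   if(!AllFinite(x))
   {
      Print("Input contains NaN or Inf, skipping inference");
      return(false);
   }

   ResetLastError();   // make sure the code printed below comes from OnnxRun
   if(!OnnxRun(handle, ONNX_NO_CONVERSION, x, y))
   {
      Print("OnnxRun failed, error ", GetLastError());
      return(false);
   }
   return(true);
}
```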