Fix: LSTM ONNX export fails or silently produces wrong outputs

You exported a PyTorch model with an nn.LSTM inside and one of two bad things happened:

Mode A — loud failure: torch.onnx.export threw a RuntimeError about control flow or scripting.
Mode B — silent failure: The export succeeded and the .onnx file exists, but when you run inference in onnxruntime the outputs differ from PyTorch's by far more than floating-point noise.

Both modes have the same root cause and the same fix.

The root cause

PyTorch's LSTM module uses internal control flow to iterate over the time dimension. When you ask ONNX export to handle a dynamic sequence length, the exporter has to capture that loop as a runtime Loop op. The capture uses torch.jit.script, which struggles with LSTM internals — sometimes failing, sometimes "succeeding" with a graph that has subtly wrong outputs.

The fix: static seq_len at export

working LSTM export

    import torch

    model.eval()  # export in inference mode

    # seq=120 is baked in through the dummy input's shape
    dummy = torch.randn(1, 120, 1)  # (batch, seq_len, features)

    torch.onnx.export(
        model,
        dummy,
        "model.onnx",
        opset_version=17,
        input_names=["input"],
        output_names=["output"],
        dynamic_axes={  # ONLY batch dynamic
            "input": {0: "batch"},
            "output": {0: "batch"},
        },
        do_constant_folding=True,
    )
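The export call assumes a `model` already exists. For readers who want something concrete to run it against, here is a minimal sketch of what such a model might look like. `TinyLSTM`, its hidden size, and the single-value output head are hypothetical choices for illustration, not the model from the full guide; the only thing taken from the text is the (batch, 120, 1) input shape.

```python
import torch
import torch.nn as nn


class TinyLSTM(nn.Module):
    """Hypothetical stand-in for `model`: LSTM over a window, one value out."""

    def __init__(self, n_features=1, hidden_size=16):
        super().__init__()
        # batch_first=True so inputs are (batch, seq_len, features)
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)          # (batch, seq_len, hidden_size)
        return self.head(out[:, -1])   # last timestep only -> (batch, 1)


model = TinyLSTM()
model.eval()  # inference mode before export

with torch.no_grad():
    y = model(torch.randn(1, 120, 1))  # same shape as the export dummy
```

Because the forward pass returns a single tensor, it lines up with the single "output" name in the export call.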

The full guide with all the patterns is at PyTorch LSTM to ONNX.

Validating you fixed it

Don't ship to MT5 without checking PyTorch's output matches ONNX's output:

validation

    import numpy as np
    import onnxruntime as ort
    import torch

    model.eval()
    x = torch.randn(1, 120, 1)  # same shape as the export dummy
    y_torch = model(x).detach().numpy()

    sess = ort.InferenceSession("model.onnx")
    y_onnx = sess.run(None, {"input": x.numpy()})[0]

    assert np.allclose(y_torch, y_onnx, atol=1e-4), "LSTM still broken!"

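If the assert passes, every output element agrees within np.allclose's tolerance (atol=1e-4, plus NumPy's default rtol=1e-5 scaled by the ONNX value), and the export matches PyTorch for this input. If it fails, the static-seq_len pattern wasn't applied correctly.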
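A bare assert only says pass or fail. When it fails, it helps to know how large the gap is and where it occurs. The sketch below is one way to report that; `compare_outputs` is a hypothetical helper name, and it applies the same elementwise criterion `np.allclose` uses, with `rtol` set to NumPy's default.

```python
import numpy as np


def compare_outputs(y_ref, y_test, atol=1e-4, rtol=1e-5):
    """Report how far y_test is from y_ref, using np.allclose's criterion."""
    y_ref = np.asarray(y_ref, dtype=np.float64)
    y_test = np.asarray(y_test, dtype=np.float64)
    if y_ref.shape != y_test.shape:
        raise ValueError(f"shape mismatch: {y_ref.shape} vs {y_test.shape}")

    abs_diff = np.abs(y_ref - y_test)
    # np.allclose(a, b) checks |a - b| <= atol + rtol * |b| elementwise
    tolerance = atol + rtol * np.abs(y_test)
    worst = np.unravel_index(abs_diff.argmax(), abs_diff.shape)

    return {
        "max_abs_diff": float(abs_diff.max()),
        "worst_index": tuple(int(i) for i in worst),
        "ok": bool(np.all(abs_diff <= tolerance)),
    }
```

Called as `compare_outputs(y_torch, y_onnx)`, a `max_abs_diff` around 1e-6 would point to ordinary float32 noise, while a value near the scale of the outputs themselves suggests the graph is computing something different.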

If you need variable seq_len at inference

Three options, none of them ideal:

1. Pad to maximum length + train with masking. Most common pattern.
2. Export multiple models, one per length, load all in MQL5.
3. Try dynamo=True (the newer exporter). May work, may not, always validate.
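For the padding option, the inference side might look like the sketch below. It is an illustration under stated assumptions, not a prescribed recipe: `pad_window` is a hypothetical helper, left-padding with zeros is one choice among several (it keeps the real data at the end of the window, which suits a model that reads the last timestep), and the model must have been trained on inputs padded the same way.

```python
import numpy as np

SEQ_LEN = 120  # must match the length baked into the exported model


def pad_window(window, seq_len=SEQ_LEN, pad_value=0.0):
    """Left-pad a (timesteps, features) window to (1, seq_len, features).

    Returns the padded batch and a (1, seq_len) mask where 1.0 marks
    real timesteps and 0.0 marks padding.
    """
    window = np.asarray(window, dtype=np.float32)
    if window.ndim != 2:
        raise ValueError("expected shape (timesteps, features)")

    window = window[-seq_len:]  # too long: keep only the most recent steps
    t, features = window.shape

    padded = np.full((seq_len, features), pad_value, dtype=np.float32)
    mask = np.zeros(seq_len, dtype=np.float32)
    if t > 0:
        padded[seq_len - t:] = window  # real data sits at the end
        mask[seq_len - t:] = 1.0

    return padded[None, ...], mask[None, ...]
```

The padded array can be fed to the session as the "input" tensor. The mask is only useful if the model was built to take one; an exported graph with a single input would ignore it.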

For 95% of MT5 use cases, the lookback window is constant by design. Pick a number and freeze it.
