A "market-structure classifier" sorts each bar into one of three states: uptrend, downtrend, or range. It doesn't predict price; it labels the current regime. Used as a filter on top of an existing rule-based EA, it can suppress trades in conditions where the EA historically loses. This article walks through the design, the labels, the features, the export, and the MQL5 integration.
Why a filter, not a forecaster
Price forecasting is hard (efficient markets, low signal-to-noise). Regime classification is easier because the labels are coarser and more stable. Pairing a known-edge rule-based EA with an ML-driven "don't trade when the model says range" filter is usually a higher-payoff use of ML than trying to predict price directly.
How to label market structure for training
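For each bar in your training history, generate a label using a rule you'd never want the EA to encode, because the rule looks into the future. A typical rule of this kind compares the close a fixed number of bars ahead (the lookahead) with the current close: a move above a threshold is an uptrend, a move below the negative threshold is a downtrend, and anything in between is range.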
The lookahead and threshold are knobs. Tune by checking the class balance — with sensible settings you want roughly 1/3 of bars in each class.
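A minimal sketch of such a forward-looking labeling rule, assuming a plain list of closing prices. The function names `label_bars` and `class_balance`, the integer encoding of the three classes, and the default lookahead of 20 bars and threshold of 1% are illustrative assumptions, not values from the original article.

```python
from collections import Counter

# Hypothetical label encoding for the three regimes.
UP, DOWN, RANGE = 1, -1, 0


def label_bars(closes, lookahead=20, threshold=0.01):
    """Label each bar by its forward return over `lookahead` bars.

    Returns a list the same length as `closes`. The last `lookahead`
    bars get None because their future is not known yet.
    """
    labels = []
    for i, price in enumerate(closes):
        j = i + lookahead
        if j >= len(closes):
            labels.append(None)
            continue
        fwd_return = closes[j] / price - 1.0
        if fwd_return > threshold:
            labels.append(UP)
        elif fwd_return < -threshold:
            labels.append(DOWN)
        else:
            labels.append(RANGE)
    return labels


def class_balance(labels):
    """Fraction of labelled bars that fall in each class."""
    known = [lab for lab in labels if lab is not None]
    if not known:
        return {}
    counts = Counter(known)
    return {cls: counts.get(cls, 0) / len(known) for cls in (UP, DOWN, RANGE)}
```

Calling `class_balance(label_bars(closes, lookahead, threshold))` for a few settings gives a quick way to see how far each one is from an even three-way split. Bars labelled `None` should be dropped before training.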
Feature engineering
The features that matter for regime classification (not price forecasting):
- Returns over multiple horizons: last 1, 5, 10, 20, 60 bars.
- Realized volatility: rolling std of returns over 5, 20, 60 bars.
- Momentum oscillators: RSI(14), Stochastic, etc. The model can use their raw values as features even if you ignore the usual rule-based signals derived from them.
- Range proxies: high-low spread normalized by ATR.
Roughly 15–25 engineered features are enough. Beyond that, the model tends to overfit unless you have a large amount of training data.
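The feature families listed above can be sketched in plain Python as follows. This is an illustration under stated assumptions: the function name `build_features`, the feature names, and the simple-average ATR over 14 bars are hypothetical choices, and the momentum oscillators are left out to keep the example short. Every value is computed from bars up to and including bar `i`, so nothing leaks from the future.

```python
import statistics


def build_features(highs, lows, closes, i,
                   horizons=(1, 5, 10, 20, 60),
                   vol_windows=(5, 20, 60),
                   atr_period=14):
    """Feature dict for bar `i`, using only bars up to and including `i`."""
    needed = max(max(horizons), max(vol_windows), atr_period)
    if i < needed:
        raise ValueError(f"bar {i} needs at least {needed} earlier bars")

    feats = {}

    # Returns over multiple horizons.
    for h in horizons:
        feats[f"ret_{h}"] = closes[i] / closes[i - h] - 1.0

    # Realized volatility: sample std of 1-bar returns over each window.
    for w in vol_windows:
        window = [closes[k] / closes[k - 1] - 1.0
                  for k in range(i - w + 1, i + 1)]
        feats[f"vol_{w}"] = statistics.stdev(window)

    # Range proxy: current high-low spread divided by a simple-average ATR.
    true_ranges = [
        max(highs[k] - lows[k],
            abs(highs[k] - closes[k - 1]),
            abs(lows[k] - closes[k - 1]))
        for k in range(i - atr_period + 1, i + 1)
    ]
    atr = sum(true_ranges) / atr_period
    feats["range_over_atr"] = (highs[i] - lows[i]) / atr if atr > 0 else 0.0
    return feats
```

Whatever definitions you settle on, the runtime side has to compute the same quantities in the same order, so it helps to keep the feature list in one place.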
Training a 3-class classifier
LightGBM is a natural fit: it is tree-based, robust, fast to train, and exports cleanly to ONNX.
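A minimal training sketch, assuming the `lightgbm` and `numpy` packages are installed and that features and labels are already aligned row by row in time order. The helper names, the 80/20 split, the gap, and the hyperparameters are illustrative assumptions rather than the article's settings. The gap drops the last training rows before the validation period, because their forward-looking labels would otherwise be computed from prices inside the validation window.

```python
def time_ordered_split(rows, labels, train_frac=0.8, gap=20):
    """Chronological train/validation split with a gap between the two.

    `gap` should be at least the labeling lookahead so that no training
    label depends on prices from the validation period.
    """
    if len(rows) != len(labels):
        raise ValueError("rows and labels must have the same length")
    cut = int(len(rows) * train_frac)
    train_end = max(cut - gap, 0)
    return rows[:train_end], labels[:train_end], rows[cut:], labels[cut:]


def train_regime_classifier(X_train, y_train, X_val, y_val):
    """Fit a 3-class LightGBM model (hyperparameters are placeholders)."""
    # Imported here so the split helper works without these packages.
    import numpy as np
    import lightgbm as lgb

    model = lgb.LGBMClassifier(
        objective="multiclass",
        n_estimators=300,
        learning_rate=0.05,
        num_leaves=15,
    )
    model.fit(
        np.asarray(X_train), np.asarray(y_train),
        eval_set=[(np.asarray(X_val), np.asarray(y_val))],
    )
    return model
```

After fitting, `model.predict_proba(X)` returns one probability column per class in the order given by `model.classes_`, which is the order the runtime side needs to know when it reads the model output.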
See LightGBM to ONNX for the full export workflow.
Using it as a filter in MQL5
Run the same EA with and without the filter on the same period in the Strategy Tester. The filter should reduce the trade count and improve win rate and drawdown; that is its whole purpose. If it doesn't, either the labels aren't predictive or the features aren't matched between training and runtime (see normalization).
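Before running the full comparison in the Strategy Tester, a rough offline check is possible if you have a list of historical trades tagged with the regime the model predicted at entry. The sketch below is an assumption-laden approximation: the `(pnl, regime)` pair format, the function names, and the encoding of the range class as `0` are all hypothetical. It simply drops the blocked trades and recomputes the summary statistics.

```python
RANGE = 0  # hypothetical encoding of the "range" class


def summarize(pnls):
    """Trade count, win rate, and max drawdown of the cumulative P&L."""
    equity = peak = max_dd = 0.0
    wins = 0
    for pnl in pnls:
        equity += pnl
        peak = max(peak, equity)
        max_dd = max(max_dd, peak - equity)
        if pnl > 0:
            wins += 1
    n = len(pnls)
    return {
        "trades": n,
        "win_rate": wins / n if n else 0.0,
        "max_drawdown": max_dd,
    }


def apply_filter(trades, blocked=(RANGE,)):
    """Keep the P&L of trades whose entry-bar regime is not blocked.

    `trades` is a list of (pnl, regime_at_entry) pairs.
    """
    return [pnl for pnl, regime in trades if regime not in blocked]
```

Comparing `summarize([pnl for pnl, _ in trades])` with `summarize(apply_filter(trades))` shows the direction of the effect. It is only an approximation, since skipping a trade in a live EA can change which later trades are taken, so the Strategy Tester run remains the real test.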