I’d start by checking whether the strategy is actually robust to execution differences, not just to the bar model.
For XAUUSD systems, the main failure points are spread expansion, session-specific volatility, and entry logic that depends on candle timing. A useful test matrix: run the same logic across different spread regimes, compare fixed-lot against risk-based sizing, and hold back an out-of-sample window that was never touched during tuning.
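Here is a rough sketch of how I'd lay out that matrix before running anything. The spread values, the regime names, and `build_test_matrix` are all placeholders I made up for illustration, so swap in what your broker actually shows on XAUUSD. It only enumerates the combinations; plugging each config into your backtester is up to you.

```python
from itertools import product

# Hypothetical spread regimes in USD per ounce; replace with your broker's numbers.
SPREAD_REGIMES = {"tight": 0.15, "typical": 0.30, "wide": 0.80}
SIZING_MODES = ["fixed_lot", "risk_based"]
# "out_of_sample" must be data that was never used while tuning parameters.
WINDOWS = ["in_sample", "out_of_sample"]


def build_test_matrix():
    """Return one config dict per spread/sizing/window combination."""
    return [
        {
            "spread_name": name,
            "spread": spread,
            "sizing": sizing,
            "window": window,
        }
        for (name, spread), sizing, window in product(
            SPREAD_REGIMES.items(), SIZING_MODES, WINDOWS
        )
    ]


if __name__ == "__main__":
    for cfg in build_test_matrix():
        print(cfg)
```

Running every cell with identical entry logic makes it easier to see which single factor breaks the result, instead of changing several things at once.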
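If the edge disappears only on real-tick data, the setup may be too sensitive to the market open, news spikes, or thin-liquidity hours, since those are the periods where real ticks show spread widening and intrabar moves that a bar model smooths over.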
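One way to narrow that down is to bucket trade P&L by session and compare the bar-model run against the real-tick run. The sketch below assumes you can export trades as (UTC entry time, P&L) pairs from both runs. The session boundaries and the function names are my own placeholders, not a standard, so adjust them to the hours you care about.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical session boundaries as (name, start_hour, end_hour) in UTC.
SESSIONS = [
    ("asia", 0, 7),
    ("london", 7, 13),
    ("new_york", 13, 21),
    ("rollover", 21, 24),
]


def session_of(ts):
    """Map a timezone-aware datetime to a session name."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    hour = ts.astimezone(timezone.utc).hour
    for name, start, end in SESSIONS:
        if start <= hour < end:
            return name
    raise ValueError(f"no session covers hour {hour}")


def pnl_by_session(trades):
    """Sum P&L per session for an iterable of (entry_time, pnl) pairs."""
    totals = defaultdict(float)
    for ts, pnl in trades:
        totals[session_of(ts)] += pnl
    return dict(totals)


def session_degradation(bar_trades, tick_trades):
    """Return real-tick P&L minus bar-model P&L for each session seen."""
    bar = pnl_by_session(bar_trades)
    tick = pnl_by_session(tick_trades)
    return {
        name: tick.get(name, 0.0) - bar.get(name, 0.0)
        for name in sorted(set(bar) | set(tick))
    }
```

If most of the negative difference lands in one bucket, that points at a session-specific execution problem rather than a broken edge overall.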