OCE Trading

Three Weeks in FX: A Graveyard of Good Ideas (and One Survivor)

March 27, 2026

We spent three weeks exploring every major FX pair we could get our hands on. EUR/USD, GBP/USD, USD/JPY — the usual suspects. Built a feature engine, ran walk-forward validation across six folds, tested dozens of entry conditions.

The results were depressing.

Not because our signals were bad — they actually had mild predictive power across most pairs. The problem was simpler and more brutal: transaction costs eat everything.

On a 15-minute timeframe, a typical FX major gives you maybe 7 basis points of raw edge per trade. Sounds reasonable until you add up the round-trip cost: spread (3-5bp) plus slippage (2-3bp), which is already 5-8bp before you count the delay between signal and execution. You're left with approximately nothing. We ran the numbers six different ways, added filters, tweaked parameters, and kept arriving at the same place: breakeven, give or take a few basis points that could easily be noise.
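To make that arithmetic concrete, here is a back-of-envelope sketch in Python. The function name is made up for illustration, and the delay cost is left out because the figures above don't put a number on it, so the real picture is somewhat worse than this.

```python
def net_edge_bp(raw_edge_bp, spread_bp, slippage_bp):
    """Raw edge minus round-trip cost, all in basis points (delay cost not modelled)."""
    return raw_edge_bp - (spread_bp + slippage_bp)

# 7bp raw edge against the cheapest and most expensive costs quoted above.
best_case = net_edge_bp(7, spread_bp=3, slippage_bp=2)
worst_case = net_edge_bp(7, spread_bp=5, slippage_bp=3)

print(best_case, worst_case)  # 2 -1
```

A net edge somewhere between +2bp and -1bp, before execution delay, is what "approximately nothing" looks like in numbers.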

Four pairs. Thirteen rules that passed walk-forward validation. Every single one of them died when we applied realistic cost assumptions.

[Figure: Raw Edge vs Transaction Cost by Currency Pair]

The chart tells the whole story. Three major pairs sitting at net zero. Gold sticking out like a sore thumb — in a good way.

Gold is a different animal

Then we looked at XAU/USD, and the math changed completely.

Gold moves. A lot. The daily range on gold is roughly 2-3x what you see on EUR/USD. That matters because our signals generate edge in proportion to volatility — if the underlying moves 14bp on average per signal instead of 7bp, suddenly you have room to pay the spread and still keep something for yourself.
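One rough way to see why doubling the move matters is to look at cost as a fraction of raw edge. The sketch below reuses the 5-8bp round-trip range quoted earlier for FX majors and applies it to gold purely for illustration; gold's actual costs in basis points are not stated here, so treat that as an assumption.

```python
def cost_to_edge_ratio(cost_bp, edge_bp):
    """Fraction of the raw edge consumed by round-trip costs."""
    return cost_bp / edge_bp

# Assumed for illustration: the same 5-8bp cost range for both instruments.
for label, edge_bp in [("FX major", 7), ("gold", 14)]:
    low = cost_to_edge_ratio(5, edge_bp)
    high = cost_to_edge_ratio(8, edge_bp)
    print(f"{label}: costs eat {low:.0%} to {high:.0%} of raw edge")
```

Under that assumption, costs take roughly 71% to 114% of a 7bp edge but only about 36% to 57% of a 14bp edge.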

Same signal framework, same timeframe, same validation method. Gold came back with edge that survived costs. Not by a slim margin either — the cost-adjusted returns were meaningfully positive across all six walk-forward folds and every single calendar year we tested (going back to 2019).
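For readers who haven't set one up, a walk-forward split can be sketched like this. It is a minimal illustration assuming an expanding training window and equal-sized test blocks; the function name and the 50% initial training fraction are placeholders, not a description of the exact setup behind the results above.

```python
def walk_forward_folds(n_bars, n_folds=6, train_frac=0.5):
    """Yield (train, test) index ranges; each test block follows its training data in time."""
    first_test = int(n_bars * train_frac)
    test_size = (n_bars - first_test) // n_folds
    for k in range(n_folds):
        test_start = first_test + k * test_size
        # The last fold absorbs any leftover bars.
        test_end = test_start + test_size if k < n_folds - 1 else n_bars
        yield range(0, test_start), range(test_start, test_end)

for train, test in walk_forward_folds(1200):
    print(f"train bars 0-{train.stop - 1}, test bars {test.start}-{test.stop - 1}")
```

The property that matters is that every test block sits strictly after the data used to fit the rule, so no fold gets to peek at the future.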

So we're running a paper trader on gold now, connected to a demo account via FIX protocol. Fifteen-minute bars, scanning at each bar close. We'll let it accumulate data for a couple of weeks before drawing any conclusions.

What didn't work (so you don't have to try it)

Here's where it gets educational. We tried several "improvements" that looked promising on paper and failed spectacularly in practice. Sharing these because they're common ideas that other systematic traders might waste time on.

1. Weighted Continuous Alignment (WCA)

The idea: instead of using sign-based discretization of returns across timeframes (which reduces continuous values to just +1/0/−1), use the raw return magnitudes. Weight them by z-score or percentile rank. More information should mean better signals, right?

Reality: hit rate dropped from 58.6% to 49%.

Every single continuous variant performed worse than a coin flip. The sign() function isn't losing information — it's filtering noise. A tiny positive return on the 4-hour chart and a massive positive return carry the same directional information. The magnitude just adds noise from random volatility fluctuations.

This one genuinely surprised us. The lesson: discretization can be a feature, not a bug. Sometimes less information is more.
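The difference between the two approaches fits in a few lines. This is a simplified sketch, not the actual feature engine: the three timeframes, the volatility numbers, and both function names are hypothetical, and dividing by volatility stands in for the z-score weighting.

```python
def sign(x):
    """Return +1, 0, or -1 depending on the sign of x."""
    return (x > 0) - (x < 0)

def sign_alignment(returns):
    """Each timeframe votes +1, 0, or -1; magnitude is discarded."""
    return sum(sign(r) for r in returns)

def weighted_alignment(returns, vols):
    """Each timeframe contributes its return scaled by its volatility."""
    return sum(r / v for r, v in zip(returns, vols))

# Hypothetical returns on three timeframes (say 15m, 1h, 4h) with made-up volatilities.
vols = [0.0004, 0.0008, 0.0016]
small_4h = [0.0002, 0.0005, 0.0001]  # tiny positive 4-hour return
large_4h = [0.0002, 0.0005, 0.0040]  # massive positive 4-hour return

print(sign_alignment(small_4h), sign_alignment(large_4h))
print(weighted_alignment(small_4h, vols), weighted_alignment(large_4h, vols))
```

Both cases get the same sign-based score, while the weighted score swings with the size of the 4-hour move. If that size is mostly random volatility, the weighted version is feeding noise into the entry decision.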

2. Ehlers Early Onset Trend Detector

John Ehlers' work is respected for good reason. The Early Onset Trend Detector uses a cascade of filters — high-pass to remove drift, SuperSmoother to remove noise, Automatic Gain Control for normalization, and a Quotient transform to detect trend onset.

We implemented it with a grid search across three parameters: lookback period (48/96/192 bars), smoothing factor K (0.3/0.5/0.7), and threshold (0.3 to 0.9, four values). Thirty-six parameter combinations total (3 × 3 × 4).
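Enumerating a grid like this is straightforward with itertools. The lookback and K values are the ones listed above; the threshold is only given as a range, so the four evenly spaced values below are an assumption chosen because they produce the stated 36 combinations.

```python
from itertools import product

lookbacks = [48, 96, 192]           # bars
k_values = [0.3, 0.5, 0.7]          # smoothing factor K
thresholds = [0.3, 0.5, 0.7, 0.9]   # assumed spacing; only the 0.3-0.9 range is stated

grid = [
    {"lookback": lb, "k": k, "threshold": th}
    for lb, k, th in product(lookbacks, k_values, thresholds)
]

def combos_beating_baseline(pf_by_index, baseline_pf=2.40):
    """Given a profit factor per grid index, return the combos above the baseline."""
    return [grid[i] for i, pf in pf_by_index.items() if pf > baseline_pf]

print(len(grid))  # 36
```

Each combination would then be run through the same backtest and its profit factor compared against the baseline.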

Best standalone result: profit factor of 1.80. Our baseline was 2.40. Not a single combination in the entire grid came close. The heatmap is a sea of orange and red — every cell below the baseline.

We also tested it as a leading indicator — does the Ehlers signal fire before our existing entry condition? About 46% of the time, yes, with a median lead of 8 bars (~2 hours). But when we built a "leading entry" strategy (enter on Ehlers, confirm with our signal), the trade count dropped so much that monthly returns fell despite better per-trade metrics.

The Ehlers detector is a legitimate tool. But adding it to a system that already works didn't improve things. More parameters, more complexity, same or worse outcome.

3. Sample Entropy as a regime filter

The theory: Sample Entropy (SampEn) measures the complexity/regularity of a time series. Low entropy = regular/trending, high entropy = noisy/random. Filter entries to only trade in low-entropy (trending) regimes.

No improvement whatsoever. The entropy calculation added computational cost and another parameter to tune, without any measurable benefit to entry quality. Trending regimes as defined by SampEn didn't correlate with our signal's success rate.
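For reference, Sample Entropy itself is short to write down. The sketch below is a plain-Python version of the standard definition, using the common defaults of template length m = 2 and a tolerance of 0.2 standard deviations; those defaults and the test series are assumptions for illustration, not the settings used in the experiment.

```python
import math
import random

def sample_entropy(series, m=2, r_frac=0.2):
    """SampEn = -ln(A / B): B counts template pairs of length m within tolerance r
    (Chebyshev distance), A counts the same for length m + 1."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    r = r_frac * std

    def count_matches(length):
        # Use n - m templates for both lengths so A and B are comparable.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(series[i + k] - series[j + k]) <= r for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return -math.log(a / b)

random.seed(0)
regular = [math.sin(2 * math.pi * t / 25) for t in range(300)]  # clean cycle
noisy = [random.gauss(0, 1) for _ in range(300)]                # white noise

print(sample_entropy(regular), sample_entropy(noisy))
```

The sine wave scores far lower than the white noise, which is the behaviour the regime filter was counting on. Note the quadratic cost in window length, which is where the extra computation comes from.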

4. Session and volatility filters

This one was the most tempting trap.

[Figure: The Filter Trap: Better PF ≠ Better Returns]

Look at that chart. The blue line (monthly returns) goes down as the bars (profit factor) go up. Every filter that made the per-trade metrics look better also killed the trade count:

London/NY session only: Profit factor jumped from 2.40 to 3.47. Monthly returns dropped from 13.4% to 8.2% because you're cutting out 40% of your trades.
High ATR (Average True Range) regime filter: Same story. Better per-trade quality, fewer trades, lower total returns.
Day-of-week filters: Monday and Friday avoidance showed slight improvement, but not enough to justify the reduced sample size.

We kept optimizing ourselves into fewer and fewer trades with better and better metrics that generated less and less actual money. It's the precision-vs-recall tradeoff applied to trading. A system that takes 20 perfect trades a month can easily underperform one that takes 50 decent trades.
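A toy example shows how profit factor and total return can move in opposite directions. The per-trade returns below are invented purely for illustration and are not taken from the backtests; the point is only the mechanics.

```python
def profit_factor(pnls):
    """Gross profit divided by gross loss (assumes at least one losing trade)."""
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    return gross_profit / gross_loss

# Hypothetical per-trade returns in percent for one month.
selective = [0.6] * 16 + [-0.4] * 4   # 20 trades, heavily filtered
frequent = [0.5] * 32 + [-0.3] * 18   # 50 trades, unfiltered

for name, pnls in [("selective", selective), ("frequent", frequent)]:
    print(f"{name}: PF {profit_factor(pnls):.2f}, total {sum(pnls):.1f}%")
```

In this made-up month the selective system has roughly double the profit factor of the frequent one and still finishes with a lower total, because it simply takes fewer swings.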

Where we go from here

Gold paper trading is live. We'll watch it for a few weeks, compare live fills against backtest assumptions, and see if the edge holds when real market microstructure gets involved.

Meanwhile, we haven't given up on finding stable edges in other currency pairs. The cost problem isn't unsolvable — it just means we need to either find stronger signals or move to longer timeframes where the cost-to-edge ratio improves. Both paths are on the table.

The honest takeaway from three weeks of FX exploration: most of the "improvements" we tried made things worse, the simplest version of the signal was the best one, and the only real alpha came from picking the right instrument (gold) rather than engineering a more sophisticated entry.

Sometimes the market rewards you for knowing what not to trade more than for knowing how to trade it.

This is not investment advice. Past results don't guarantee future performance. All analysis reflects our internal research process only.