We Built a Trading Model. It Made Nearly 6% in a Week. Then We Shut It Down.

Finding standalone alpha in liquid intraday markets is one of the most over-mined problems in finance, and the discipline that matters is not generating signals but deciding which ones to believe. Valan Technologies built a small systematic intraday system to understand that problem first-hand, ran every candidate through a framework designed to kill its own ideas, and recorded the result. The system returned +5.95% in its first week of live paper trading and was shut down anyway — not because it lost money, but because the framework had already shown that the profit was statistically indistinguishable from luck. This is a field note on that framework, and on why a visible profit should never overrule measurable evidence.

Key Facts

The trading system returned +5.95% on its book in its first week of live paper trading.
The week comprised 13 trades, of which 62% (8 of 13) were winners, with a median trade of +$49.75.
A coin flip with no edge produces 8 or more winners out of 13 roughly 29% of the time — about one week in three.
A single QQQ trade contributed $377.70 of $594.83 gross profit, or about 63% of the week's total.
The strategy-selection framework pre-registered and recorded 182 signal auditions, win or lose.
One in-house momentum system appeared to return ~30% per year, but the t-statistic on its true market-adjusted alpha was −0.56, statistically indistinguishable from zero.
A short-the-failed-gap strategy showed +62.7 basis points per trade gross, but 58% of its events came from the 2020 crash and the remainder lived in untradeable small-cap names.
Of 14 published "alpha" formulas tested, 13 were killed by transaction costs alone.
The strongest surviving signal was fully market-neutral and placebo-clean, yet its edge still fell short of the cost of crossing the bid–ask spread by less than half a basis point.
No real money was ever at risk; the system ran on a paper account only.

Alpha in liquid intraday markets is one of the most picked-over fields in all of finance, mined by firms with more data, faster execution, and larger research teams than anyone reading this. So the question that decides whether you survive there is not can you find a signal — signals are easy and mostly false — but can you tell the difference between an edge and a coin landing heads. That discipline, not any single strategy, is the subject of this piece. To learn the problem from the inside, Valan Technologies built a small systematic intraday system and ran every candidate signal through a framework built to kill its own ideas. This is the report from that exercise.

There is a version of the story that flatters us, and it is not true. In that version a model loses money and a disciplined team cuts it. Tidy. Wrong. Here is what actually happened: in its first week of live paper trading, the system returned +5.95% on its book across thirteen trades, 62% of them winners, with a median trade of +$49.75. By any honest reading, that is a good week. The team shut it down anyway — not despite the good week, but because of what a good week is actually worth.

Finding 1: A +6% week is not statistical evidence

A 62% win rate over thirteen trades is eight wins and five losses, and a coin flip with no edge whatsoever clears that bar roughly 29% of the time — about one week in three. A sample that small cannot be distinguished from luck because there is nothing in it to distinguish.
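The 29% figure is a plain binomial tail and can be checked in a few lines. This is a minimal sketch; the function name `prob_at_least` is ours, and the only assumption is that each trade is an independent 50/50 outcome.

```python
from math import comb


def prob_at_least(wins: int, trades: int, p: float = 0.5) -> float:
    """P(X >= wins) for X ~ Binomial(trades, p)."""
    return sum(
        comb(trades, k) * p**k * (1 - p) ** (trades - k)
        for k in range(wins, trades + 1)
    )


# Eight or more winners out of thirteen with no edge at all.
p_luck = prob_at_least(8, 13)
print(f"{p_luck:.4f}")  # 2380 / 8192, about 0.29
```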

Concentration compounds the problem. A single QQQ position contributed $377.70 of the week's $594.83 gross profit, meaning about 63% of all profit came from one trade. Removing that single luckiest fill cuts the week by more than half. Profit concentrated in one position is the fingerprint of variance, not edge.
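A concentration check of this kind is easy to automate. The sketch below is illustrative only: the function names are ours, and `example_pnls` is a hypothetical ledger, not the week's actual trades.

```python
def top_trade_share(pnls: list[float]) -> float:
    """Fraction of total profit contributed by the single best trade."""
    total = sum(pnls)
    if total <= 0:
        raise ValueError("share is only meaningful for a profitable period")
    return max(pnls) / total


def total_without_best(pnls: list[float]) -> float:
    """Period P&L after removing the single luckiest fill."""
    return sum(pnls) - max(pnls)


# Hypothetical per-trade P&L in dollars (illustrative values only).
example_pnls = [120.0, -40.0, 30.0, 50.0, -10.0, 50.0]

share = top_trade_share(example_pnls)        # best trade / total profit
remainder = total_without_best(example_pnls)  # what is left without it
print(f"top trade share: {share:.0%}, total without it: {remainder:.2f}")
```

A high share combined with a small remainder is the pattern described above: the period's result rests on one outcome rather than on many.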

The decision of whether a strategy should keep running is settled by research — years of data run through tests designed to kill the idea — not by a single week of live trades in either direction. In this case the research had already returned its verdict.

Finding 2: The selection standard was a gauntlet built to kill its own ideas

Every candidate signal was run through a deliberately severe set of gates, and a signal earned a place only by surviving all of them. This framework, rather than any single strategy, is the durable asset.

The spread-bounce test catches signals that manufacture fake profit from the bid–ask spread. Measuring a signal at a bar's closing price and entering at that same close fabricates an edge out of the spread itself; the fix is to enter at the next bar's open and A/B test it against the naive version. This test alone killed roughly half a dozen mean-reversion signals.
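The A/B comparison can be illustrated on a toy market that has no edge by construction. In this sketch (all names and parameters are our assumptions, not the team's code) the mid price never moves, each bar's close prints at the bid or the ask at random, and each open prints at the mid. A one-bar fade rule is then scored twice: entering at the close it was measured on, and entering at the next bar's open.

```python
import random


def ab_spread_bounce(opens, closes):
    """Mean per-trade return of a one-bar fade rule under two entry prices.

    Signal at bar t: fade the last close-to-close move.
    naive  : enter at closes[t]    (the price the signal was measured on)
    honest : enter at opens[t + 1] (the first price actually available)
    Both versions exit at closes[t + 1].
    """
    naive, honest = [], []
    for t in range(1, len(closes) - 1):
        move = closes[t] - closes[t - 1]
        if move == 0:
            continue  # no move, no signal
        side = -1.0 if move > 0 else 1.0
        naive.append(side * (closes[t + 1] / closes[t] - 1.0))
        honest.append(side * (closes[t + 1] / opens[t + 1] - 1.0))
    return sum(naive) / len(naive), sum(honest) / len(honest)


# Toy data: constant mid, closes bounce between bid and ask, opens at mid.
rng = random.Random(7)
mid, half_spread = 100.0, 0.05
closes = [mid + rng.choice((-half_spread, half_spread)) for _ in range(20_000)]
opens = [mid] * len(closes)

naive_edge, honest_edge = ab_spread_bounce(opens, closes)
print(f"naive:  {1e4 * naive_edge:.2f} bps per trade")
print(f"honest: {1e4 * honest_edge:.2f} bps per trade")
```

In this toy setup the naive version reports an edge close to the half-spread, while the next-open version sits near zero, because the only thing the signal ever detected was which side of the spread the last close printed on.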

The beta-strip regresses the market out of every result, because a strategy that shines in a rising market is often just leverage on the market in disguise.
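A minimal version of the beta-strip is a one-factor regression, r_strategy = alpha + beta * r_market + error, with a t-statistic on alpha. The sketch below uses plain OLS standard errors and made-up return series; the team's actual specification (factor set, error model) is not described in this piece, so treat both as assumptions.

```python
from math import sqrt


def beta_strip(strategy, market):
    """OLS of strategy returns on market returns.

    Returns (alpha, beta, t_alpha) for r_s = alpha + beta * r_m + e.
    """
    n = len(strategy)
    mx = sum(market) / n
    my = sum(strategy) / n
    sxx = sum((x - mx) ** 2 for x in market)
    sxy = sum((x - mx) * (y - my) for x, y in zip(market, strategy))
    beta = sxy / sxx
    alpha = my - beta * mx
    resid = [y - alpha - beta * x for x, y in zip(market, strategy)]
    s2 = sum(e * e for e in resid) / (n - 2)  # residual variance
    se_alpha = sqrt(s2 * (1.0 / n + mx * mx / sxx))
    return alpha, beta, alpha / se_alpha


# Hypothetical series: a "strategy" that is just 3x the market plus noise.
market = [0.01, -0.02, 0.015, 0.005, -0.01, 0.02]
noise = [0.0001, -0.0001, 0.0001, -0.0001, 0.0001, -0.0001]
strategy = [3.0 * m + e for m, e in zip(market, noise)]

alpha, beta, t_alpha = beta_strip(strategy, market)
print(f"alpha={alpha:.6f}  beta={beta:.3f}  t(alpha)={t_alpha:.2f}")
```

The raw returns of this made-up strategy look strong in a rising market, but the regression attributes essentially all of it to beta and leaves an alpha that is not distinguishable from zero.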

The cost-and-turnover gate measures all performance net of realistic transaction cost and computes each signal's break-even spread in basis points — the cost at which its edge reaches exactly zero. A signal that trades 250 times a year must clear that hurdle 250 times.
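The break-even arithmetic can be sketched as follows. We assume here that cost is charged once per round trip and expressed in basis points of notional; the function names and the example numbers are hypothetical.

```python
def break_even_cost_bps(gross_trade_returns):
    """Mean gross return per round trip, in bps.

    This is the per-trade cost at which the net edge is exactly zero.
    """
    return 1e4 * sum(gross_trade_returns) / len(gross_trade_returns)


def net_annual_bps(gross_edge_bps, cost_bps, trades_per_year):
    """Net edge per year: the cost hurdle is paid on every single trade."""
    return trades_per_year * (gross_edge_bps - cost_bps)


# Hypothetical gross returns per round trip (fractions, not percent).
gross = [0.0004, -0.0002, 0.0007, 0.0003]
edge_bps = break_even_cost_bps(gross)             # about 3 bps
net = net_annual_bps(edge_bps, cost_bps=3.5, trades_per_year=250)
print(f"break-even cost: {edge_bps:.2f} bps, net per year: {net:.1f} bps")
```

With these illustrative inputs the signal is real in gross terms, yet a cost only half a basis point above break-even turns 250 trades a year into a steady loss.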

The luck haircut applies a deflated Sharpe ratio that penalises the number of ideas tried, a placebo test that shuffles the signal to check whether the edge survives randomisation, and bootstrapped confidence intervals.
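Two of the three pieces, the placebo shuffle and the bootstrap interval, are short enough to sketch. The version below is an assumption about how such tests are commonly written, not the team's implementation, and it deliberately leaves out the deflated Sharpe ratio. The data is synthetic.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def placebo_p_value(signal, returns, n_shuffles=1000, seed=0):
    """Share of random re-orderings of the signal that do at least as well.

    A real edge should beat almost every shuffled version of itself.
    """
    rng = random.Random(seed)
    observed = mean([s * r for s, r in zip(signal, returns)])
    shuffled = list(signal)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if mean([s * r for s, r in zip(shuffled, returns)]) >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)


def bootstrap_ci(trade_returns, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean return per trade."""
    rng = random.Random(seed)
    n = len(trade_returns)
    means = sorted(
        mean(rng.choices(trade_returns, k=n)) for _ in range(n_boot)
    )
    lo = means[int((1 - level) / 2 * n_boot)]
    hi = means[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi


# Synthetic returns and a deliberately perfect signal, for illustration.
data_rng = random.Random(1)
returns = [data_rng.gauss(0.0, 0.01) for _ in range(60)]
perfect_signal = [1 if r > 0 else -1 for r in returns]

p_value = placebo_p_value(perfect_signal, returns)
ci_low, ci_high = bootstrap_ci(returns)
print(f"placebo p-value: {p_value:.4f}")
print(f"95% CI for mean return: [{ci_low:.5f}, {ci_high:.5f}]")
```

The perfect signal is only there to show the mechanics: no shuffle can match it, so its p-value is as small as the shuffle count allows. A signal whose p-value stays large under the same test has an edge that randomisation reproduces.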

Honest time and universe construction uses calendar-time portfolios so overlapping events cannot inflate a t-statistic, expanding-window embargoed walk-forward testing so no result is quoted on data it was fitted to, and a broad survivorship-honest universe rather than a curated list of surviving names.
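The expanding-window, embargoed split can be sketched as an index generator. The function name and parameters are hypothetical; the point is only that the training window always starts at the beginning, grows each fold, and is separated from the test window by a discarded gap.

```python
def walk_forward_splits(n_obs, min_train, test_size, embargo):
    """Yield (train, test) index ranges for expanding-window walk-forward.

    The training range always starts at 0 and grows by `test_size` each
    fold. The `embargo` observations between the end of training and the
    start of testing are dropped, so nothing fitted leaks into the test.
    """
    train_end = min_train
    while train_end + embargo + test_size <= n_obs:
        test_start = train_end + embargo
        yield range(0, train_end), range(test_start, test_start + test_size)
        train_end += test_size


folds = list(walk_forward_splits(n_obs=100, min_train=40, test_size=20, embargo=5))
for train, test in folds:
    print(f"train [{train.start}, {train.stop})  test [{test.start}, {test.stop})")
```

Every reported result then comes from a test range that lies strictly after its own training data plus the embargo.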

A permanent register recorded all 182 auditions, win or lose, so that no idea could be quietly re-run until it passed by chance.
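One way to make such a register hard to game is to require an entry before a test is run and to refuse a second entry for the same idea. The sketch below is a hypothetical in-memory design (class and field names are ours); a real register would also need durable storage.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Audition:
    idea_id: str
    hypothesis: str
    verdict: Optional[str] = None  # filled in once, after the test is run


@dataclass
class Register:
    _entries: dict = field(default_factory=dict)

    def pre_register(self, idea_id: str, hypothesis: str) -> None:
        """Record the idea before testing; an idea can only enter once."""
        if idea_id in self._entries:
            raise ValueError(f"{idea_id} was already auditioned")
        self._entries[idea_id] = Audition(idea_id, hypothesis)

    def record(self, idea_id: str, verdict: str) -> None:
        """Store the outcome, win or lose; verdicts cannot be overwritten."""
        entry = self._entries[idea_id]  # KeyError if never pre-registered
        if entry.verdict is not None:
            raise ValueError(f"{idea_id} already has a verdict")
        entry.verdict = verdict

    @property
    def trials(self) -> int:
        """Number of ideas tried, the input a multiple-testing haircut needs."""
        return len(self._entries)


reg = Register()
reg.pre_register("fade-001", "fade 5-minute moves on large-caps")
reg.record("fade-001", "killed: spread bounce")
print(reg.trials)
```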

The architecture enforced the same discipline in code through a strictly event-driven pipeline: a scanner proposes candidates, a strategy emits a signal, a separate module sizes it, and a separate module executes it, with a statically enforced rule that a strategy can never size or place its own orders. The system was point-in-time strict throughout, so no signal could ever see data from its own future.
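The separation of roles can be expressed through types. The sketch below is our own illustration, not the system's code: all class names are hypothetical, and the assumption is that "statically enforced" means something like a type checker rejecting a strategy that returns anything other than a quantity-free signal.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Protocol


@dataclass(frozen=True)
class Signal:
    symbol: str
    direction: int  # +1 long, -1 short; deliberately no quantity field


@dataclass(frozen=True)
class Order:
    symbol: str
    quantity: int  # signed share count


class Strategy(Protocol):
    def on_candidate(self, symbol: str, last_return: float) -> Optional[Signal]: ...


class Sizer(Protocol):
    def size(self, signal: Signal) -> Order: ...


class Executor(Protocol):
    def submit(self, order: Order) -> None: ...


class FadeStrategy:
    """Emits a direction only; it has no way to express size or send orders."""

    def on_candidate(self, symbol: str, last_return: float) -> Optional[Signal]:
        if last_return == 0:
            return None
        return Signal(symbol, -1 if last_return > 0 else 1)


class FixedSizer:
    def __init__(self, shares: int) -> None:
        self.shares = shares

    def size(self, signal: Signal) -> Order:
        return Order(signal.symbol, signal.direction * self.shares)


class PaperExecutor:
    def __init__(self) -> None:
        self.filled: list[Order] = []

    def submit(self, order: Order) -> None:
        self.filled.append(order)  # simulated fill, no real market access


def run_pipeline(
    candidates: Iterable[tuple[str, float]],
    strategy: Strategy,
    sizer: Sizer,
    executor: Executor,
) -> None:
    """Scanner output -> strategy -> sizer -> executor, one event at a time."""
    for symbol, last_return in candidates:
        signal = strategy.on_candidate(symbol, last_return)
        if signal is not None:
            executor.submit(sizer.size(signal))


executor = PaperExecutor()
scanned = [("AAA", 0.01), ("BBB", 0.0), ("CCC", -0.02)]
run_pipeline(scanned, FadeStrategy(), FixedSizer(10), executor)
print(executor.filled)
```

Because `Signal` carries no quantity and the strategy never receives an executor, sizing and order placement can only happen in the modules built for them.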

Finding 3: Most candidate strategies died, and the failures were instructive

The entire intraday mean-reversion family failed under the spread-bounce test, where almost all of it turned out to be the spread itself. The behavioural well is dry on liquid large-caps at five-minute frequency.

An in-house momentum system appeared to return ~30% per year, but its true market-adjusted alpha came in at t = −0.56. De-clustered, run on a broad universe, and market-adjusted, its edge was statistically indistinguishable from zero. The 30% was entirely leveraged bull-market beta. This pattern recurred so often that the beta-strip became the most valuable test in the framework.

A short-the-failed-gap strategy showed a headline +62.7 basis points per trade, but the post-mortem found that part of it was a windowing bug, 58% of the events came from the 2020 crash alone, and the remainder lived entirely in small, hard-to-borrow, wide-spread names that cannot actually be traded. The headline was gross; the reality was untradeable net.

Of fourteen published "alpha" formulas tested, thirteen were killed not by being wrong but by trading so often that transaction cost consumed the entire edge.

Even findings that were real refused to become money. A volatility term-structure measure cleanly sorts how breakouts behave across two dozen stress episodes — genuine, orthogonal information — but is not a standalone tradeable edge. And shorting breakdowns in a downtrend, the apparently obvious trade, was a measured structural loser because the position gets squeezed; the data inverted the naive intuition.

Finding 4: The signals that survived still were not tradeable

A few signals were genuinely real, including a volatility-term-structure fade and a low-volatility factor. The strongest one passed every test (high significance, placebo-clean, and fully market-neutral), and its edge still fell short of the cost of crossing the spread by less than half a basis point. The low-volatility factor cleared cost, but only at daily frequency, which is not day-trading.

The finding reduces to one line: the signals were reliably either statistically real or cheap enough to trade, but almost never both at once.

Finding 5: The constraints that killed the strategy are structural, not solvable with more cleverness

Cost is the binding constraint, not signal discovery. The team kept finding real edges and watching them die at the spread. Discovery was never the bottleneck; affordability was.

Behavioural edge has a pulse only in high-energy events such as panics, shocks, and climaxes, and flatlines in quiet markets. Where it is real, it lives in rowdier, less-watched corners rather than the most-covered names where everyone is already looking.

That last point is the one worth sitting with. Price, on the most liquid names, is the single most strip-mined dataset on earth — every edge in it has been found, costed, and competed away by faster participants. The information that is not yet fully priced tends to live away from the screen entirely: in the slower, structural, less-watched record of what institutions and states are actually doing. The over-mining we ran into is not a feature of markets in general. It is a feature of looking where everyone else is already looking.

A retail-access trading system is a liquidity taker that pays the spread, while the firms that win intraday are liquidity makers that collect it. That structural asymmetry — not a shortage of cleverness or data — is the moat the system could not cross.

Why this was the right call

The day-trader was retired because no standalone, cost-positive intraday edge existed for the universe and execution the team could reach, and the structural reasons for that will not change with one more clever signal.

What matters is how it died. It did not blow up and it did not bleed out. It printed a +6% week and was turned off anyway, because the same framework that demoted dozens of confident-looking false positives showed plainly that the +6% was a coin landing heads, not a coin that was weighted.

Quitting after a drawdown is easy. Walking away from a strategy that just made money, because the evidence says the money was luck, is the entire discipline — and it is the rarest thing in the business.

The graveyard is large because the standard was high, and that standard is the only reason the next thing built will be worth believing.

Written in the spirit the whole project was run: evidence before assertions, and an honest null over a flattering number — even when the flattering number is our own.

Frequently Asked Questions

Did Valan's trading model make money?

Yes. The system returned +5.95% in its first week of live paper trading, across 13 trades with 62% winners and a median trade of +$49.75. It was shut down anyway, because the validation framework had already shown the profit was statistically indistinguishable from luck.

Why did Valan shut down a profitable trading strategy?

Because a +6% week of only 13 trades carries almost no information. A coin flip with no edge clears a 62% win rate about one week in three, and a single QQQ trade contributed about 63% of the profit. Whether a strategy keeps running is decided by years of validation research, not one week of live trades, and that research had already found no standalone, cost-clearing intraday edge.

Why is finding alpha in intraday markets so difficult?

Cost is the binding constraint, not signal discovery. Real signals were found repeatedly and then died at the bid–ask spread. A retail-access system is a liquidity taker that pays the spread, while the firms that win intraday are liquidity makers that collect it — a structural disadvantage, not a shortage of cleverness or data.

What tests did Valan use to validate trading signals?

A deliberately severe gauntlet: a spread-bounce test that enters at the next bar's open to expose fake edge manufactured from the spread, a beta-strip that regresses out the market, a cost-and-turnover gate that computes the break-even spread in basis points, a luck haircut using a deflated Sharpe ratio plus a placebo shuffle test and bootstrapped confidence intervals, calendar-time portfolios, and embargoed walk-forward testing — across 182 pre-registered auditions.

Was any real money at risk?

No. The system ran on a paper (simulated) account only. No real capital was ever deployed, and every order was placed against the simulated account.

What does a trading experiment have to do with procurement intelligence?

The same principle drives both. Price on the most liquid names is the single most over-mined dataset in finance, so the information that is not yet priced lives in the slower, structural record of what institutions and states actually do — which is the public procurement and contract data that Valan Technologies specialises in.