Why shouldn't I judge a cold email A/B test by open rate?

Open tracking is unreliable — privacy features, image blocking, and inbox prefetching make the number noisy, and bots inflate it. More importantly, an open isn't intent. Judge tests by replies and app-ins (completed applications with bank statements) instead, because those are the stages that actually move toward funding.

How much volume do I need for a valid cold email test?

Enough that the difference between versions can't be explained by chance. A few hundred sends will hand you a 'winner' that flips next week. This is why scale matters: sending to tens of thousands of merchants a month lets a split test reach the volume needed to show a lift that actually holds up.

What is a false positive in email split testing?

It's a result that looks like a win but isn't — usually from stopping a test early, splitting the audience non-randomly, letting deliverability differ between versions, or just a lucky week. The fix is to run tests to a planned volume, randomize the split, hold infrastructure constant, and re-run important winners before committing.

How often should I run cold email tests?

Treat it as an ongoing habit, not a one-off. A monthly cadence works well: each new campaign set tests one or two clean hypotheses, winners roll into the baseline, and the next set tests against that improved version. That compounding is how a template gets measurably sharper over a year.

A/B Testing Cold Email (MCA Guide)

Key takeaways

Test one variable at a time. Change two things and a lift tells you nothing about which one caused it.
Judge tests by replies and app-ins, not opens. Open tracking is unreliable, and an open is not intent; replies and completed applications are what move a deal toward funding.
You need real volume before declaring a winner. Small samples produce confident-looking results that vanish the next week.
Treat optimization as a monthly habit, not a one-off. Each campaign set should test a fresh hypothesis and feed the next.

Almost every MCA shop says it 'tests' its cold email. Almost none of it is a real test. The usual version: rewrite the whole email, send it to whoever's left on the list, watch the open rate for a day, and crown a winner. That isn't A/B testing — it's guessing with extra steps.

Done properly, A/B testing cold email is the cheapest edge you can build. You're not buying more leads or more infrastructure; you're learning, week over week, which subject line, opener, and offer angle actually turn a cold merchant into an application. This guide covers what to test, how to isolate it, which metric to trust, and how to read the result without lying to yourself.

Test one variable at a time (or you learn nothing)

The single rule that separates a test from a rewrite: change one thing. If version B has a new subject line and a new first line and a new call to action, and it wins, you have no idea why. You can't repeat it, you can't build on it, and next month the 'winning' email might lose for reasons you'll never trace.

Pick one variable, hold everything else identical, and split your audience randomly between the two versions. A beats B, or it doesn't — and because only one thing changed, the result is a lesson you can actually reuse. It's slower than overhauling the whole email, but it compounds. Ten clean single-variable tests teach you more than a hundred messy ones.
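One way to make the random split concrete is to assign each merchant to a version by hashing their address together with a test name. The sketch below is only an illustration: the function name `assign_variant`, the test label, and the example.com addresses are all hypothetical, and any sound randomization method would serve the same purpose.

```python
import hashlib


def assign_variant(merchant_email: str, test_name: str) -> str:
    """Deterministically assign a recipient to version "A" or "B".

    Hashing the test name together with the address gives an assignment
    that looks random, is stable across re-runs, and is reshuffled for
    every new test (hypothetical helper, not a real library call).
    """
    key = f"{test_name}:{merchant_email.strip().lower()}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return "A" if digest[0] % 2 == 0 else "B"


# Hypothetical recipient list, for illustration only.
recipients = [f"owner{i}@example.com" for i in range(10_000)]

groups = {"A": [], "B": []}
for email in recipients:
    groups[assign_variant(email, "subject-line-test-01")].append(email)

print(len(groups["A"]), len(groups["B"]))
```

Because the assignment depends only on the address and the test name, nobody gets to hand-pick which merchants see which version, and the same merchant always lands in the same group if the list is processed twice.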

What to test in cold email — in priority order

Not every variable is worth the volume it costs to test. Start at the top of the funnel, where a small change touches the most merchants, and work down toward the offer. A rough order of impact:

Subject line — decides whether the email gets opened at all. Test short vs. specific, question vs. statement, plain vs. curiosity. The highest-leverage thing to A/B test first.
First line — the preview text a merchant skims before deciding to read or delete. Test a direct opener against a softer, more personal one.
Call to action — the ask. A soft 'Are you open to seeing some rates?' almost always beats a hard 'Apply now' on cold MCA traffic. Test the temperature of the ask.
Offer angle — how you frame the value. Speed vs. approval odds vs. payment-as-a-share-of-revenue. Same offer, different lead.
Send time — day and hour. Lower-impact than copy, but cheap to test once the bigger variables are settled.

Judge by replies and app-ins — not opens

This is where most cold-email testing quietly fails. Teams pick the version with the higher open rate and move on. But opens are the worst metric you can optimize for, for two reasons.

First, open tracking is unreliable. It depends on a tracking pixel that privacy features, image blocking, and inbox-provider prefetching routinely trigger or suppress — so a chunk of your 'opens' are bots and a chunk of your real opens never register. The number is noisy at best and fictional at worst. Second, an open is not intent. A merchant who opens and deletes has cost you nothing and earned you nothing. The version that wins on opens but loses on replies is a worse email.

Judge tests by what actually moves you toward funding: replies, and ultimately app-ins — completed applications with bank statements. Those are the stages that pay. If a new subject line lifts opens but flattens replies, it didn't win; it just got more merchants to glance and leave.
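To illustrate scoring on funding-stage metrics, here is a minimal sketch that computes reply and app-in rates per send and ignores opens when ranking. The class, the function, and every number in it are made up for illustration; it only reports which version is ahead, not whether the gap is large enough to trust.

```python
from dataclasses import dataclass


@dataclass
class VariantResult:
    name: str
    sends: int
    opens: int     # tracked opens: noisy, kept only for contrast
    replies: int
    app_ins: int   # completed applications with bank statements

    def rate(self, count: int) -> float:
        """Rate per send, so both versions share the same denominator."""
        return count / self.sends if self.sends else 0.0


def pick_leader(a: VariantResult, b: VariantResult) -> VariantResult:
    """Rank by app-in rate, then reply rate. Opens never enter the ranking."""
    return max((a, b), key=lambda v: (v.rate(v.app_ins), v.rate(v.replies)))


# Hypothetical counts: B "wins" on opens but trails on replies and app-ins.
a = VariantResult("A", sends=5000, opens=2100, replies=95, app_ins=14)
b = VariantResult("B", sends=5000, opens=2600, replies=80, app_ins=11)

leader = pick_leader(a, b)
print(leader.name, f"{leader.rate(leader.replies):.2%} reply rate")
```

In this made-up data, version B has the higher open rate but version A leads on the metrics that matter, which is the pattern the section warns about.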

You need enough volume to trust the result

A test on a few hundred sends will hand you a confident winner that's pure noise. If version A pulls eight replies and B pulls five, that gap can flip entirely on the next batch — the sample is simply too small to mean anything. Declaring a winner there isn't optimization; it's reading tea leaves.
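The eight-versus-five example can be checked with a quick two-proportion z-test. The sketch below assumes 300 sends per version, a figure chosen only to match "a few hundred" and not taken from real data. With counts this small the normal approximation is rough, and an exact test would be the more careful choice, so treat the result as indicative.

```python
import math


def two_proportion_p_value(replies_a: int, sends_a: int,
                           replies_b: int, sends_b: int) -> float:
    """Two-sided p-value for the difference between two reply rates,
    using a pooled two-proportion z-test (normal approximation)."""
    pooled = (replies_a + replies_b) / (sends_a + sends_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / sends_a + 1 / sends_b))
    if se == 0:
        return 1.0  # no replies at all (or all replies): nothing to compare
    z = (replies_a / sends_a - replies_b / sends_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


# Assumed volume: 300 sends per version (hypothetical).
p = two_proportion_p_value(8, 300, 5, 300)
print(f"p-value: {p:.2f}")
```

Under these assumptions the p-value comes out at roughly 0.4, far above the conventional 0.05 cutoff. A gap of eight replies to five is entirely consistent with two equally good emails.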

This is the real reason cold email rewards scale. When you're sending to tens of thousands of merchants a month, a split test reaches enough people to show a difference that holds up. Statistical significance isn't academic here — it's the line between a lesson you can bank and a coincidence you'll waste next month's copy chasing. The bigger and more consistent your sending, the faster you can tell real lifts from random ones.
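A rough sense of how much volume "enough" means can come from the standard sample-size approximation for comparing two proportions. The sketch below assumes a 2.0% baseline reply rate and a hoped-for 2.5%, with the conventional 5% significance level and 80% power. Those rates are placeholders, not benchmarks from this guide, so substitute your own.

```python
import math
from statistics import NormalDist


def sends_per_variant(base_rate: float, target_rate: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sends needed per version to detect the difference
    between base_rate and target_rate with a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = base_rate * (1 - base_rate) + target_rate * (1 - target_rate)
    n = (z_alpha + z_power) ** 2 * variance / (target_rate - base_rate) ** 2
    return math.ceil(n)


# Hypothetical rates: 2.0% baseline reply rate, hoping to detect 2.5%.
n = sends_per_variant(0.020, 0.025)
print(f"about {n:,} sends per version")
```

With these assumed rates the answer lands at roughly 13,800 sends per version, which is why a list of a few hundred cannot settle the question and a program sending tens of thousands a month can. Smaller lifts or lower baseline rates push the requirement higher.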

Avoid the false positives that fool everyone

Even with volume, it's easy to fool yourself. A few traps to design around:

Peeking and stopping early. The moment one version pulls ahead, the temptation is to call it. But results swing while data is thin — let the test run to your planned volume before judging.
Non-random splits. If version A went to your warmest, most engaged segment and B went to a cold one, you tested the audience, not the copy. Randomize the split.
Deliverability confounds. If one version lands in the inbox and the other slips toward spam, you're measuring placement, not persuasion. Hold sending infrastructure constant across both.
Testing too many things at once. Running five overlapping tests in one campaign means none of them are clean. Sequence them.
Confusing a real lift with a lucky week. A winner that doesn't repeat wasn't a winner. Re-run the ones that matter before you commit.
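The cost of peeking can be seen in a small simulation. The sketch below runs A/A tests, where both versions have the same true reply rate, so every "winner" is a false positive. It compares calling the test at any of several peeks against judging only at the planned volume. The 2% reply rate, 2,000 sends per version, and a peek every 200 sends are arbitrary assumptions for illustration.

```python
import math
import random


def p_value(ra: int, na: int, rb: int, nb: int) -> float:
    """Two-sided pooled two-proportion z-test (normal approximation)."""
    pooled = (ra + rb) / (na + nb)
    se = math.sqrt(pooled * (1 - pooled) * (1 / na + 1 / nb))
    if se == 0:
        return 1.0
    z = (ra / na - rb / nb) / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(trials: int = 400, sends: int = 2000, peek_every: int = 200,
             rate: float = 0.02, seed: int = 7) -> tuple[float, float]:
    """Return (false-positive rate when any peek may call the test,
    false-positive rate when only the final look counts)."""
    rng = random.Random(seed)
    peek_calls = 0
    final_calls = 0
    for _ in range(trials):
        ra = rb = 0
        called_at_a_peek = False
        for sent in range(1, sends + 1):
            ra += rng.random() < rate  # both versions share one true rate
            rb += rng.random() < rate
            if sent % peek_every == 0 and p_value(ra, sent, rb, sent) < 0.05:
                called_at_a_peek = True
        peek_calls += called_at_a_peek
        final_calls += p_value(ra, sends, rb, sends) < 0.05
    return peek_calls / trials, final_calls / trials


peeking_rate, final_rate = simulate()
print(f"false positives with peeking: {peeking_rate:.1%}")
print(f"false positives at planned volume only: {final_rate:.1%}")
```

Since the final look is also one of the peeks, the peeking rate can never be lower than the planned-volume rate, and repeated looks should push it noticeably higher. That gap is the share of "winners" created purely by stopping when the numbers happen to look good.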

Make it a monthly habit, not a one-off

A single test optimizes one email. A testing system optimizes your whole program — because merchant behavior, the offers competitors are running, and inbox filters all keep moving. What won in spring can fade by fall, and the only way to know is to keep testing.

The strongest cadence is monthly: each new campaign set carries one or two clean hypotheses, the winners roll forward into the baseline, and the next set tests against that improved baseline. Over a year that compounding turns a mediocre template into one tuned by dozens of real-world results — without ever buying a single extra lead.

That discipline is built into how MCA Rocket runs. Every month brings a fresh campaign set, and because sending happens at scale across many inboxes, tests reach the volume needed to read clearly. We optimize copy against replies and app-ins — the metrics that fund deals — so the program your leads see in month six is measurably sharper than the one they saw in month one.

About the author

Eli Pesso — Chief Rocket Man

A marketer by trade, Eli focuses his entire practice on the MCA industry — it's the niche where he believes his expertise creates the most value.

FAQ

A/B Testing Cold Email (MCA Guide) — FAQ

What should I test first in a cold email A/B test?

Start with the subject line. It decides whether the email gets opened at all, so a change there touches the most merchants. Then work down the funnel: first line, call to action, offer angle, and finally send time. Test one variable at a time so you can tell which change actually moved the result.
