Returns Fraud at DTC Brands: The Margin Tax Few Track

Wardrobing, empty-box claims, and serial returners create a margin tax that DTC brands rarely line-item. Here is what changes when AI scores returns risk.

June 22, 2026 | ApexifyLabs Team | 4 min read

E-commerce, DTC, Returns, Margin

Returns fraud at DTC brands is the share of refunded orders driven by wardrobing, empty-box claims, serial abusers, and bracketing patterns that standard return policies were never built to catch. National Retail Federation surveys place fraud and abuse at roughly $13 lost per $100 in returns, a tax most $5M to $30M brands never break out on their P&L.

This article looks at why returns fraud outpaces what brand finance teams track, where the cost actually shows up across operations, and what changes when AI watches the returns stream instead of a CX rep spotting patterns by feel.

What counts as returns fraud at a DTC brand?

Returns fraud is a category, not a single behavior. Most CX leads can name two or three patterns. The full list is wider.

Wardrobing. A customer buys a dress, wears it once, and returns it as new. Apparel and event-wear brands carry the heaviest exposure.
Empty-box and short-box returns. The package shows up, the system processes the refund, the contents are missing or wrong. Without a serialized intake check, the loss does not surface until inventory reconciliation, sometimes months later.
Serial returners. A small cohort of customers returns 60% to 90% of what they order. Industry analyses by Loop Returns and Returnly consistently identify single-digit percentages of buyers driving an outsized share of return volume.
Bracketing at scale. Ordering multiple sizes or colors with the explicit intent to keep one. McKinsey's apparel reporting flags bracketing as a structural cost layer, not an edge case, for fashion DTC.
Receipt and policy abuse. Returning items past the window with a forged ship date, claiming damage that did not occur in transit, or escalating to a chargeback after the refund is already issued.
Reseller returns. Bulk buyers returning unsold inventory after a promotional window, often outside the spirit of the policy.

Each pattern carries a different operational signature. The team often sees them as separate problems, which is part of why the total stays hidden.
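The serial-returner pattern is the easiest of these to check against a brand's own order export. The sketch below is a minimal illustration, not a vendor's method: the function name, the tuple layout, and the `min_units` floor are assumptions made for the example, and the 60% cutoff simply reuses the low end of the range quoted above.

```python
from collections import defaultdict

def summarize_returns(order_lines, min_rate=0.60, min_units=10):
    """Flag likely serial returners and measure their share of returned units.

    order_lines: iterable of (customer_id, units_ordered, units_returned).
    min_rate:    return-rate cutoff (0.60 mirrors the 60% figure in the text).
    min_units:   hypothetical floor so one returned order does not flag a new customer.
    """
    ordered = defaultdict(int)
    returned = defaultdict(int)
    for customer_id, units_ordered, units_returned in order_lines:
        ordered[customer_id] += units_ordered
        returned[customer_id] += units_returned

    flagged = sorted(
        cid for cid in ordered
        if ordered[cid] >= min_units and returned[cid] / ordered[cid] >= min_rate
    )
    total_returned = sum(returned.values())
    flagged_returned = sum(returned[cid] for cid in flagged)
    share = flagged_returned / total_returned if total_returned else 0.0
    return flagged, share

# Made-up sample data, for illustration only.
sample = [
    ("c1", 5, 4), ("c1", 10, 8),   # 12 of 15 units returned
    ("c2", 10, 1),
    ("c3", 2, 2),                  # 100% rate, but below the min_units floor
    ("c4", 20, 2),
    ("c5", 12, 0),
]
flagged, share = summarize_returns(sample)
print(flagged, f"{share:.0%}")  # ['c1'] 71%
```

The second return value is the useful one for a finance conversation: it shows how much of total return volume sits with the flagged cohort, which is the concentration the paragraph above describes.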

Why is returns fraud easy to miss in the P&L?

Most DTC finance stacks treat returns as a single negative line. The refund hits revenue, the reverse logistics fee hits COGS or fulfillment, and the cycle closes. There is rarely a column that separates good-faith returns from abusive ones, and there is rarely a place where wardrobed merchandise gets written down as unsellable.

The result is that a brand can be losing four to six points of contribution margin to returns fraud and abuse without anyone naming it. The CX team feels it as workload. The 3PL feels it as receiving exceptions. Finance sees it as a stubbornly high return rate. None of those views, on their own, point at the underlying pattern.

National Retail Federation and Appriss Retail's joint consumer returns reports have placed fraud and abuse at roughly 13% to 14% of total return value in recent years. For a brand running a 25% return rate on $20M in revenue, that is about $5M in returned value, of which roughly $650K to $700K would fall into the fraud and abuse range if the industry average holds. That is a serious dollar figure regardless of where exactly your business sits in the range.
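The arithmetic behind that example fits in a few lines, which makes it easy to rerun with your own numbers. This is a rough sketch under two assumptions: the return rate is treated as the share of revenue dollars refunded, and the 13% to 14% industry range is applied directly to that refunded value. The function name is hypothetical.

```python
def fraud_exposure(revenue, return_rate, fraud_share_low=0.13, fraud_share_high=0.14):
    """Estimate the dollar range of returns attributable to fraud and abuse.

    Assumes return_rate is a share of revenue dollars and that the
    industry-average fraud share applies to this brand, which it may not.
    """
    returned_value = revenue * return_rate
    return returned_value * fraud_share_low, returned_value * fraud_share_high

low, high = fraud_exposure(revenue=20_000_000, return_rate=0.25)
print(f"${low:,.0f} to ${high:,.0f}")  # $650,000 to $700,000
```

The output covers refunded value only. It does not include reverse logistics, write-downs, or CX time.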

What does this tax actually cost a mid-size DTC brand?

The headline refund is the smallest part of the cost.

Lost COGS on non-resellable inventory. Wardrobed, scented, washed, or damaged items often cannot be resold at full price. They flow to outlet, liquidation, or write-off.
Round-trip fulfillment. Outbound shipping, inbound return label, 3PL receiving, inspection labor, repackaging, and restock fees all run twice.
CX time on disputes. Every fraudulent return that escalates to a chargeback consumes 20 to 60 minutes of senior CX time and often a representment fee from the payment processor.
Marketing waste. The CAC spent acquiring a serial returner does not amortize the same way as the CAC spent acquiring a profitable customer. Cohort math gets noisier when fraud is undifferentiated from genuine returns.
Inventory accuracy drift. Empty-box and short-box returns create phantom stock. The system thinks the unit is back; the bin is empty. Oversells and cancellations downstream are the visible symptom.
Margin on the next promo. Bracketing-heavy returns spike during seasonal launches. The promotional plan that looked accretive on a forecast can land flat once fraud-tinged returns are netted out.

The exact split varies by category. Apparel and accessories carry more wardrobing and bracketing exposure. Beauty and consumables carry more empty-box and used-product exposure. Electronics carry more swap fraud, where the customer returns a different unit than the one shipped.
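Of the cost lines above, inventory accuracy drift is the one most directly testable with data a brand already has. The sketch below compares system-of-record quantities with physical bin counts to surface phantom stock. The function name and the dictionary inputs are assumptions for illustration; a real 3PL or WMS export will have its own shape.

```python
def phantom_stock(system_counts, bin_counts):
    """Return {sku: units the system shows that the bin does not hold}.

    system_counts: {sku: quantity on hand according to the inventory system}
    bin_counts:    {sku: quantity found in a physical count}
    SKUs missing from bin_counts are treated as zero on the shelf.
    """
    gaps = {}
    for sku, system_qty in system_counts.items():
        physical_qty = bin_counts.get(sku, 0)
        if system_qty > physical_qty:
            gaps[sku] = system_qty - physical_qty
    return gaps

# Made-up counts: the system believes 3 units of DRESS-M came back, the bin holds 1.
print(phantom_stock({"DRESS-M": 3, "SCARF": 5}, {"DRESS-M": 1, "SCARF": 5}))
# {'DRESS-M': 2}
```

Running a check like this at return intake, rather than at quarterly reconciliation, is what lets an empty-box loss surface before the unit is oversold.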

How do most teams handle suspicious returns today?

At most $5M to $30M DTC brands, the workflow is informal. A CX rep notices a customer returning every order, flags it in a shared note, and either declines the next return or escalates to the head of CX. The 3PL warehouse spots a torn tag or a missing accessory and emails a photo. Finance asks once a quarter why the return rate ticked up.

That is the baseline most operators run from. It works at small scale because patterns are visible to a single human. It strains as order volume grows, as the customer file gets larger, and as the brand expands into categories or geographies where the abuse profile is unfamiliar.

What changes when AI watches the returns stream?

Picture the same returns process, but the brand has a continuous read on which returns are unusual relative to its own customer base. Each return is scored against the customer's order history, the product's wardrobing risk, the inbound package metadata, the timing relative to the policy window, and the dispute pattern attached to similar customers. The CX rep no longer has to remember which buyer returned twelve of fifteen orders last quarter. The system surfaces it on the ticket.

The result is not a refusal engine. It is a triage layer. Clean returns flow through faster, because the system has already cleared them. Suspicious returns route to a human with the context attached. Confirmed abuse patterns get policy treatment that does not require a tense customer conversation invented on the fly.
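To make the scoring and triage idea concrete, here is a deliberately simple sketch: a hand-weighted linear score over the signals named above, followed by a two-way route. Everything specific in it is an assumption for illustration, including the field names, the weights, the 0.4 threshold, and the choice of a linear score. A production system would more plausibly learn these from the brand's own data.

```python
from dataclasses import dataclass

@dataclass
class ReturnRequest:
    customer_return_rate: float     # 0..1, share of this customer's past units returned
    product_wardrobing_risk: float  # 0..1, hypothetical per-SKU prior
    weight_mismatch: float          # 0..1, gap between expected and inbound package weight
    days_into_window: int           # days since delivery when the return was initiated
    window_days: int                # length of the return policy window
    prior_disputes: int             # chargebacks or disputes on this customer's file

# Illustrative weights only; they sum to 1.0 so the score stays in 0..1.
WEIGHTS = {
    "customer_return_rate": 0.35,
    "product_wardrobing_risk": 0.20,
    "weight_mismatch": 0.20,
    "late_in_window": 0.10,
    "dispute_history": 0.15,
}

def _clamp(x):
    return max(0.0, min(1.0, x))

def risk_score(r):
    """Return (score, reasons), where reasons lists the signals at or above 0.5."""
    timing = _clamp(r.days_into_window / r.window_days) if r.window_days > 0 else 1.0
    signals = {
        "customer_return_rate": _clamp(r.customer_return_rate),
        "product_wardrobing_risk": _clamp(r.product_wardrobing_risk),
        "weight_mismatch": _clamp(r.weight_mismatch),
        "late_in_window": timing,
        "dispute_history": _clamp(r.prior_disputes / 3),
    }
    score = sum(WEIGHTS[name] * value for name, value in signals.items())
    reasons = [name for name, value in signals.items() if value >= 0.5]
    return score, reasons

def route(score, review_at=0.4):
    """Triage, not refusal: nothing is declined automatically."""
    return "human_review" if score >= review_at else "auto_approve"

# The buyer who returned twelve of fifteen orders, returning event wear on day 29 of 30.
request = ReturnRequest(0.8, 0.7, 0.1, 29, 30, 1)
score, reasons = risk_score(request)
print(route(score), reasons)
```

The `reasons` list is the context that would be attached to the ticket. There is intentionally no auto-decline branch, since policy treatment for confirmed abuse stays a human decision.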

Manual vs AI-augmented returns risk scoring

Dimension	Manual review	AI-augmented review
Pattern visibility	One rep, one ticket at a time	Across the full customer file
Detection speed	When a person notices	At the moment of return initiation
Coverage of fraud types	Whatever the team has seen before	Including patterns the brand has not yet named
CX time on triage	4 to 10 minutes per suspicious case	Seconds, with context surfaced
Inventory write-down accuracy	Quarterly reconciliation	Continuous flag on intake
Chargeback prevention window	After refund, on dispute	Before refund issuance
Customer experience on clean returns	Same friction for everyone	Faster, because risk has cleared

Notice what the table does not promise. AI does not eliminate fraud, does not write the brand's policy, and does not replace the judgment call on a borderline case. It changes how quickly the brand sees a pattern, and how confidently it can act on what it sees.

Where the cost of inaction compounds

A single fraudulent return is a rounding error. The compounding pattern is what hurts.

Undetected wardrobing depresses sellable inventory, which inflates unit COGS.
Inflated COGS pushes finance to argue for tighter discounts.
Tighter discounts lift the bracketing rate, because customers hedge harder when promo windows feel scarce.
Higher bracketing fills the warehouse with returns labor that crowds out outbound throughput.
Throughput drag shows up as delayed shipping promises, which feeds the chargeback rate the brand is already trying to flatten.

NRF and Appriss reporting has tracked the share of returns flagged as fraud or abuse trending upward across the last several survey cycles. Brands that wait to look at this often discover the tax has already grown into something material.

When should a DTC brand take a closer look?

If any of the following describe your operation, the math tends to favor reviewing returns risk scoring sooner rather than later.

Apparel, accessories, or beauty with a return rate above 18%.
Average order value over $80 with frequent multi-item carts.
A small share of customers driving a disproportionate share of returns.
A 3PL flagging an increase in short-box or empty-box exceptions.
Chargeback volume that has crept up faster than revenue.
A finance team that cannot cleanly separate fraud-driven returns from product-fit returns.

The stronger any of those signals gets, the more value sits in seeing the pattern in real time instead of three months later.
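The checklist above can be turned into a quick self-assessment. The sketch below is one possible encoding, assuming a hypothetical `profile` dictionary. The 18% and $80 thresholds come from the list, while the qualitative items are passed in as plain booleans because the article does not define numeric cutoffs for them.

```python
HIGH_EXPOSURE_CATEGORIES = {"apparel", "accessories", "beauty"}

def closer_look_signals(profile):
    """Return the names of the checklist signals that apply to this brand profile."""
    checks = {
        "return_rate_above_18pct": (
            profile["category"] in HIGH_EXPOSURE_CATEGORIES
            and profile["return_rate"] > 0.18
        ),
        "high_aov_multi_item": (
            profile["avg_order_value"] > 80 and profile["frequent_multi_item_carts"]
        ),
        "concentrated_returners": profile["returns_concentrated_in_few_customers"],
        "rising_box_exceptions": profile["short_or_empty_box_exceptions_rising"],
        "chargebacks_outpacing_revenue": (
            profile["chargeback_growth"] > profile["revenue_growth"]
        ),
        "fraud_not_separable": not profile["finance_separates_fraud_returns"],
    }
    return [name for name, hit in checks.items() if hit]

# Made-up example brand.
example = {
    "category": "apparel",
    "return_rate": 0.24,
    "avg_order_value": 95,
    "frequent_multi_item_carts": True,
    "returns_concentrated_in_few_customers": True,
    "short_or_empty_box_exceptions_rising": False,
    "chargeback_growth": 0.30,
    "revenue_growth": 0.12,
    "finance_separates_fraud_returns": False,
}
print(closer_look_signals(example))
```

For this example profile, five of the six signals apply.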

A note on what AI does not solve here

It does not replace the trust call your CX leader makes with a long-tenured customer. It does not write a policy that fits your brand voice. It does not negotiate with your payment processor on representment terms. What it does is take the pattern-matching work off a tired team, surface the cases that actually warrant a human, and give finance a number it can defend.

That is the actual change. Not a zero-fraud brand. A brand that knows its real return economics and protects them on purpose.

Where this lands for a DTC operator thinking it through

Returns fraud is one of those line items that looks small in the monthly close and large in the annual one. The brands that get hurt are rarely the ones that ignore returns. They are the ones that assumed their return rate was a product problem, never separated the fraud signal from the noise, and learned how much margin had shifted only when they finally did.

We run a completely free automation audit for DTC brands that want a clear read on what their returns stream is actually costing them. No commitment, no slide deck, no upsell.