
Redbox: You Can't A/B Test the Couch

How I built Redbox's UX research practice from zero — and made customer validation an expected part of how product decisions got made.

ROLE

UX Research Manager

SCOPE

Kiosk, Web, and Mobile

Redbox had no voice of the customer. Product bets moved fast — driven by A/B data, competitive fear, and executive instinct. That worked for kiosk optimization. It became expensive when the company started making adjacent bets in streaming (Redbox Instant) and ticketing while customer loyalty was quietly eroding.
 

I built the research function that changed how decisions got made. By the time I left, PMs expected testing before launch. Prototypes were the default design communication tool. Research had become a risk mitigation layer, not a bottleneck.

 

The goal was never to slow teams down. It was to stop expensive mistakes before they became customer friction.

 


The Industry Pressure

The real competitor wasn't Netflix. It was the couch.

 

By 2013, 67% of streaming subscribers used only Netflix. Streaming was becoming the default — not because it was cheaper per movie, but because it required no effort. Customers who once made a ritual of picking up pizza, candy, and a Redbox movie were staying home. Churn data showed rental declines. It couldn't show how deeply streaming had rewired the decision-making process, or how low the switching costs had actually become. 


The strategic risk was gradual erosion masked by short-term revenue stability from new kiosks being installed every hour.

Chart: Annual U.S. filmed entertainment revenue by category, projected (in billions). Decline of DVD rentals compared to the rise of digital sales and rentals.

Source: PwC’s Entertainment & Media Outlook report, June 2015

Organizational Risk

Before the UX research department:

  • No formal voice of the customer program

  • No UX research standards or repeatable methodology

  • No NPS baseline or benchmark tracking

  • Validation happened post-launch, not before

  • Engineers and PMs engaged research after design, if at all, not during discovery

Metrics revealed what customers did. They couldn't explain why behaviors were changing — or how fragile loyalty had become.

We conducted intercept studies at busy locations to observe real friction, and each PM was assigned a kiosk to staff as a member of the research team.

The Structural Correction

I introduced a decision model that required behavioral confidence before engineering scale.

 

Old pattern:
Release → Measure → Optimize


  • A/B data drives decisions

  • Validate after release

  • Behavioral fit assumed

  • Research optional

  • Capital committed before workflow fit proven

 

New pattern:
Hypothesis → Prototype → Test in context → Evaluate confidence → Commit

 

This included:

  • Securing a $150k annual research budget across the US and Canada

  • Building an in-house lab to make validation operational, not ad hoc

  • Establishing benchmark studies tied to NPS and experience quality

  • Embedding validation checkpoints into roadmap planning

  • Requiring testing for medium and high-complexity initiatives

  • Managing international studies in Canada

  • Making ADA-compliant design mandatory, not a nice-to-have

For projects of medium complexity or above, research was no longer optional. It was part of the definition of ready, and repeated negative signals triggered a pause, not polish.

Case Study: Credits Loyalty Program

Preventing a repeat failure.

 

Redbox had tried loyalty programs before. All had failed — run by Marketing, without behavioral validation. When Credits launched internally, our first round of testing immediately surfaced the problem: customers didn't understand what a credit was, how it converted to dollars, or how to access it at the kiosk.


We paused. We ran multiple rounds of testing until customers demonstrated immediate comprehension without prompting. Only then did the launch move forward.

Research didn't just improve Credits. It prevented a third failed loyalty launch — and the organizational damage that would have come with it. This was the moment validation shifted from optional to expected.

Dense content, heavy use of numbers, and a busy background produced a screen customers couldn't make sense of.

The final design includes information that's meaningful to customers. 

Tickets: A Lesson in Governance

Tickets revealed a deeper structural issue. Ticket purchasing is a high-consideration behavior; kiosks are optimized for speed and low dwell time. Research improved usability, but the channel misalignment remained. By the time the pattern was clear, investment and engineering complexity were already committed.


The lesson was not usability. It was funding governance.


If I were accountable for innovation funding, I would implement a formal stage-gate tied to behavioral confidence: concepts that failed repeated in-context validation would halt rather than advance under sunk-cost pressure.

If core Tickets flows had been required to pass a behavioral stage-gate before funding, millions in capital and development time could have been saved.

Customers felt pressure browsing tickets while others waited in line, exposing a mismatch between high-consideration purchasing and a speed-optimized kiosk.

The Outcome

By the end of my tenure:

  • PMs planned research checkpoints without prompting

  • Prototypes became the default design handoff tool across Redbox

  • Engineers engaged earlier, reducing downstream rework

  • NPS scores improved as friction was caught before shipping

  • Cross-functional partners (Marketing, Innovation Lab, Customer Support) were integrated into research

When a PM tried to skip research late in my tenure, they were asked — by peers, not just me — to explain why. That's what culture change looks like.

What I'd Build Next

A formal stage-gate model tied to behavioral confidence scores, applied at the innovation funding level before concepts reach engineering roadmaps. Each concept would get a validation sprint before funding, with clear kill criteria: three rounds of negative signals halt the concept rather than refine the execution. Cross-functional alignment on evidence would precede scale.


In fast-moving markets, speed without validation creates expensive noise. Structured validation creates durable advantage and frees teams to focus on new opportunities.

The measure of a research function is not how many studies it runs.
It's how many expensive mistakes the organization never had to recover from.
