In the spring of 2026, a friend who runs a small sports card breaking business on Whatnot asked if I'd take a look at his operational data. He'd been tracking each stream in a spreadsheet — revenue, product cost, fees, net profit, hours live — and had a feeling that something wasn't right. The numbers were growing. He wasn't sure they were getting better.

I asked if I could pull the data into Python as a personal learning project, look at the patterns, and share what I found. He said yes, with the understanding that the numbers themselves stay private — only the shape of the analysis and percentage-based findings get talked about publicly. This page is the public-safe version.

What I was looking at

Sports card breaking is a niche but growing live commerce category. The operator buys sealed product (a case of basketball cards, a box of soccer cards), goes live on Whatnot, and sells slots in real time: each viewer buys a team or a player and receives whatever cards for that team or player are pulled from the packs. Part lottery, part live sport, part community.

The dataset spanned four months and 56 streams. Stream count roughly tripled from the first month to the fourth, and revenue tracked up almost in lockstep. From thirty thousand feet, the business looked like it was working. Once I started looking at per-stream efficiency, the picture changed.
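
For the technically curious, the core metric is simple to compute. Here is a minimal sketch in pandas, not the actual analysis code: the file name and column names (date, net_profit, hours_live) are hypothetical stand-ins, since the real spreadsheet stays private.

    import pandas as pd

    # Hypothetical schema: one row per stream. These column names are
    # stand-ins; the real spreadsheet and its figures stay private.
    df = pd.read_csv("streams.csv", parse_dates=["date"])

    # Per-stream efficiency: net profit per hour of live airtime.
    df["profit_per_hour"] = df["net_profit"] / df["hours_live"]

    # Monthly roll-up: total profit over total hours, so long streams
    # weigh more than short ones (a plain mean of ratios would not).
    monthly = (
        df.assign(month=df["date"].dt.to_period("M"))
          .groupby("month")
          .agg(profit=("net_profit", "sum"), hours=("hours_live", "sum"))
    )
    monthly["profit_per_hour"] = monthly["profit"] / monthly["hours"]
    print(monthly)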

The first finding — margin compression

Finding 01
Profit-per-hour fell 83% as volume grew.
Indexed to the first month at 100, monthly profit-per-live-hour declined to roughly 35, then 31, then 17 by the fourth month. Same operator, same business model, dramatically different efficiency. Volume was being added faster than it was being converted into profit.
Profit per live hour, indexed (month 1 = 100)
Month 1: 100
Month 2: 35
Month 3: 31
Month 4: 17
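
The indexing itself is one line of arithmetic. A standalone sketch; the input values below are made-up placeholders chosen only so the output matches the published shape, because the real monthly figures are private.

    import pandas as pd

    # Made-up placeholder values, chosen only so the indexed output
    # matches the chart above; the actual figures are private.
    pph = pd.Series([412.0, 144.0, 128.0, 70.0],
                    index=[1, 2, 3, 4], name="profit_per_hour")
    pph.index.name = "month"

    indexed = (100 * pph / pph.iloc[0]).round(0)  # month 1 = 100
    print(indexed)  # 100.0, 35.0, 31.0, 17.0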

This is the kind of pattern that's hard to feel from the inside. Revenue going up creates a real sense of progress. The owner feels busier, sees bigger deposits, ships more orders. But every additional hour live was returning less. The next question was why.

The second finding — category mix

Finding 02
Soccer was eating hours and giving back margin.
Soccer streams took 34% of total live hours but generated only 24% of profit. Meanwhile basketball took 30% of hours and generated 55% of profit. The gap traced back to product cost ratios — soccer's cost-of-goods sat about seven percentage points higher than the rest of the catalog (73% vs 66%).
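
The category breakdown falls out of one groupby. Again a sketch against the same hypothetical schema, here assuming category, revenue, and product_cost columns as well.

    import pandas as pd

    df = pd.read_csv("streams.csv")  # hypothetical schema, as above

    by_cat = df.groupby("category").agg(
        hours=("hours_live", "sum"),
        profit=("net_profit", "sum"),
        revenue=("revenue", "sum"),
        cogs=("product_cost", "sum"),
    )
    by_cat["hours_share"] = by_cat["hours"] / by_cat["hours"].sum()
    by_cat["profit_share"] = by_cat["profit"] / by_cat["profit"].sum()
    by_cat["cost_ratio"] = by_cat["cogs"] / by_cat["revenue"]  # COGS as a share of revenue

    # The finding above is the soccer row: high hours_share, low
    # profit_share, and a cost_ratio several points above the rest.
    print(by_cat[["hours_share", "profit_share", "cost_ratio"]].round(2))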

Soccer was a deliberate bet. The operator had been trying to diversify from a basketball-heavy baseline, and there's a reasonable case that soccer streams were doing acquisition work that didn't show up in the stream-level P&L — new buyers might come in for a soccer break and stick around for basketball. Maybe.

But the effect was big enough that it deserved a real test rather than an assumption. The recommendation that came out of the analysis: cap soccer at ~30% of monthly hours until the cost ratio could be investigated, and instrument new-buyer attribution so the acquisition story could be measured rather than assumed.

The third finding — the duration sweet spot

Finding 03
Streams under three hours lost money on average.
Bucketed by length, the pattern was clean: streams under 3 hours averaged negative margin; the 3–5 hour bucket was the consistent sweet spot; 5–7 hours degraded; 7+ hours recovered to the top of the per-hour range, but with concerning fatigue patterns. It takes at least three hours of revenue to absorb a stream's fixed costs: setup, audience warm-up, platform fees.
Profit per hour, by stream length (indexed to peak)
0–3 hrs: −17
3–5 hrs: 78
5–7 hrs: 51
7+ hrs: 100
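
The buckets come from a single pd.cut. A sketch under the same hypothetical schema, with bin edges matching the table above:

    import pandas as pd

    df = pd.read_csv("streams.csv")  # hypothetical schema, as above

    # Bucket streams by live duration; edges match the table above.
    buckets = pd.cut(
        df["hours_live"],
        bins=[0, 3, 5, 7, float("inf")],
        labels=["0–3 hrs", "3–5 hrs", "5–7 hrs", "7+ hrs"],
    )

    # Per-bucket profit per hour: total profit over total hours, so the
    # many short streams don't get extra weight just by being numerous.
    per_bucket = df.groupby(buckets, observed=True).agg(
        profit=("net_profit", "sum"), hours=("hours_live", "sum")
    )
    per_bucket["profit_per_hour"] = per_bucket["profit"] / per_bucket["hours"]

    # Index to the best bucket = 100, as in the table above. A bucket
    # with negative total profit indexes negative.
    peak = per_bucket["profit_per_hour"].max()
    print((100 * per_bucket["profit_per_hour"] / peak).round(0))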

The deliverable

I wrote the analysis as a one-page diagnostic brief — the kind of thing a product analyst would put in front of a founder or a leadership team. Three findings, three concise recommendations:

Cap soccer at ~30% of monthly hours until the cost ratio is investigated.
Set a 3-hour minimum on planned stream length unless a stream is tied to a specific high-margin drop.
Build a predictive model that estimates expected profit before a stream airs, not to control behavior but to flag streams that look like they'll come in below the line.

What I'm building next

Phase two of the case study is a predictive model in Python. Inputs: category mix, planned length, day of week, time of day, anchor product cost. Output: expected profit-per-hour and a confidence band. Goal: practice the modeling pipeline on a real dataset that has the kinds of messy properties — small sample, mixed types, seasonality — that toy datasets don't.
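
A minimal version of that pipeline, sketched with scikit-learn. Everything here is an assumption about shape, not the finished model: the feature names are hypothetical, and quantile gradient boosting is just one reasonable way to get a band. With 56 streams, anything fancier would overfit before it taught me anything.

    import pandas as pd
    from sklearn.compose import ColumnTransformer
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import OneHotEncoder

    CATEGORICAL = ["category", "day_of_week"]          # hypothetical features
    NUMERIC = ["planned_hours", "start_hour", "anchor_cost_ratio"]

    def build_model(quantile: float):
        # One model per quantile: 0.1 / 0.5 / 0.9 together give a point
        # estimate plus a rough 80% band.
        encode = ColumnTransformer(
            [("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL)],
            remainder="passthrough",  # numeric columns pass through
        )
        return make_pipeline(
            encode, GradientBoostingRegressor(loss="quantile", alpha=quantile)
        )

    df = pd.read_csv("streams.csv")  # hypothetical schema, as above
    X = df[CATEGORICAL + NUMERIC]
    y = df["net_profit"] / df["hours_live"]  # target: profit per live hour

    models = {q: build_model(q).fit(X, y) for q in (0.1, 0.5, 0.9)}

    # Score a planned stream before it airs.
    upcoming = pd.DataFrame([{
        "category": "basketball", "day_of_week": "Sat",
        "planned_hours": 4, "start_hour": 19, "anchor_cost_ratio": 0.66,
    }])
    low, mid, high = (models[q].predict(upcoming)[0] for q in (0.1, 0.5, 0.9))
    print(f"expected profit/hour: {mid:.0f} (band {low:.0f} to {high:.0f})")

The flag would then be a simple threshold on the low quantile: if the 10th-percentile estimate comes in below zero, the stream gets a second look before it's scheduled.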

A note on data and discretion. The findings on this page use percentages, indexed values, and per-unit metrics drawn from the underlying data. Specific dollar amounts and customer-level data remain private and are not shared externally. The shape of the analysis is what matters here, not the absolute numbers. This is a personal learning project; nothing on this page is a paid engagement, employment relationship, or formal consulting arrangement.

Why this case study, honestly

I'm pivoting into product analytics, and the standard "Kaggle Titanic" portfolio project is limited — clean data, well-known answers, no real consequences. This case study has the opposite properties. The dataset is messy and small. The findings have stakes for someone I know. The questions don't come pre-packaged; you have to figure out which ones are worth asking.

That kind of practice — where being right and being wrong both have consequences — is what I think you can only get from working on real data with someone who cares about the answer. It's also exactly the kind of work I want to be doing in a product analytics co-op next year. More on the broader plan here.