In the spring of 2026, a friend who runs a small sports card breaking business on Whatnot asked if I'd take a look at his operational data. He'd been tracking each stream in a spreadsheet — revenue, product cost, fees, net profit, hours live — and had a feeling that something wasn't right. The numbers were growing. He wasn't sure they were getting better.
I asked if I could pull the data into Python as a personal learning project, look at the patterns, and share what I found. He said yes, with the understanding that the numbers themselves stay private — only the shape of the analysis and percentage-based findings get talked about publicly. This page is the public-safe version.
What I was looking at
Sports card breaking is a niche but growing live commerce category. The operator buys sealed product (a case of basketball cards, a box of soccer cards), goes live on Whatnot, and sells slots in real time — each viewer buys a team or a player and receives whatever cards from that team are pulled from the pack. Part lottery, part live sport, part community.
The dataset spanned four months and 56 streams. Stream count grew month over month, roughly tripling over the window, and revenue tracked up almost in lockstep. From thirty thousand feet, the business looked like it was working. Once I started looking at per-stream efficiency, the picture changed.
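To make "per-stream efficiency" concrete, here's a minimal sketch of the metric the analysis centers on: net profit divided by hours live. The column names and the numbers are made up for illustration; the real schema and figures stay private.

```python
import pandas as pd

# Hypothetical columns mirroring the spreadsheet; values are invented.
streams = pd.DataFrame({
    "date": pd.to_datetime(["2026-01-05", "2026-01-12", "2026-02-02", "2026-02-09"]),
    "revenue": [1200.0, 1500.0, 2100.0, 2600.0],
    "product_cost": [700.0, 900.0, 1400.0, 1800.0],
    "fees": [140.0, 175.0, 250.0, 310.0],
    "hours_live": [3.0, 4.0, 5.0, 6.5],
})

# Net profit per stream, then the per-stream efficiency metric.
streams["net_profit"] = streams["revenue"] - streams["product_cost"] - streams["fees"]
streams["profit_per_hour"] = streams["net_profit"] / streams["hours_live"]
```

Everything downstream (the monthly trend, the category comparison, the duration buckets) is built on that one derived column.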
The first finding — margin compression
Margin compression is the kind of pattern that's hard to feel from the inside. Revenue going up creates a real sense of progress: the owner feels busier, sees bigger deposits, ships more orders. But every additional hour live was returning less profit than the one before. The next question was why.
The second finding — category mix
Soccer was a deliberate bet. The operator had been trying to diversify from a basketball-heavy baseline, and there's a reasonable case that soccer streams were doing acquisition work that didn't show up in the stream-level P&L — new buyers might come in for a soccer break and stick around for basketball. Maybe.
But the effect was big enough that it deserved a real test rather than an assumption. The recommendation that came out of the analysis: cap soccer at ~30% of monthly hours until the cost ratio could be investigated, and instrument new-buyer attribution so the acquisition story could be measured rather than assumed.
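Both halves of that recommendation are checkable in a few lines of pandas. This is a sketch on invented numbers: the cost ratio (product cost as a fraction of revenue) per category, and each category's share of total hours, which is what the ~30% cap constrains.

```python
import pandas as pd

# Illustrative rows only; real categories and figures are private.
streams = pd.DataFrame({
    "category": ["basketball", "basketball", "soccer", "soccer"],
    "revenue": [2000.0, 1800.0, 1500.0, 1300.0],
    "product_cost": [1100.0, 1000.0, 1100.0, 1000.0],
    "hours_live": [4.0, 3.5, 4.0, 4.0],
})

by_cat = streams.groupby("category").agg(
    revenue=("revenue", "sum"),
    product_cost=("product_cost", "sum"),
    hours=("hours_live", "sum"),
)

# Cost ratio: what fraction of each revenue dollar goes to product.
by_cat["cost_ratio"] = by_cat["product_cost"] / by_cat["revenue"]

# Share of total hours per category, to check against the ~30% cap.
by_cat["hours_share"] = by_cat["hours"] / by_cat["hours"].sum()
```

The attribution side (do soccer buyers stick around for basketball?) can't be answered from this table at all, which is exactly why the recommendation was to instrument it rather than argue about it.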
The third finding — the duration sweet spot
The private numbers stay private, but the shape of the finding: streams under about three hours consistently came in below the average profit-per-hour, and longer streams ran the other way. The gap was large enough to justify a scheduling rule rather than a case-by-case judgment call, which is where the 3-hour minimum in the recommendations comes from.
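One way to look for a duration sweet spot, sketched on invented rows: bucket streams by length and compare average profit-per-hour per bucket. The bucket edges here are assumptions chosen for illustration.

```python
import pandas as pd

# Made-up streams for illustration.
streams = pd.DataFrame({
    "hours_live": [1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0],
    "net_profit": [90.0, 140.0, 200.0, 330.0, 400.0, 480.0, 600.0],
})
streams["profit_per_hour"] = streams["net_profit"] / streams["hours_live"]

# Bucket by stream length; edges are (0,2], (2,3], (3,4], (4,10].
bins = [0, 2, 3, 4, 10]
labels = ["<2h", "2-3h", "3-4h", "4h+"]
streams["length_bucket"] = pd.cut(streams["hours_live"], bins=bins, labels=labels)

# Mean efficiency per bucket; a sweet spot shows up as a rising
# (or rising-then-flattening) sequence across buckets.
by_len = streams.groupby("length_bucket", observed=True)["profit_per_hour"].mean()
```

With 56 streams the buckets are thin, so this is a directional check, not a significance test.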
The deliverable
I wrote the analysis as a one-page diagnostic brief — the kind of thing a product analyst would put in front of a founder or a leadership team. Three findings, three concise recommendations:
Cap soccer at ~30% of monthly hours until the cost ratio is investigated.
Set a 3-hour minimum on planned stream length unless tied to a specific high-margin drop.
Build a predictive model that estimates expected profit before a stream airs — not to control behavior, but to flag streams that look like they'll come in below the line.
What I'm building next
Phase two of the case study is a predictive model in Python. Inputs: category mix, planned length, day of week, time of day, anchor product cost. Output: expected profit-per-hour and a confidence band. Goal: practice the modeling pipeline on a real dataset that has the kinds of messy properties — small sample, mixed types, seasonality — that toy datasets don't.
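A minimal sketch of that pipeline on synthetic stand-in data: train three quantile gradient-boosting models so each prediction comes with a band rather than a single number. Feature names are assumptions based on the inputs listed above, and the data-generating process here is invented purely so the example runs.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the private dataset (~56 streams).
n = 56
X = pd.DataFrame({
    "soccer_share": rng.uniform(0, 1, n),       # category mix
    "planned_hours": rng.uniform(1.5, 6.0, n),  # planned length
    "day_of_week": rng.integers(0, 7, n),
    "start_hour": rng.integers(10, 22, n),      # time of day
    "anchor_cost": rng.uniform(300, 2000, n),   # anchor product cost
})
# Invented target: profit-per-hour with noise.
y = 120 - 60 * X["soccer_share"] + 10 * X["planned_hours"] + rng.normal(0, 15, n)

# One model per quantile: the 0.5 model is the point estimate,
# 0.1 and 0.9 bound the band.
models = {
    q: GradientBoostingRegressor(loss="quantile", alpha=q, random_state=0).fit(X, y)
    for q in (0.1, 0.5, 0.9)
}

def predict_with_band(features: pd.DataFrame):
    """Return (low, point, high) profit-per-hour estimates."""
    return (models[0.1].predict(features),
            models[0.5].predict(features),
            models[0.9].predict(features))
```

With a sample this small, the band will be wide and the quantile estimates noisy; that's part of the point of practicing on it. Quantile regression is one reasonable way to get a band here, not the only one.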
Why this case study, honestly
I'm pivoting into product analytics, and the standard "Kaggle Titanic" portfolio project is limited — clean data, well-known answers, no real consequences. This case study has the opposite properties. The dataset is messy and small. The findings have stakes for someone I know. The questions don't come pre-packaged; you have to figure out which ones are worth asking.
That kind of practice — where being right and being wrong both have consequences — is what I think you can only get from working on real data with someone who cares about the answer. It's also exactly the kind of work I want to be doing in a product analytics co-op next year. More on the broader plan here.