In early 2026, a friend running a sports card live-commerce business on Whatnot asked if I'd dig into his operational data. He'd been tracking each show in a spreadsheet — revenue, product cost, category, hours live, follower count. The audience had grown from a standing start to about 7,000 followers in roughly four months. He had instincts about what was working. He wanted to know if the data backed them up.
He shared the spreadsheet informally, friend to friend, with permission to use anonymized findings for a portfolio writeup. I pulled the data into Python as a self-directed analytics learning project. The agreement: specific dollar amounts and customer-level data stay private, and only the shape of the analysis and percentage-based or indexed findings get discussed publicly. This page is the public-safe version.
What the dataset looked like
Sports card breaking is a niche but fast-growing live commerce category. The friend running the account buys sealed product (a case of basketball cards, a box of soccer cards), goes live on Whatnot, and sells slots in real time — each viewer buys a team or a player and receives whatever cards from that team are pulled. Part lottery, part live sport, part community.
The dataset spanned about four months and 56 streams across four sports verticals (basketball, soccer, baseball, football) plus mixed shows. Stream count ramped quickly, and the audience grew alongside it. The question I wanted to answer: which decisions made during that period were most strongly associated with growth in the data?
What looked like it was supporting growth
The implication for planning: keep a rolling 6-week sports calendar on the wall, and front-load high-volume product into the windows the league is going to do the marketing for you. Don't try to make a quiet midweek baseball show in February into the moment.
The recommendation set
I wrote the analysis as a one-page diagnostic for the operator. The recommendations weren't dramatic — they were the kind of small, repeated decisions that compound:
Maintain a rolling sports-calendar plan and lean into momentum windows.
Keep basketball as the audience anchor; sequence diversification rather than spread it thin.
Treat the 3–5 hour window as the default show length, with deviation reserved for high-momentum drops.
Build a lightweight predictive model that estimates expected new-follower velocity before a show airs: not to control behavior, but to flag shows that might come in below expectations.
What the data also showed — the diagnostic side
The growth story was real. Alongside it, the data also showed the kind of efficiency tension that's normal in any fast-scaling business: per-hour profit declined as volume grew, soccer hours had a higher cost-of-goods ratio than the rest of the catalog, and short shows ran below the line. These are the diagnostics any operator needs to see early.
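As a rough illustration of how these diagnostics can be computed and then shared without exposing dollar amounts, here is a minimal sketch. The rows are invented for the example (the real spreadsheet stays private), the field names `revenue`, `product_cost`, and `hours_live` are assumed stand-ins for the spreadsheet columns, and the helper names `category_diagnostics` and `index_to_base` are hypothetical. It computes a cost-of-goods ratio and profit per hour for each category, then indexes profit per hour to basketball = 100.

```python
from collections import defaultdict

# Made-up rows for illustration only; they are not the operator's data.
shows = [
    {"category": "basketball", "revenue": 1000.0, "product_cost": 600.0, "hours_live": 4.0},
    {"category": "basketball", "revenue": 1200.0, "product_cost": 700.0, "hours_live": 5.0},
    {"category": "soccer", "revenue": 800.0, "product_cost": 640.0, "hours_live": 2.0},
    {"category": "soccer", "revenue": 500.0, "product_cost": 400.0, "hours_live": 2.0},
]


def category_diagnostics(rows):
    """Return the cost-of-goods ratio and profit per hour for each category."""
    totals = defaultdict(lambda: {"revenue": 0.0, "product_cost": 0.0, "hours_live": 0.0})
    for row in rows:
        bucket = totals[row["category"]]
        for key in ("revenue", "product_cost", "hours_live"):
            bucket[key] += row[key]

    diagnostics = {}
    for category, t in totals.items():
        profit = t["revenue"] - t["product_cost"]
        diagnostics[category] = {
            "cogs_ratio": t["product_cost"] / t["revenue"],
            "profit_per_hour": profit / t["hours_live"],
        }
    return diagnostics


def index_to_base(values, base_key):
    """Rescale values so the base category equals 100, which hides raw dollars."""
    base = values[base_key]
    return {key: 100.0 * value / base for key, value in values.items()}


diag = category_diagnostics(shows)
pph_index = index_to_base(
    {category: d["profit_per_hour"] for category, d in diag.items()},
    base_key="basketball",
)

for category in diag:
    print(category, round(diag[category]["cogs_ratio"], 2), round(pph_index[category], 1))
```

Ratios are computed from category totals rather than by averaging per-show ratios, so a single tiny show cannot dominate the figure. Whether that is the right weighting depends on the question being asked.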
The framing matters. Growth and efficiency aren't the same metric. Both can be true at the same time: a business can be acquiring an audience faster than ever and still have unit economics that need attention. The job of the analytics is to make both visible — not to pick a side.
What I'm building next
A natural next step is a predictive model in Python. Inputs: category, planned length, day of week, time of day, sports-calendar context, anchor product cost. Output: expected new-follower velocity and a confidence band. Goal: practice the modeling pipeline on a real dataset with the messy properties — small sample, mixed types, seasonality — that toy datasets don't have.
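Before fitting anything elaborate, a baseline is useful for comparison. The sketch below is not the planned model. It is a deliberately simple stand-in that uses only one of the listed inputs (category) and assumes "velocity" means new followers per hour live. It predicts the mean velocity of past shows in the same category and attaches a bootstrap percentile band around that mean. The history rows are invented, and `predict_velocity` is a hypothetical name.

```python
import random
import statistics

# Invented history for illustration only; not the operator's data.
history = [
    {"category": "basketball", "hours_live": 4.0, "new_followers": 120},
    {"category": "basketball", "hours_live": 5.0, "new_followers": 175},
    {"category": "basketball", "hours_live": 3.0, "new_followers": 75},
    {"category": "basketball", "hours_live": 4.0, "new_followers": 160},
    {"category": "soccer", "hours_live": 2.0, "new_followers": 30},
    {"category": "soccer", "hours_live": 3.0, "new_followers": 60},
]


def velocity(row):
    """Assumed definition: new followers gained per hour live."""
    return row["new_followers"] / row["hours_live"]


def predict_velocity(rows, category, n_boot=2000, level=0.80, seed=0):
    """Return (point, low, high): category mean velocity with a bootstrap band.

    The band is a percentile interval for the mean of past shows, not a
    prediction interval for one future show, which would be wider.
    """
    sample = [velocity(r) for r in rows if r["category"] == category]
    if len(sample) < 2:
        raise ValueError("need at least two past shows in this category")

    rng = random.Random(seed)  # fixed seed so the band is reproducible
    boot_means = sorted(
        statistics.fmean(rng.choices(sample, k=len(sample)))
        for _ in range(n_boot)
    )
    low_idx = int((1 - level) / 2 * n_boot)
    high_idx = int((1 + level) / 2 * n_boot) - 1
    return statistics.fmean(sample), boot_means[low_idx], boot_means[high_idx]


point, low, high = predict_velocity(history, "basketball")
print(f"expected velocity {point:.1f}/hr, 80% band {low:.1f} to {high:.1f}")
```

A show whose actual velocity lands under `low` is the kind that would get flagged for a closer look. With only a handful of shows per category the band is rough, which is a fair reflection of how little the sample can say.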
Why this case study, honestly
I'm pivoting into product analytics, and the standard "Kaggle Titanic" portfolio project is limited — clean data, well-known answers, no real consequences. This case study has the opposite properties. The dataset is messy and small. The findings have stakes for someone I know. The questions don't come pre-packaged; you have to figure out which ones are worth asking.
That kind of practice — where being right and being wrong both have consequences — is what I think you can only get from working on real data with someone who cares about the answer. It's also exactly the kind of work I want to be doing in a product analytics co-op next year. More on the broader plan here.