In early 2026, a friend running a sports card live-commerce business on Whatnot asked if I'd dig into his operational data. He'd been tracking each show in a spreadsheet — revenue, product cost, category, hours live, follower count. The audience had grown from a standing start to about 7,000 followers in roughly four months. He had instincts about what was working. He wanted to know if the data backed them up.

He shared the spreadsheet informally between friends, with permission to use anonymized findings for a portfolio writeup. I pulled the data into Python as a self-directed analytics learning project. The agreement: specific dollar amounts and customer-level data stay private — only the shape of the analysis and percentage-based or indexed findings get discussed publicly. This page is the public-safe version.

What the dataset looked like

Sports card breaking is a niche but fast-growing live commerce category. The friend running the account buys sealed product (a case of basketball cards, a box of soccer cards), goes live on Whatnot, and sells slots in real time — each viewer buys a team or a player and receives whatever cards from that team are pulled. Part lottery, part live sport, part community.

The dataset spanned about four months and 56 streams across four sports verticals (basketball, soccer, baseball, football) plus mixed shows. Stream count ramped quickly. Audience grew alongside it. The question I wanted to answer: which decisions made during that period were most strongly associated with growth in the data?

What looked like it was supporting growth

Pattern 01
Sports momentum — running the right product when the league is loud.
The strongest viewer-acquisition shows clustered around moments of real-world sports momentum: NBA tip-off and All-Star windows for basketball, EPL/Champions League weeks for soccer, MLB Opening Day and trade-deadline windows for baseball. Shows that lined up with these windows had consistently higher new-follower velocity (new followers per live hour) than off-cycle shows of the same product. In this data, the audience tracked the sports calendar more closely than the price tag.

The implication for planning: keep a rolling 6-week sports calendar on the wall, and front-load high-volume product into the windows where the league is going to do the marketing for you. Don't try to turn a quiet midweek baseball show in February into the moment.
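The comparison behind this pattern can be sketched in a few lines of Python. This is a minimal illustration, not the analysis as it was actually run: the record layout, the function names, and every number below are hypothetical, and "in a momentum window" is assumed to be a flag already attached to each show.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical show records: (product, in_momentum_window, new_followers, live_hours).
# All values are invented for illustration; none come from the operator's spreadsheet.
shows = [
    ("basketball", True, 120, 4.0),
    ("basketball", True, 90, 3.0),
    ("basketball", False, 60, 4.0),
    ("basketball", False, 40, 2.5),
    ("baseball", True, 50, 4.0),
    ("baseball", False, 20, 4.0),
]


def velocity_by_window(records):
    """Mean new followers per live hour, keyed by (product, in_window)."""
    buckets = defaultdict(list)
    for product, in_window, followers, hours in records:
        if hours > 0:  # skip malformed rows rather than divide by zero
            buckets[(product, in_window)].append(followers / hours)
    return {key: mean(vals) for key, vals in buckets.items()}


def window_lift(records):
    """Ratio of in-window to off-cycle velocity, per product that has both."""
    v = velocity_by_window(records)
    products = {product for product, _ in v}
    return {
        p: v[(p, True)] / v[(p, False)]
        for p in products
        if (p, True) in v and (p, False) in v and v[(p, False)] > 0
    }


lift = window_lift(shows)
```

A lift above 1 means in-window shows of that product added followers faster per hour than off-cycle shows of the same product. The sketch averages per-show velocities; dividing total followers by total hours is an equally defensible choice that weights long shows more heavily.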

Pattern 02
Category mix — basketball as the audience anchor.
Across the four months, basketball was the most efficient acquisition vertical: roughly 30% of total live hours, 55% of profit, and the steepest follower-add curve. Soccer took more live hours but added followers at a lower rate per hour. The growth pattern that emerged: basketball anchored the audience, and soccer and football diversified it once the base was steady. Sequencing matters: build the base before you diversify.

Hours allocation by category · indexed
Basketball: 100
Soccer: 113
Baseball: 65
Football: 50

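For readers unfamiliar with indexed values, the sketch below shows one way such figures can be produced while keeping raw numbers private. It assumes the index base is basketball = 100, which the figures suggest but the page does not state. The raw hours are placeholders invented so that the output matches the published index; the real hours are not shown here.

```python
def index_to_base(values, base_key):
    """Rescale each value so that values[base_key] becomes 100."""
    base = values[base_key]
    if base <= 0:
        raise ValueError("base value must be positive")
    return {key: round(100 * value / base) for key, value in values.items()}


# Placeholder hours, invented for illustration only.
hours = {"Basketball": 40.0, "Soccer": 45.2, "Baseball": 26.0, "Football": 20.0}
indexed = index_to_base(hours, "Basketball")
```

Indexing keeps the relative shape (soccer slightly above basketball, football at half) without revealing how many hours were actually streamed.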
Pattern 03
Show length discipline — the 3–5 hour band carried most of the value.
Bucketed by length, longer wasn't always better. Streams under 3 hours rarely cleared their fixed costs — setup, audience warm-up, and platform fees take time to absorb. The 3–5 hour band consistently delivered the strongest combination of profit and follower gain. Beyond five hours, both efficiency and audience attention started to slip. The simple rule that emerged: plan for the 3–5 hour window unless you have a specific high-momentum drop that earns more.
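The bucketing step can be sketched as follows. The band edges follow the text, but whether a show of exactly 3 or exactly 5 hours belongs to the middle band is an assumption made here, and the show records (including the unitless `profit_index`) are hypothetical.

```python
from collections import defaultdict


def length_band(hours):
    """Assign a show to a length band; 3.0 and 5.0 are assumed to be in '3-5h'."""
    if hours < 3:
        return "under 3h"
    if hours <= 5:
        return "3-5h"
    return "over 5h"


# Hypothetical shows: (live_hours, profit_index, new_followers). Invented values.
shows = [
    (2.0, -10, 15),
    (2.5, 5, 20),
    (3.5, 80, 70),
    (4.5, 110, 95),
    (6.0, 90, 60),
    (7.0, 70, 50),
]


def summarize_by_band(records):
    """Profit index and new followers per live hour within each length band."""
    totals = defaultdict(lambda: {"hours": 0.0, "profit": 0.0, "followers": 0})
    for hours, profit, followers in records:
        t = totals[length_band(hours)]
        t["hours"] += hours
        t["profit"] += profit
        t["followers"] += followers
    return {
        band: {
            "profit_per_hour": t["profit"] / t["hours"],
            "followers_per_hour": t["followers"] / t["hours"],
        }
        for band, t in totals.items()
    }


summary = summarize_by_band(shows)
```

Comparing per-hour figures rather than totals matters here: a long show can post a large total while still being less efficient per hour than a mid-length one.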

The recommendation set

I wrote the analysis as a one-page diagnostic for the operator. The recommendations weren't dramatic — they were the kind of small, repeated decisions that compound:

Maintain a rolling sports-calendar plan and lean into momentum windows.
Keep basketball as the audience anchor; sequence diversification rather than spread it thin.
Treat the 3–5 hour window as the default show length, with deviation reserved for high-momentum drops.
Build a lightweight predictive model that estimates expected new-follower velocity before a show airs, not to control behavior but to flag shows that might come in below expectations.

What the data also showed — the diagnostic side

The growth story was real. Alongside it, the data showed the kind of efficiency tension that's normal in any fast-scaling business: per-hour profit declined as volume grew, soccer hours had a higher cost-of-goods ratio than the rest of the catalog, and short shows rarely cleared their fixed costs. These are the diagnostics any operator needs to see early.
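As one example of these diagnostics, a cost-of-goods comparison between a single category and the rest of the catalog might look like the sketch below. The definition used (product cost divided by revenue) is an assumption about how the ratio was computed, and the totals are hypothetical values in arbitrary units.

```python
def cogs_ratio(revenue, product_cost):
    """Share of revenue consumed by product cost; lower is better."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return product_cost / revenue


def ratio_vs_rest(totals, category):
    """Return (ratio for one category, pooled ratio for every other category)."""
    revenue, cost = totals[category]
    rest_revenue = sum(r for c, (r, _) in totals.items() if c != category)
    rest_cost = sum(k for c, (_, k) in totals.items() if c != category)
    return cogs_ratio(revenue, cost), cogs_ratio(rest_revenue, rest_cost)


# Hypothetical totals per category: (revenue, product_cost). Invented values.
totals = {
    "basketball": (1000.0, 600.0),
    "soccer": (900.0, 720.0),
    "baseball": (500.0, 325.0),
}

soccer_ratio, rest_ratio = ratio_vs_rest(totals, "soccer")
```

Because the result is a ratio, it can be reported publicly without exposing the underlying dollar amounts.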

The framing matters. Growth and efficiency aren't the same metric. Both can be true at the same time: a business can be acquiring an audience faster than ever and still have unit economics that need attention. The job of the analytics is to make both visible — not to pick a side.

What I'm building next

A natural next step is a predictive model in Python. Inputs: category, planned length, day of week, time of day, sports-calendar context, anchor product cost. Output: expected new-follower velocity and a confidence band. Goal: practice the modeling pipeline on a real dataset with the messy properties — small sample, mixed types, seasonality — that toy datasets don't have.
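Before any real model, a useful first step is a baseline to beat. The sketch below is a deliberately naive stand-in, not the planned model: it predicts a show's velocity as the past mean for its category and attaches a percentile-bootstrap band. It ignores every other input listed above, the history values are hypothetical, and the band describes uncertainty in the category mean, which is narrower than a prediction interval for a single show.

```python
import random
from statistics import mean


def bootstrap_band(values, n_resamples=2000, level=0.8, seed=0):
    """Percentile bootstrap interval for the mean of a small sample."""
    if not values:
        raise ValueError("need at least one observation")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    means = sorted(
        mean(rng.choices(values, k=len(values))) for _ in range(n_resamples)
    )
    tail = (1 - level) / 2
    low = means[int(tail * (n_resamples - 1))]
    high = means[int((1 - tail) * (n_resamples - 1))]
    return low, high


def predict_velocity(history, category):
    """Naive baseline: past mean velocity for the category, plus a band."""
    past = [velocity for cat, velocity in history if cat == category]
    low, high = bootstrap_band(past)
    return {"expected": mean(past), "low": low, "high": high}


# Hypothetical (category, new followers per live hour) history. Invented values.
history = [
    ("basketball", 28.0), ("basketball", 31.0), ("basketball", 24.0),
    ("basketball", 35.0), ("soccer", 14.0), ("soccer", 18.0), ("soccer", 11.0),
]

forecast = predict_velocity(history, "basketball")
```

A finished show whose actual velocity lands below `forecast["low"]` could then be flagged for a closer look. With only a handful of shows per category the band will be wide, which is itself an honest summary of a small sample.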

A note on data, attribution, and discretion

The findings on this page use percentages, indexed values, and per-unit metrics drawn from the underlying data. Specific dollar amounts and customer-level data remain private and are not shared externally. The shape of the analysis is what matters here, not the absolute numbers. This is a self-directed portfolio analytics learning project, not a paid engagement, employment relationship, consulting arrangement, or partnership. Data was shared informally between friends with permission to use anonymized findings. Analytics did not cause the growth on its own; the operator's day-to-day work, product picks, and on-camera presence did. Analytics surfaced patterns that supported sharper planning around which decisions to repeat.

Why this case study, honestly

I'm pivoting into product analytics, and the standard "Kaggle Titanic" portfolio project is limited — clean data, well-known answers, no real consequences. This case study has the opposite properties. The dataset is messy and small. The findings have stakes for someone I know. The questions don't come pre-packaged; you have to figure out which ones are worth asking.

That kind of practice — where being right and being wrong both have consequences — is what I think you can only get from working on real data with someone who cares about the answer. It's also exactly the kind of work I want to be doing in a product analytics co-op next year. More on the broader plan here.