The Coupon Collector

The problem
n different coupons · 1 per box · uniformboxes needed to collect all n coupons?

A cereal company hides one of nn different coupons in every box, each one equally likely. You want to collect all nn coupons.

On average, how many boxes do you need to buy?

Tempting (but wrong)
guess ≈ n boxes? 2n? linear?guess nguess 2nunderestimates the long tail of the last few

Two common wrong answers:

  • "Around nn boxes" — you'd think on average each box gives you a new one, so nn boxes should be enough.
  • "2n2n or 3n3n boxes" — accounting for some duplicates, you'd guess maybe double or triple.

Both badly underestimate the long tail of the last few coupons. Once you've collected 95 of 100, almost every box is a duplicate of something you have. That last 5% takes ages.