Cohort analysis in 5 minutes
A cohort is a group of users who joined the product at the same time. Cohort analysis tracks how well different groups stick around a week, a month, six months later. It is the most honest view of retention you can build.
What a cohort actually is
Easiest analogy: a school class. Everyone who started first grade in September 2020 is one cohort. You can ask how many of them stayed through eleventh grade. Same in a product: everyone who signed up in March is one cohort, and a month or six months later you look at how many are still active.
Why not averages? Product-wide averages hide the fact that new users churn the next day while old users stick for years. Cohorts separate those effects.
What it looks like in a table
Each row is a cohort (the month they joined). Each column is the number of months since signup. Each cell shows the share of the cohort still active in that month; when the table is rendered as a heatmap, greener means more retained.
| Cohort | Size | M0 | M1 | M2 | M3 | M4 | M5 | M6 |
|---|---|---|---|---|---|---|---|---|
| Jan | 1,200 | 100% | 62% | 48% | 42% | 39% | 37% | 36% |
| Feb | 1,450 | 100% | 64% | 50% | 44% | 40% | 38% | |
| Mar | 1,380 | 100% | 66% | 52% | 46% | 42% | | |
| Apr | 1,620 | 100% | 68% | 54% | 48% | | | |
| May | 1,750 | 100% | 70% | 56% | | | | |
| Jun | 1,900 | 100% | 71% | | | | | |
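For readers who want to see where the percentages come from, here is a minimal sketch of computing such a table from raw records. The input shapes are assumptions made for illustration: `signups` maps a hypothetical user id to a signup month index, and `activity` lists (user, month) pairs for each month a user counted as active. Real event schemas will differ.

```python
from collections import defaultdict

def cohort_table(signups, activity):
    """Build {signup_month: {months_since_signup: share_active}}.

    signups:  {user_id: signup_month}, months as integers (assumed format).
    activity: iterable of (user_id, month) pairs, one per active month.
    """
    # Group users by the month they signed up.
    cohort_users = defaultdict(set)
    for user, month in signups.items():
        cohort_users[month].add(user)

    # Collect distinct active users per (cohort, months since signup).
    active = defaultdict(set)
    for user, month in activity:
        cohort = signups.get(user)
        if cohort is None or month < cohort:
            continue  # skip unknown users and pre-signup events
        active[(cohort, month - cohort)].add(user)

    # Convert counts into shares of the original cohort size.
    table = defaultdict(dict)
    for (cohort, offset), users in active.items():
        table[cohort][offset] = len(users) / len(cohort_users[cohort])
    return {c: dict(sorted(row.items())) for c, row in sorted(table.items())}

# Toy data: four users sign up in month 0, two in month 1.
signups = {"a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "f": 1}
activity = [
    ("a", 0), ("b", 0), ("c", 0), ("d", 0),  # everyone is active at signup
    ("a", 1), ("b", 1),                      # two of four return a month later
    ("a", 2),                                # one of four returns after two months
    ("e", 1), ("f", 1), ("e", 2),
]
table = cohort_table(signups, activity)
for cohort, row in table.items():
    print(cohort, {offset: f"{share:.0%}" for offset, share in row.items()})
```

The denominator is always the original cohort size, not the number of users active in the previous month, which is what keeps every row comparable with the others.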
The same table as a curve
Average each column across the cohorts that have reached that month and you get a single retention curve. A healthy curve flattens out, which means the product has a stable user base. A leaky curve keeps dropping toward zero.
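As a rough illustration of the averaging step, the sketch below uses the numbers from the table above. The function name `retention_curve` and the optional size weighting are assumptions added for the example; the text only says to average each column.

```python
# Cohort -> (size, retention % by months since signup), copied from the table.
rows = {
    "Jan": (1200, [100, 62, 48, 42, 39, 37, 36]),
    "Feb": (1450, [100, 64, 50, 44, 40, 38]),
    "Mar": (1380, [100, 66, 52, 46, 42]),
    "Apr": (1620, [100, 68, 54, 48]),
    "May": (1750, [100, 70, 56]),
    "Jun": (1900, [100, 71]),
}

def retention_curve(rows, weighted=False):
    """Average each column over the cohorts that have a value in it."""
    n_cols = max(len(values) for _, values in rows.values())
    curve = []
    for m in range(n_cols):
        # Only cohorts old enough to have reached month m contribute.
        present = [(size, values[m]) for size, values in rows.values()
                   if len(values) > m]
        if weighted:
            total = sum(size for size, _ in present)
            curve.append(sum(size * v for size, v in present) / total)
        else:
            curve.append(sum(v for _, v in present) / len(present))
    return curve

curve = retention_curve(rows)
print([round(v, 1) for v in curve])
```

Later points on the curve rest on fewer and older cohorts (M6 here is January alone), so the tail is less reliable than the head.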
What good actually looks like
Numbers depend on the product. Below are ballpark figures from open sources (Sequoia, Lenny's Newsletter, Mobile Dev Memo).
- D1: 25–35%
- D7: 10–15%
- D30: 4–7%
- Month 1: 60–75%
- Month 6: 35–50%
- Year 1: 25–40%
- Year 1: 20–40%
- Top quartile: 50%+
Definitions and formulas live in the glossary.
Where cohorts mislead you
Mixing channels in one cohort
If 80% of April traffic was paid and May traffic was mostly organic, the cohort curves will differ because of the source, not the product. Build separate tables per channel; otherwise you are comparing apples and oranges.
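A small sketch of the effect, using made-up numbers: in the toy data below, paid users retain at 25% and organic users at 75% in both months, yet the blended cohorts look different because the channel mix changed. The record layout and the helper names `make` and `retention_by` are hypothetical.

```python
from collections import defaultdict

def make(month, channel, total, retained):
    """Generate toy user records: (signup_month, channel, retained_at_month_1)."""
    return [(month, channel, i < retained) for i in range(total)]

# Invented data: April is mostly paid, May is mostly organic.
users = (
    make("Apr", "paid", 8, 2) + make("Apr", "organic", 4, 3)
    + make("May", "paid", 4, 1) + make("May", "organic", 8, 6)
)

def retention_by(users, key):
    """Share of retained users per group, where key(month, channel) names the group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [retained, total]
    for month, channel, retained in users:
        group = key(month, channel)
        counts[group][0] += retained  # True counts as 1, False as 0
        counts[group][1] += 1
    return {group: r / t for group, (r, t) in counts.items()}

blended = retention_by(users, lambda month, channel: month)
split = retention_by(users, lambda month, channel: (month, channel))

print(blended)  # May looks better than April
print(split)    # within each channel, April and May are identical
```

Reading only the blended view would suggest the product improved in May, while the per-channel view shows nothing changed except the acquisition mix.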
What does "active" actually mean?
Logging in and making a paid transaction produce very different curves. In a bank, D7 retention on "opened the app" can be 60% while D7 retention on "made a transfer" is 12%. Define "active" by the action that creates business value.
Seasonality hides what is real
A December e-commerce cohort will show a fantastic D7 because users came back for holiday shopping. By January or February the pattern resets. Compare seasonal cohorts with the same period a year earlier, or look at 12-month curves.
Cohorts too small to read
A 50-user cohort carries roughly ±10 percentage points of sampling noise or more at 95% confidence. At that scale, D7=20% and D7=30% are statistically indistinguishable. Pool small cohorts or wait for more data.
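The noise figure can be sanity-checked with the normal approximation for a proportion. This is a back-of-the-envelope sketch, not a full statistical test: the 95% level (z = 1.96) and the helper names are assumptions, and the approximation gets rough for very small cohorts or rates near 0% or 100%.

```python
import math

def margin_of_error(rate, n, z=1.96):
    """Approximate half-width of the confidence interval for a retention rate."""
    return z * math.sqrt(rate * (1 - rate) / n)

def distinguishable(p1, n1, p2, n2, z=1.96):
    """True if two retention rates differ by more than their combined noise."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return abs(p1 - p2) > z * se

# A 50-user cohort at D7 = 20%: about 11 percentage points either way.
print(f"{margin_of_error(0.20, 50):.1%}")

# 20% vs 30% cannot be told apart with 50 users per cohort...
print(distinguishable(0.20, 50, 0.30, 50))

# ...but the same gap is clear with 1,000 users per cohort.
print(distinguishable(0.20, 1000, 0.30, 1000))
```

Because the margin shrinks with the square root of cohort size, quadrupling the cohort only halves the noise, which is why pooling several small cohorts is often more practical than waiting.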
Want to build these tables for your own product? Get in touch.