How to write SQL to calculate user cohort retention
User retention is a meaningful metric to measure how long users come back to use our product. Measure this metric can help understand how products fit and can help you analyze behaviors of each user group by location, gender, or age.
This tutorial provides a way to query the database to generate a cohort report. In this tutorial, I’m using PostgreSQL. It should be similar to other SQL database engines.
Pre-conditions
Assume that we have histories table.
We want to query DAILY retention from 2021–05–22T00:00:00 to 2021–05–26T23:59:59.999Z
+-----------+-------------------------+
| person_id | start |
+-----------+-------------------------+
| 1 | 2021–04–30 00:43:17.927 |
| 1 | 2021–05–22 00:43:17.927 |
| 1 | 2021–05–23 00:43:17.927 |
| 1 | 2021–05–25 00:43:17.927 |
| 2 | 2021–05–23 00:43:17.927 |
| 2 | 2021–05–24 00:43:17.927 |
| 2 | 2021–05–26 00:43:17.927 |
| 2 | 2021–05–27 00:43:17.927 |
| 3 | 2021–05–19 00:43:17.927 |
| 3 | 2021–05–23 00:43:17.927 |
+-----------+-------------------------+





