
Basic Cohort Analysis

At a basic level, cohort analysis is a useful tool for understanding the level of engagement of groups of users, segmented by their first point of contact with you. It helps answer questions like: how long does a user stay engaged before churning from your platform? Because users are segmented by the date they came in, you can tell whether an issue is isolated to a specific cohort or is system-wide.

Given the general view that startups should be constantly running experiments, cohorts can be aligned to experiments, so that the behaviour of a specific cohort can be related to the experiment it was exposed to. Experiments might include a sign-on incentive or more refined ad targeting. Isolating one cohort’s behaviour from another’s gives us a more accurate understanding of the effect of any experiment on a cohort.

Cohort analysis requires an intermediate level of data analysis. It’s likely you’ll need tools to do it without significant burden, and you also need a reasonable number of users to make this level of analysis worth the effort. However, if you’re getting, let’s say, 500+ DAUs (daily active users), it’s probably worth your time to look at this even on a basic level. Note that Google Analytics has a built-in cohort analysis feature, so that’s a quick place to start. For the purpose of this article, however, I’m going to use Excel as an example.

I use two views for cohort analysis. The first I’ll call a ‘cohort stack’.

As you can see, in the cohort stack each cohort is a row and each column is an elapsed day. This helps me see whether the attrition of each cohort follows a similar declining pattern, or whether some cohorts buck the trend.
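The article builds this view in Excel, but the same table can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `events` log, the user IDs and the `cohort_stack` function are all hypothetical, and a user’s cohort is taken to be the first date they appear in the log.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw activity log: (user_id, date the user was active).
events = [
    ("u1", date(2024, 1, 1)), ("u1", date(2024, 1, 2)), ("u1", date(2024, 1, 3)),
    ("u2", date(2024, 1, 1)), ("u2", date(2024, 1, 2)),
    ("u3", date(2024, 1, 2)), ("u3", date(2024, 1, 4)),
    ("u4", date(2024, 1, 2)),
]

def cohort_stack(events):
    """Return {cohort_date: [share of cohort active on day 0, day 1, ...]}."""
    # Assumption: a user's cohort is the first date they were seen.
    first_seen = {}
    for user, day in events:
        if user not in first_seen or day < first_seen[user]:
            first_seen[user] = day

    # Collect the distinct users active at each elapsed-day offset, per cohort.
    active = defaultdict(lambda: defaultdict(set))
    for user, day in events:
        cohort = first_seen[user]
        active[cohort][(day - cohort).days].add(user)

    stack = {}
    for cohort in sorted(active):
        size = len(active[cohort][0])      # every member is active on day 0
        last_offset = max(active[cohort])  # last elapsed day with any activity
        stack[cohort] = [
            len(active[cohort][d]) / size for d in range(last_offset + 1)
        ]
    return stack

stack = cohort_stack(events)
for cohort, row in stack.items():
    # One row per cohort, one column per elapsed day.
    print(cohort, [f"{rate:.0%}" for rate in row])
```

Each printed row corresponds to one row of the cohort stack, so a cohort whose percentages fall away faster than its neighbours is the kind of anomaly worth investigating.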

This is the view you see in Google Analytics, and generally it’s sufficient. Anomalies stand out visually, which makes it easy to pick out cohorts for further investigation.

The other view I use I’ll call the ‘chronological cohort stack’.

Here you can see it’s similar to the cohort stack, but each cohort is additionally aligned to the date it entered, so the stacks look like a staircase. The benefit over the normal cohort stack is that we can evaluate whether an issue with a cohort is tied to a specific calendar event, for example a change in regulations on a particular date.
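As a rough sketch of how the staircase can be produced, the snippet below shifts each row of a cohort stack so that its columns are calendar dates rather than elapsed days. The input numbers and the `chronological_stack` function are hypothetical and chosen purely for illustration.

```python
from datetime import date, timedelta

# Hypothetical cohort stack: cohort start date -> retention by elapsed day.
stack = {
    date(2024, 1, 1): [1.0, 0.6, 0.5, 0.4],
    date(2024, 1, 2): [1.0, 0.6, 0.2],
    date(2024, 1, 3): [1.0, 0.3],
}

def chronological_stack(stack):
    """Re-index each cohort row by calendar date instead of elapsed day."""
    start = min(stack)
    end = max(c + timedelta(days=len(row) - 1) for c, row in stack.items())
    calendar = [start + timedelta(days=i) for i in range((end - start).days + 1)]

    rows = {}
    for cohort in sorted(stack):
        offset = (cohort - start).days
        # None marks dates before the cohort existed or after its data ends.
        padded = [None] * offset + list(stack[cohort])
        padded += [None] * (len(calendar) - len(padded))
        rows[cohort] = padded
    return calendar, rows

calendar, rows = chronological_stack(stack)
print("cohort    ", [d.isoformat() for d in calendar])
for cohort, row in rows.items():
    print(cohort, ["" if r is None else f"{r:.0%}" for r in row])
```

Reading down a single calendar column then shows how every cohort behaved on that date, which is what makes a date-specific event visible.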

You probably want to view both in a dashboard, since both provide signal on the health of your cohorts, which ultimately lets you decide where to focus your attention.

When reviewing the cohorts, look out for sudden or significant drops in return rates. Any big drop is a signal that something is affecting user behaviour, so if the return rate of customers has dropped by 50% between week 1 and week 2, we would want to really dig into what changed between those two weeks. The cohort report is a signal report: it won’t tell you where to look, only that something is wrong, just like the engine light on a dashboard. Because of this, it’s important to have additional reporting that helps you triangulate the causes of the discrepancies the cohort report shows.
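One way to turn this check into something automatic is to flag any period-on-period fall above a chosen threshold. The sketch below is an assumption-laden example: the return rates are invented, `flag_drops` is a hypothetical helper, and the 50% threshold simply mirrors the figure used above rather than being a recommended value.

```python
# Hypothetical weekly return rates per cohort (index 0 is the signup week).
weekly_return = {
    "2024-W01": [1.00, 0.60, 0.50, 0.45],
    "2024-W02": [1.00, 0.60, 0.25, 0.22],
}

def flag_drops(rates, threshold=0.5):
    """Return (cohort, period, relative_drop) for falls of at least `threshold`.

    The drop is measured relative to the previous period, so going from
    0.60 to 0.30 counts as a 50% drop.
    """
    flags = []
    for cohort, row in rates.items():
        for period in range(1, len(row)):
            prev, cur = row[period - 1], row[period]
            if prev > 0:
                drop = (prev - cur) / prev
                if drop >= threshold:
                    flags.append((cohort, period, round(drop, 2)))
    return flags

flags = flag_drops(weekly_return)
for cohort, period, drop in flags:
    print(f"{cohort}: return rate fell {drop:.0%} going into week {period}")
```

With this sample data only the second cohort is flagged, going into week 2. Like the report itself, the flag says that something changed, not what changed.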

Generally you want to segment the cohort analysis by week, though daily and monthly are also reasonable. Which to go for depends very much on the kind of business you have and your expectations of usage frequency. For example, for a trading platform like Binance, or for Facebook, you probably want to focus on daily to weekly, but for an insurance platform monthly may be sufficient.
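Switching granularity mostly comes down to how a date is mapped to its cohort bucket. The helper below is a hypothetical sketch, and it assumes weeks start on Monday, which may not match your reporting calendar.

```python
from datetime import date, timedelta

def cohort_key(day, granularity="week"):
    """Map a date to the first day of its day, week or month bucket."""
    if granularity == "day":
        return day
    if granularity == "week":
        # Assumption: weeks start on Monday (date.weekday() is 0 on Monday).
        return day - timedelta(days=day.weekday())
    if granularity == "month":
        return day.replace(day=1)
    raise ValueError(f"unknown granularity: {granularity}")

signup = date(2024, 1, 10)
for granularity in ("day", "week", "month"):
    print(granularity, cohort_key(signup, granularity))
```

Using the same key for both the signup date and each later activity date keeps cohorts and elapsed periods on the same grid.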

You can apply cohort analysis to sign-ins, but also to spend/purchases or even uses of a specific feature; which you choose is determined by the area of the business you’re most focused on. This means you need to be capturing raw data effectively from the start to allow for this kind of cohort analysis. Make sure your data teams are aware that this analysis will be required as your organisation grows, so they can ensure data is captured in raw form in a compatible way.
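To make “raw form” a little more concrete, here is one possible shape for a raw event record, plus a filter that reduces it to the per-user, per-day activity a cohort analysis needs. The field names, event types and the `activity_for` helper are all assumptions for illustration, not a prescribed schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass(frozen=True)
class Event:
    """One raw, unaggregated event (hypothetical schema)."""
    user_id: str
    event_type: str                 # e.g. "sign_in", "purchase", "feature_used"
    occurred_at: datetime
    amount: Optional[float] = None  # only meaningful for purchases

def activity_for(events, event_type):
    """Reduce raw events of one type to sorted, unique (user_id, date) pairs."""
    return sorted(
        {(e.user_id, e.occurred_at.date()) for e in events if e.event_type == event_type}
    )

raw = [
    Event("u1", "sign_in", datetime(2024, 1, 1, 9, 0)),
    Event("u1", "purchase", datetime(2024, 1, 1, 9, 5), amount=20.0),
    Event("u1", "sign_in", datetime(2024, 1, 1, 18, 30)),
    Event("u2", "sign_in", datetime(2024, 1, 2, 8, 0)),
]

sign_ins = activity_for(raw, "sign_in")
purchases = activity_for(raw, "purchase")
print(sign_ins)
print(purchases)
```

Because the events are stored per occurrence with a type and a timestamp, the same log can feed a sign-in cohort, a purchase cohort or a feature-usage cohort without re-instrumenting anything.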
