Customer retention over the years using DAX in Power BI

Data source used for this post is the ‘Internet Sales’ data mart of the ‘AdventureWorksDW2017´ SQL server database.

In this post, we will write and use DAX formulas to get the customer retention count when compared with last year and when compared with all the previous years. In addition, we will also look at testing the DAX results using SQL.

DAX formula to calculate the distinct customers who had a sale record in the current year in comparison:

Customer Count = 
 DISTINCTCOUNT
             (
                 'Internet Sales'[CustomerKey]
             )

DAX formula to calculate the year-on-year customer retention count – customers who had a sale record in the last year and in the current year in comparison:

Retained YOY Customer Count = 
 COUNTROWS
 ( 
 CALCULATETABLE
     ( 
         DISTINCT( 'Internet Sales'[CustomerKey] ),
         CALCULATETABLE
         ( 
             DISTINCT( 'Internet Sales'[CustomerKey] ),
             SAMEPERIODLASTYEAR('DateDimension'[Date])
         )               
     )          
 )

DAX formula to calculate the customer retention count over the years – customers who had a sale record in the previous years and in the current year in comparison:

Retained Old Customer Count = 
 COUNTROWS
 ( 
 CALCULATETABLE
     ( 
     DISTINCT( 'Internet Sales'[CustomerKey]  ),
     CALCULATETABLE
         ( 
             DISTINCT( 'Internet Sales'[CustomerKey]),
             DATESBETWEEN
                 (
                     'DateDimension'[Date],
                     BLANK(),
                     DATE(YEAR(MIN(DateDimension[Date]))-1,12,31)
                 )
         ) 
     )          
 )    

Below are the results of the DAX formulas when used in a Power BI dashboard.

SQL query to check the above results in the data warehouse directly.

BEGIN

 IF OBJECT_ID('Tempdb..#TempFIS_subset') IS NOT NULL 
 DROP TABLE #TempFIS_subset
;

 SELECT 
     FIS.[CustomerKey], 
     YEAR(dd.[FullDateAlternateKey]) [OrderYear]
 INTO #TempFIS_subset
 FROM [dbo].[FactInternetSales] FIS WITH(NOLOCK) 
     INNER JOIN [dbo].[DimDate] dd WITH(NOLOCK)
         ON FIS.[OrderDateKey] = dd.[DateKey]
 ;

 CREATE NONCLUSTERED INDEX [IX_CustomerKey] 
 ON #TempFIS_subset([OrderYear])
 INCLUDE ([CustomerKey])
 ;

 SELECT    
 FS.[OrderYear] AS [Calendar Year]
 ,COUNT(DISTINCT FS.[CustomerKey]) AS [Customer Count]
 ,COUNT(DISTINCT YOY.[CustomerKey]) AS [Retained YOY Customer Count]
 ,COUNT(DISTINCT RetainOld.[CustomerKey]) AS [Retained Old Customer Count]
 FROM #TempFIS_subset FS 
     LEFT JOIN #TempFIS_subset YOY
         ON FS.[OrderYear]-1     = YOY.[OrderYear]
         AND FS.[CustomerKey]    = YOY.[CustomerKey]
     LEFT JOIN #TempFIS_subset RetainOld
         ON FS.[OrderYear]       > RetainOld.[OrderYear]
         AND FS.[CustomerKey]    = RetainOld.[CustomerKey]
 GROUP BY FS.[OrderYear]
 ORDER BY FS.[OrderYear]
 ;

 IF OBJECT_ID('Tempdb..#TempFIS_subset') IS NOT NULL 
 DROP TABLE #TempFIS_subset
;

 END

Below are the SQL results:

Above verification concludes this post. Hope this helps!

Leave a comment