Authors’ Note: This is the first of a three-part series on collecting relevant data so that “customer due-back” can be predicted; this information can be used by all marketing programs. Correctly modeling customer due-back allows the operator to differentiate between customers who should be targeted for the next campaign and those who should be left alone.
Marketing departments are challenged to ensure that the right level of incentive is sent to every customer. One of the critical pieces of information is when customers are “due back.” The approach we cover in this article describes how you can predict the expected date on which a customer or group of customers will next visit. We will discuss some of the challenges, such as how to handle variations in frequency patterns, and introduce how this due-back prediction method can be combined with clustering methods to analyze the visitation behavior of a group of customers who belong to the same segment.
In order to develop a predictive model, appropriate data first has to be collected. It would not be possible to collect data for modeling “customer due-back” without the loyalty cards now offered by grocery stores, DVD rental stores, movie theaters, and of course, casinos, to name a few. Loyalty programs started in the early 1980s with American Airlines’ Frequent Flyer Program.1 The hotel industry soon followed suit, with Marriott launching its Honored Guest Awards program, in which customers’ rewards were based on the total number of days and nights they spent in the hotel per trip. Grocery store chains started loyalty programs in the early 1990s, with Tesco launching its Clubcard loyalty program in the U.K. and Albertsons and CVS respectively launching the Preferred Savings and ExtraCare programs in the U.S. The casino industry was not to be left behind, and in September 1997, Harrah’s Total Gold2 became the first loyalty program offered by a gaming company. Harrah’s share of customers’ gaming budget increased from 36 percent to 50 percent after the launch of Total Gold. Its revenue and profit numbers also increased steadily after the launch.
Almost every major casino now has a loyalty card program and routinely collects trip-level data on its customer base. Casinos use this data to determine their marketing efforts for customer segments and individual customers. This selection of customers is made based on their profitability to the casino, as determined by customer selection metrics.3 In the next section, we will discuss a few of these metrics and explain what these metrics have to do with developing a predictive model for due-back, in case you have been wondering about that.
Frequency analysis is a wonderful twist on classical analytical techniques. In analysis we often consider time to be the fundamental independent variable, so we often graph the visitation patterns of customers. We can enhance this analysis by classifying the data based on frequency. The following example shows the results of this kind of frequency analysis.
First let’s take a look at the total visits of all customers, shown in Figure 1. The graph shows that customer visitation peaked in month 3, then again in months 6 and 7.
The segmentation in Figure 2 (see below) has been built by classifying the different frequency patterns of the visits in Figure 1. Methods for constructing this segmentation include processing the data with a radial basis function (RBF), a Fast Fourier Transform4 or a clustering technique5; these methods will be the subject of a later article in this series on customer due-back.
By graphing this data (Figure 3), we can see that the frequency patterns are quite different among customer segments. For example, Segment B has no peaks at all, so marketing programs designed around the overall peaks in Figure 1 would not align with the behavior of Segment B customers.
Following are brief descriptions of each of the segments:
• Segment A customers alternate their visitation months
• Segment B customers are very consistent from month to month
• Segment C customers may have a spike of visitation every five months
• Segment D customers may have a spike of visitation every nine months
This example is designed to show how clustering the frequency patterns of visitation gives us important insight into the behavior of the customer groups. These patterns can provide directly actionable insight for the marketing team. For example, when marketing to Segment A, we may wish to change their behavior to monthly visitation, but in any case, we do expect to see a visit from a given customer every two months.
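One way to make the idea of a frequency pattern concrete is to estimate the dominant visitation period of a monthly visit series. The sketch below uses a plain discrete Fourier transform and is only an illustration: the function name and the two sample series are hypothetical and are not the data behind Figures 1 to 3, and the methods actually used for the segmentation are left to the later article in this series.

```python
import cmath

def dominant_period(monthly_visits):
    """Return the period (in months) of the strongest non-constant
    frequency component, or None if the series is flat."""
    n = len(monthly_visits)
    mean = sum(monthly_visits) / n
    centered = [v - mean for v in monthly_visits]  # remove the constant level
    best_k, best_power = None, 1e-9  # small floor ignores rounding noise
    for k in range(1, n // 2 + 1):  # frequencies up to the Nyquist limit
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return None if best_k is None else n / best_k

# Hypothetical 12-month visit counts, for illustration only.
alternating = [10, 2] * 6   # visits concentrated in every other month
steady = [6] * 12           # the same number of visits every month

print(dominant_period(alternating))  # 2.0
print(dominant_period(steady))       # None
```

A series that alternates months, like the Segment A description, yields a two-month period, while a flat series like Segment B yields no dominant period. Sharp, infrequent spikes spread their power across several harmonics, so a real implementation would need more care than this sketch takes.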
Marketing Business Performance Drivers
Before we dig too deeply into the measurement of due-back, let’s take a moment to look at the typical customer selection metrics. Database marketing analysts will often rank their customers by a numerical value assigned to each customer using one of the following customer selection metrics6:
• Recency – Frequency – Monetary Value (RFM), which is commonly used by the mail-order industry
• Share of Wallet (SOW), which is used by the high tech segment
• Past Customer Value (PCV), which is used by the financial services industry
• Customer Lifetime Value (CLV), an advanced metric that is becoming popular in many sectors
We will next show some examples of how these metrics are calculated.
The RFM approach first sorts all customers on the basis of Recency, then creates five bins such that each bin contains 20 percent of all customers. The customers in the top bin (Bin 5) get a score of 5, and the bottom bin (Bin 1) customers get a score of 1. This process is repeated for Frequency and Monetary Value. Several methods exist to obtain a composite RFM score7, such as R + F + M, or 3R + 2F + 1M.
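The binning described above can be sketched in a few lines. This is a minimal illustration, not a production scoring routine: the function name, the customer values, and the assumption that a more recent visit (fewer days since the last visit) earns a higher Recency score are ours.

```python
def quintile_scores(values, higher_is_better=True):
    """Score each value from 1 to 5 by rank so that each bin holds
    roughly 20 percent of the customers."""
    n = len(values)
    # Worst customers come first in the ordering and receive score 1.
    order = sorted(range(n), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * n
    for rank, idx in enumerate(order):
        scores[idx] = rank * 5 // n + 1
    return scores

# Hypothetical data for five customers.
recency_days = [3, 40, 12, 90, 25]      # days since last visit (lower is better)
frequency = [12, 2, 8, 1, 5]            # visits in the period
monetary = [900, 150, 600, 50, 300]     # amount spent in the period

r = quintile_scores(recency_days, higher_is_better=False)
f = quintile_scores(frequency)
m = quintile_scores(monetary)

simple = [ri + fi + mi for ri, fi, mi in zip(r, f, m)]            # R + F + M
weighted = [3 * ri + 2 * fi + mi for ri, fi, mi in zip(r, f, m)]  # 3R + 2F + 1M
print(simple)    # [15, 6, 12, 3, 9]
print(weighted)  # [30, 12, 24, 6, 18]
```

Ties are broken by input order here; a real implementation would need an explicit rule for customers with identical values.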
Figure 4 shows the amounts Customers A and B spent in casinos during the last quarter of 2008. The SOW metric is calculated by dividing the amount spent in Casino X during that period by the total amount spent in all casinos during that period.
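As a small worked illustration of the SOW calculation, the sketch below divides the spend at one casino by the spend at all casinos over the same period. The function name and the dollar amounts are hypothetical and are not taken from Figure 4.

```python
def share_of_wallet(spend_at_casino, spend_at_all_casinos):
    """Fraction of a customer's total casino spend captured by one casino."""
    if spend_at_all_casinos <= 0:
        raise ValueError("total spend must be positive")
    return spend_at_casino / spend_at_all_casinos

# Hypothetical customer: 1,500 spent at Casino X out of 6,000 at all casinos.
sow = share_of_wallet(1500, 6000)
print(f"{sow:.0%}")  # 25%
```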
The PCV metric assumes that past behavior indicates future profitability. It is calculated from the formula $PCV = \sum_{t} GC_t \times (1+r)^{t}$, where the sum is taken over each time period t, $GC_t$ represents the Gross Contribution of the customer in period t, and r represents the discount rate (e.g., 15 percent per year, or 0.0125 per month).
Figure 5 shows the calculations for PCV for Customers A and B at Casino X. The Gross Contribution is assumed to be 30 percent of the total spent.
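The PCV calculation can be sketched as follows, using the 30 percent Gross Contribution and the 0.0125 monthly rate mentioned above. The spend amounts are hypothetical and are not the values in Figure 5, and we assume that t counts the number of months before the present, so that older contributions are compounded over more periods.

```python
def past_customer_value(spend_by_months_ago, margin=0.30, monthly_rate=0.0125):
    """Sum the gross contribution of each past period, compounded to the present.

    spend_by_months_ago maps t (months before now, an assumed indexing)
    to the total amount the customer spent in that month.
    """
    return sum(margin * spend * (1 + monthly_rate) ** t
               for t, spend in spend_by_months_ago.items())

# Hypothetical customer: spent 1,000 one month ago and 500 two months ago.
pcv = past_customer_value({1: 1000, 2: 500, 3: 0})
print(round(pcv, 2))  # 457.52
```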
RFM, SOW and PCV are all based on past customer behavior alone and may lead to conflicting decisions. CLV, however, can be used to determine which customers should get preferential and/or personal treatment, which customers can be interacted with using cheaper methods (Internet, phone, cell phone), when to contact which customers, which customers will be more profitable in the future, and which customers to let go.
Several versions of CLV exist, but the aggregate approach version is $CLV = \sum_{i} \sum_{t} \frac{CM_{it}}{(1+d)^{t}}$, where the first summation is over all of the customers i; the second summation is over all time periods t; $CM_{it}$ is the average contribution margin for customer i in time period t after adjusting for marketing costs, as determined from managerial judgment or actual purchase data; and d is the discount rate, which is a function of the cost of capital for the firm obtained from financial accounting.
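A minimal sketch of the aggregate CLV sum is shown below. The customer identifiers, margins, and discount rate are hypothetical, and we assume that periods are numbered from t = 1, so that the first period's margin is discounted once.

```python
def aggregate_clv(contribution_margins, discount_rate):
    """Discounted sum of contribution margins over all customers and periods.

    contribution_margins maps a customer id to a list of margins for
    periods t = 1, 2, ... (the starting index is an assumption).
    """
    return sum(cm / (1 + discount_rate) ** t
               for margins in contribution_margins.values()
               for t, cm in enumerate(margins, start=1))

# Hypothetical margins, already net of marketing costs.
margins = {"A": [110.0, 121.0], "B": [220.0]}
clv = aggregate_clv(margins, discount_rate=0.10)
print(round(clv, 2))  # 400.0
```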
CLV is an advanced, forward-looking metric that is becoming increasingly popular in many sectors. A business using the CLV approach can benefit from a predictive model for customer due-back.
Predictive Modeling of Customer Due-Back
As we mentioned earlier, casinos collect trip-level data on their loyal customers. Some of the variables recorded are player ID, trip start and end dates, total comps given, and room comps given. Figure 6 shows an example of data extracted from the casino database, and Figure 7 plots the number of “days since last trip” for Player 147.
One way to model customer due-back is to fit a regression model to the calculated variable Y, the number of days since the last trip, as a function of the month number (1 = January; 12 = December).
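The calculated variable can be derived from trip-level records along the following lines. This is a sketch under stated assumptions: the trip dates are hypothetical, and we measure the gap from one trip start date to the next, although measuring from the previous trip's end date would be an equally reasonable choice.

```python
from datetime import date

def days_since_last_trip(trip_start_dates):
    """Return (month_number, gap_in_days) for every trip after the first."""
    ordered = sorted(trip_start_dates)
    return [(cur.month, (cur - prev).days)
            for prev, cur in zip(ordered, ordered[1:])]

# Hypothetical trip start dates for one player.
trips = [date(2008, 1, 5), date(2008, 2, 16), date(2008, 4, 1)]
gaps = days_since_last_trip(trips)
print(gaps)  # [(2, 42), (4, 45)]
```

Each pair gives the month number of a trip and the value of Y for that trip, which is the form of data the regression needs.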
A polynomial regression with a sufficiently large degree will probably fit this type of data. Since the dataset in the example is very small, we will use the radial basis function (RBF) approach and the method of least squares to fit the following model to this data:
The statistical software MINITAB was used to fit the above model to the data, which resulted in the following predictive model:
$Y = 42.6 + 90211 f_1 - 2212 f_2$ ($R^2$ = 97.7%)
| Predictor | Coef | SE Coef | T | P |
| --- | --- | --- | --- | --- |
| Constant | 42.579 | 1.067 | 39.90 | 0.000 |
| f1 | 90211 | 15037 | 6.00 | 0.004 |
| f2 | -2212.1 | 233.4 | -9.48 | 0.001 |
A high value of $R^2$, along with low P-values for each predictor, implies that the fitted model is good and can be used to predict the number of days since last trip for each customer. Figure 8, which shows a plot of the observed and fitted values of Y, also indicates that the fit is good.
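For readers without access to MINITAB, the general technique of fitting radial basis functions by least squares can be sketched in plain Python. This is not the model fitted above: the Gaussian form of the basis functions, the centers, the width, and the data are all assumptions chosen for illustration.

```python
import math

def gaussian_rbf(x, center, width):
    """Gaussian radial basis function (an assumed choice of basis)."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))

def solve(a, b):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_rbf(months, y, centers, width):
    """Least-squares coefficients [constant, b1, b2, ...] via normal equations."""
    design = [[1.0] + [gaussian_rbf(x, c, width) for c in centers]
              for x in months]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(coefs, x, centers, width):
    return coefs[0] + sum(b * gaussian_rbf(x, c, width)
                          for b, c in zip(coefs[1:], centers))

# Hypothetical setup: two basis functions centered on months 3 and 6.
centers, width = [3, 6], 1.5
months = [1, 2, 3, 4, 5, 6, 7]
true_coefs = [40.0, 10.0, -5.0]  # used only to generate synthetic data
y = [predict(true_coefs, x, centers, width) for x in months]
coefs = fit_rbf(months, y, centers, width)
```

Because the synthetic data are generated without noise, the fitted coefficients recover the generating values closely. For larger problems, a numerical library's least-squares routine would be a more stable choice than solving the normal equations directly.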
Following the construction of the regression model for all customers, we can then measure when players deviate from the model. For example, we could say that, compared to the model, these customers are overdue for a visit and could then consider a re-activation program.
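The comparison against the model can be sketched as a simple rule: flag a customer when the current absence exceeds the predicted gap by more than some tolerance. The function name, the seven-day tolerance, and the sample values are hypothetical; the right threshold is a marketing judgment rather than something the model dictates.

```python
def overdue_customers(days_since_last_visit, predicted_gap, tolerance_days=7):
    """Return the ids of customers whose current absence exceeds the
    model's predicted gap by more than tolerance_days."""
    return sorted(cid for cid, days in days_since_last_visit.items()
                  if days - predicted_gap[cid] > tolerance_days)

# Hypothetical values: days since each player's last trip, and the
# number of days the fitted model predicts between trips.
current = {147: 60, 212: 30, 305: 49}
predicted = {147: 42.6, 212: 35.0, 305: 45.0}

print(overdue_customers(current, predicted))  # [147]
```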
By classifying and analyzing frequency patterns, we have illustrated how we can build models that show the quite-different dynamics of customer behavior. Once understood, these customer dynamics can be used for personalized marketing activities.
Analysis of due-back brings the art and science of marketing together; it shows the science of who is behaving outside of the model and challenges the art of marketing with how to respond appropriately. As with many areas that combine both art and science, there is subjectivity, and hence debate, about the best methods, but that is what makes it fun and exciting.
1 V. Kumar (2008). Managing Customers by Profit. Wharton School Publishing, University of Pennsylvania.
3 V. Kumar (2008).
4 Refer to http://en.wikipedia.org/wiki/Fast_Fourier_transform.
5 Galit Shmueli, Nitin R. Patel, and Peter C. Bruce (2007). Data Mining for Business Intelligence. Wiley-Interscience.
6 V. Kumar (2008).