This is the second of a three-part series on predicting customers’ “due-back.” Reliable prediction of when customers are due to return to the casino allows the marketing department to correctly select who to include in the next marketing campaign and to decide upon the right level of incentives to offer. This relies largely on segmentation, which is a mathematical classification system.
Because humans are natural classification processing machines, we constantly (and sometimes unwittingly) classify almost everything we see. Consider how we classify types of vehicles: cars, trucks, SUVs, vans, sports cars, etc. Most people can look at a car and, without a second thought, allocate it into its vehicle class. We may also take a step further and classify the type of driver based on that class. Computers can also classify things, though they might struggle with classifying cars. However, computers are great at producing customer segments based on similar types of behavior. These similar behaviors are translated into behavioral profiles and provide wonderful mechanisms for the marketing department to act on.
In this article, we discuss market segmentation methods and also describe how to predict the due-back of different customer segments.
What is Market Segmentation?
Market segmentation is a key element of marketing. Marketing theory and practice do not work well without correct market segmentation, and a business needs to understand how customers are similar and how customers are different in order to form customer clusters that share the same value criteria.
The effectiveness of market segmentation is determined by six criteria:¹
(1) Identifiability: What is the extent to which managers can identify customer segments?
(2) Substantiality: Is the customer segment large enough to be profitable?
(3) Accessibility: How easily can the customer segments be reached through promotional efforts?
(4) Responsiveness: Is each segment’s response unique to promotional efforts?
(5) Stability: Are the customer segments stable over time?
(6) Actionability: Is the identification of customer segments helpful in making decisions on marketing efforts?
The concept of market segmentation was introduced in 1956 by Wendell Smith,² who gave the following definition: “Market segmentation involves viewing a heterogeneous market as a number of smaller homogeneous markets, in response to differing preferences, attributable to the desires of customers, for more precise satisfactions of their varying wants.”
This definition of market segmentation has stood the test of time³ and supports the view of segmentation as an operator’s conceptual model of the market. Segments are not necessarily physical entities (e.g., ZIP codes) but are determined by researchers and managers in order to help them to better serve their customers.⁴
Segmentation splits a group of customers or potential customers into several clusters in such a way that the customers within a cluster share similar interests or have similar tendencies (such as taking a trip to a casino in the same month), and customers in different clusters are dissimilar.
As is indicated by Smith’s definition, these methods were first designed to divide the customers into segments and then work with those segments over a long period of time. But because we have since evolved more business-specific methods—and often more dynamic uses—for customer segmentation, we are now likely to need to repeat the segmentation process as part of our regular marketing efforts (for example, the due-back calculation).
A casino can segment customers on the basis of their visit patterns, analyze the different segments so that segment-specific marketing programs can be designed, and then use its predictions of the next arrival for all clusters as the basis for measuring the effectiveness of the marketing program.
Next we will discuss the details of clustering to form customer segments, and then show how customer due-back for a cluster can be predicted.
Segmentation is essentially a clustering of customers, and there are many clustering methods available. In the statistics literature,⁵ these methods broadly fall into two categories:
(1) Hierarchical, which involves either a series of successive mergers (agglomerative), starting with as many clusters as there are customers in the database, or a series of successive splits (divisive), starting with one cluster containing all customers.
(2) Non-hierarchical, in which the number of clusters can either be specified or determined from data (e.g., k-means clustering).
The hierarchical methods start with computing an n x n matrix of similarities (or dissimilarities) among customers, where n is the number of customers in the database. The non-hierarchical method of k-means clustering, on the other hand, does not need an n x n similarity or dissimilarity matrix, and therefore is more suitable for customer segmentation for large databases.
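To make the scaling argument concrete, the short sketch below counts how many pairwise dissimilarities a hierarchical method would need for n customers, and how many customer-to-centroid distances a single pass of k-means computes. The function names are our own, chosen for illustration, and the larger database sizes are hypothetical.

```python
def pairwise_matrix_entries(n: int) -> int:
    """Distinct customer pairs in a symmetric n x n dissimilarity matrix."""
    return n * (n - 1) // 2


def kmeans_distances_per_pass(n: int, k: int) -> int:
    """Customer-to-centroid distances computed in one k-means pass."""
    return n * k


# 1,098 is the size of the simulated dataset used later in this article;
# the two larger sizes are hypothetical.
for n in (1_098, 100_000, 1_000_000):
    print(n, pairwise_matrix_entries(n), kmeans_distances_per_pass(n, k=4))
```

For the 1,098 customers in our example, this gives 602,253 pairs against 4,392 distances per pass. The pair count grows with the square of the database size, while the k-means count grows in proportion to it.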
Segmentation methods have received a lot of attention in marketing literature as well.⁶ Artificial neural networks (ANN) have been used for clustering,⁷ and an ANN method appears to yield clusters that are more homogeneous than clusters produced by k-means or mixture model methods, as the higher dimensions can be “warped” to fit the model.
Three basic approaches to market segmentation are:⁸
(1) A priori segmentation, in which the user defines a basis for clustering (e.g., the customers’ favorite brand) and the initial clusters are then further split using, for example, demographics.
(2) Post hoc segmentation, in which customers are clustered using hierarchical or non-hierarchical multivariate statistical methods on the customer database.
(3) Hybrid segmentation, in which customers are first grouped according to, for example, their favorite brand, and then each group is further clustered by a hierarchical or non-hierarchical multivariate statistical method.
We will use k-means clustering in our illustrative example, since it is known to work well for large databases of customers and has, in our experience, provided excellent classifications on gaming behavioral data.
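For readers who have not seen k-means in action, here is a minimal sketch of the standard iteration (Lloyd's algorithm) applied to rows of monthly visit counts. It illustrates the technique and is not the tool we used for this article. The function names, the seed, and the six-customer dataset are all made up for the example.

```python
import random
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two visit profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows: List[Vector], k: int, max_iter: int = 100,
           seed: int = 0) -> Tuple[List[int], List[List[float]]]:
    """Assign each row to one of k clusters; return (labels, centroids)."""
    rng = random.Random(seed)
    # Start from k customers picked at random (a simple, common choice).
    centroids = [list(row) for row in rng.sample(rows, k)]
    labels = [-1] * len(rows)
    for _ in range(max_iter):
        # Assignment step: each customer joins the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:  # no customer moved, so we have converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical data: six customers, visits in three consecutive months.
visits = [[8, 9, 7], [9, 8, 8], [7, 9, 9], [0, 1, 0], [1, 0, 1], [0, 0, 1]]
labels, centroids = kmeans(visits, k=2)
print(labels)
```

On this toy data, the three frequent visitors end up in one cluster and the three occasional visitors in the other. Which cluster is numbered 0 and which 1 depends on the random starting centroids.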
The Database Meets the Statistical Tool
Segmentation methods were once the domain of specialist tools and specialist users, and databases were once places where we could store and query data. Now vendors on either end of the spectrum (database vendors and statistical tools vendors) are offering data management storage and statistical functions as one package.
This combination of capabilities has opened the door to classification becoming a routine part of the reporting process. So the challenge today is not so much executing the segmentation process as it is building the segmentation process as part of a business’s best practices and understanding how the system works.
An Illustrative Example
We will use a simulated dataset of 1,098 customers to illustrate the process of segmentation and predict customer due-back. The dataset includes customer ID and the number of visits each month from October 2006 to August 2009. A small portion of the simulated dataset is shown in Table 1.
We used a Principal Components Analysis (PCA) to get an estimate of the number of segments or clusters present in the data. Our PCA suggested that there are four clusters in this dataset, and we have therefore used a k-means cluster analysis with four clusters. Figure 1 shows a plot of the total number of visits over the time period for each of the four customer segments, and Figure 2 shows the same as a cumulative percentage graph.
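There is more than one way to read a cluster count from a PCA, and the sketch below shows only one heuristic, which may differ from the rule applied to our dataset: count how many principal components are needed to explain a chosen share of the total variance. The 90 percent threshold, the function names, and the tiny dataset are assumptions for illustration. The eigenvalues are found by power iteration to keep the example self-contained; a statistics package would normally do this step.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def covariance_matrix(rows: Matrix) -> Matrix:
    """Sample covariance between the columns (months) of a customer table."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def explained_variance_ratios(rows: Matrix, steps: int = 200,
                              seed: int = 0) -> List[float]:
    """Share of total variance carried by each component, largest first."""
    cov = covariance_matrix(rows)
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))  # trace equals total variance
    if total == 0:
        return [0.0] * d
    rng = random.Random(seed)
    ratios = []
    for _component in range(d):
        vec = [rng.random() + 0.1 for _ in range(d)]
        eigenvalue = 0.0
        for _step in range(steps):
            product = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in product))
            if norm < 1e-12:  # no variance left in any direction
                eigenvalue = 0.0
                break
            vec = [x / norm for x in product]
            # The Rayleigh quotient of the unit vector estimates the eigenvalue.
            eigenvalue = sum(
                vec[i] * sum(cov[i][j] * vec[j] for j in range(d))
                for i in range(d)
            )
        ratios.append(max(eigenvalue, 0.0) / total)
        # Deflation: remove the component just found before the next search.
        for i in range(d):
            for j in range(d):
                cov[i][j] -= eigenvalue * vec[i] * vec[j]
    return ratios


def components_for_share(ratios: List[float], share: float = 0.9) -> int:
    """Smallest number of leading components whose ratios sum to `share`."""
    running = 0.0
    for count, ratio in enumerate(ratios, start=1):
        running += ratio
        if running >= share:
            return count
    return len(ratios)


# Hypothetical table: four customers, visits in two months.
sample = [[0, 0], [2, 0], [0, 1], [2, 1]]
ratios = explained_variance_ratios(sample)
print(ratios, components_for_share(ratios, share=0.9))
```

Whatever rule is used, the count should be treated as a starting estimate and checked against the resulting cluster profiles.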
We will cover the modeling of each cluster’s behavior in more depth in a future article; suitable techniques include ARIMA models (time-series models whose seasonal variants are effective at forecasting seasonality) and neural network forecasting. These models typically require large datasets to produce reliable results. We did fit a multiple linear regression equation to the number of visits by each cluster in our simple test data, but the R² value in each case was less than 50 percent, so we decided to take a simpler approach.
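To show why a spiky visit series produces a low R², the sketch below fits a straight line by least squares and computes R² as one minus the ratio of residual to total sum of squares. It uses a single predictor (the month index) as a simplifying assumption, whereas our test used a multiple regression. The function names and the six-month series are invented for illustration.

```python
from typing import List, Tuple


def fit_trend_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs: List[float], ys: List[float],
              slope: float, intercept: float) -> float:
    """Share of the variation in ys that the fitted line explains."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical cluster: flat visit counts with one large spike in month 3.
months = [1, 2, 3, 4, 5, 6]
cluster_visits = [10, 10, 100, 10, 10, 10]
slope, intercept = fit_trend_line(months, cluster_visits)
print(round(r_squared(months, cluster_visits, slope, intercept), 3))
```

A single spike dominates the total variation, and a straight line cannot follow it, so the fit explains very little of the series.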
Cluster 1: This cluster has huge spikes in months 3 and 24. It is also generally quite low in proportion to the other clusters but is clearly capable of significant revenue. The question is how?
Cluster 2: This cluster was very strong until month 4, when customers dropped away. The question is why did they stop visiting?
Cluster 3: This cluster is similar to cluster 1, except its peak periods are months 26 and 34. The question now is why did they visit then?
Cluster 4: This cluster is similar to clusters 1 and 3, except it has a single peak in month 33. The question is, again, why did they visit then?
A Simple Approach to Analysis
These clustering results can be used for marketing. One reasonable approach would be to calculate the total number of visits by each cluster for each month of the year, and also the percent of monthly total by each cluster (see Table 2).
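The calculation just described can be sketched in a few lines. The example assumes each record is a (cluster, calendar month, visits) triple, which is our own layout for illustration; the records shown are invented and are not taken from Table 2.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[int, str, int]  # (cluster, calendar month, visits)


def monthly_share_table(
    records: Iterable[Record],
) -> Dict[str, Dict[int, Tuple[int, float]]]:
    """Return, per month and cluster, (total visits, percent of month total)."""
    totals: Dict[str, Dict[int, int]] = defaultdict(lambda: defaultdict(int))
    for cluster, month, visits in records:
        totals[month][cluster] += visits  # pool the same month across years
    table: Dict[str, Dict[int, Tuple[int, float]]] = {}
    for month, by_cluster in totals.items():
        month_total = sum(by_cluster.values())
        table[month] = {
            cluster: (count,
                      100.0 * count / month_total if month_total else 0.0)
            for cluster, count in by_cluster.items()
        }
    return table


# Hypothetical records for two clusters.
records = [
    (1, "Jan", 30), (2, "Jan", 50), (1, "Jan", 20),
    (1, "Feb", 10), (2, "Feb", 30),
]
for month, row in monthly_share_table(records).items():
    print(month, row)
```

Pooling the same calendar month across years is what lets the table act as a seasonal baseline for each cluster.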
The decisions that marketing makes will depend on its strategies and the results of further analysis on each of the clusters. The results of its efforts can then be measured in subsequent months to see if there are variations from previous years. For example, in January, variations from 37.7 percent of visits from cluster 2 would indicate a change in behavior relative to the total customer base.
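As a small worked example of that comparison, the sketch below measures how far a cluster's share of visits in a new month has moved from its baseline, in percentage points. The 37.7 percent baseline is the January figure for cluster 2 quoted above; the new-month visit counts and the function name are hypothetical.

```python
def share_shift(baseline_pct: float, cluster_visits: int,
                total_visits: int) -> float:
    """Observed share minus baseline share, in percentage points."""
    observed_pct = 100.0 * cluster_visits / total_visits
    return observed_pct - baseline_pct


# Hypothetical new January: cluster 2 made 320 of 1,000 total visits.
shift = share_shift(baseline_pct=37.7, cluster_visits=320, total_visits=1000)
print(f"{shift:+.1f} percentage points")
```

How large a shift should prompt action is a judgment for the marketing team; the sketch only quantifies the change.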
The approach we have taken shows how the power of the computer can be used to produce groups of customers with similar frequency patterns (but not clusters of vehicles). Decisions about how to interact with these groups of customers rely on a combination of intuition and experience. We are now truly entering the world of the database marketing team. The next step is to show how we can take one cluster, analyze it to understand its customers’ behavior, and then forecast the due-back date for each individual.
1 Wedel, Michel and Kamakura, Wagner A. (1999). Market Segmentation: Conceptual and Methodological Foundations. Springer.
2 Smith, Wendell (1956). “Product Differentiation and Market Segmentation as Alternative Marketing Strategies.” Journal of Marketing. July. 3–8.
3 Wedel, Michel and Kamakura, Wagner A. (2002). “Introduction to the Special Issue on Market Segmentation.” International Journal of Research in Marketing. Vol. 19. 181–183.
4 Wedel and Kamakura (1999).
5 Johnson, Richard A. and Wichern, Dean W. (2002). Applied Multivariate Statistical Analysis, 5th Edition. Prentice Hall, Upper Saddle River, NJ.
6 Wedel and Kamakura (2002).
7 Boone, Derrick S. and Roehm, Michelle (2002). International Journal of Research in Marketing. Vol. 19. 287–301.
8 Green, Paul E. (1977). “A New Approach to Market Segmentation.” Business Horizons. 61–73.