Abstract:
Customer segmentation is the process of dividing customers into groups in order to identify their shared characteristics and values. In some industries, however, a customer can belong to several groups at once; this is especially common in home retail, where a single customer may be a member of many groups simultaneously. Our contribution in this direction is threefold: (i) we propose probabilistic topic models for soft customer segmentation, in which each customer is assigned to multiple groups based on the items they have purchased; (ii) we propose encoding the items as vectors using nonnegative matrix factorization on binarized data (biNMF) in order to capture their contextual relationships, in contrast to previous studies, which used bag-of-words or Word2Vec; (iii) we apply this model to a real home-retailer dataset for customer segmentation, and the results show that it achieves the highest topic quality and comparable data-fitting performance relative to other topic models and other encoding methods. On a downstream item-recommendation task, our model achieves the highest average probabilistic precision over the top-20 products (Probabilistic Precision@20) among all models whose segmented topics are reasonably diverse.
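As a rough illustration of the item-encoding idea in the second contribution, the following minimal sketch binarizes a customer-by-item purchase-count matrix and factorizes it into nonnegative factors. It is a sketch under stated assumptions, not necessarily the biNMF formulation used in this work: the input layout (customers as rows, items as columns), the squared-error loss with standard Lee-Seung multiplicative updates, the rank, the function names, and the toy data are all illustrative choices.

```python
import random

def binarize(counts):
    """Map purchase counts to 0/1 indicators (1 = bought at least once)."""
    return [[1.0 if c > 0 else 0.0 for c in row] for row in counts]

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]

def nmf(x, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor x (n x m) as w (n x rank) times h (rank x m), all nonnegative.

    Assumption: squared-error loss with Lee-Seung multiplicative updates.
    """
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # h <- h * (w^T x) / (w^T w h), elementwise
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # w <- w * (x h^T) / (w h h^T), elementwise
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h

def squared_error(x, w, h):
    """Sum of squared differences between x and its reconstruction w h."""
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(x, approx) for a, b in zip(ra, rb))

# Toy purchase counts (made-up data): rows are customers, columns are items.
counts = [
    [3, 1, 0, 0],
    [2, 5, 0, 0],
    [0, 0, 4, 1],
    [0, 0, 1, 2],
]
x = binarize(counts)
w, h = nmf(x, rank=2)
item_vectors = transpose(h)  # one nonnegative vector of length `rank` per item
```

In this sketch each column of `h` serves as an item vector, so items that tend to be bought by the same customers end up with similar vectors. Those vectors could then stand in for bag-of-words or Word2Vec representations as input to a topic model.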