CUSTOMER SEGMENTATION AND CHURN PREDICTION BASED ON K-MEANS AND RANDOM FOREST: A CASE STUDY OF E-COMMERCE DATA
Volume 7, Issue 2, Pp 14-19, 2025
DOI: https://doi.org/10.61784/ejst3071
Author(s)
ZhuoRan Li
Affiliation(s)
School of Economics, Nanjing University of Finance & Economics, Nanjing 210023, Jiangsu, China.
Corresponding Author
ZhuoRan Li
ABSTRACT
This study segments customers with the K-means clustering algorithm and predicts customer churn with a random forest model. The transactional data include order date, customer name, region, logistics company, purchase quantity, payment amount, and purchase frequency. K-means clustering identified four customer segments with distinct purchasing habits. The random forest model predicted customer churn and identified payment amount and region as the most important attributes for estimating churn probability. The results indicate that combining K-means clustering with random forest for customer segmentation and churn prediction is effective and yields useful insights for precision marketing.
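The two-stage workflow described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the data are synthetic, the churn label is generated by a made-up rule, the region is a hypothetical integer code, and scikit-learn is assumed as the library. Only the choice of four clusters and the feature names (quantity, payment amount, frequency, region) come from the abstract.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 400  # number of synthetic customers (assumption)

# Synthetic stand-ins for the per-customer features named in the abstract.
quantity = rng.poisson(5, n) + 1      # quantity bought
payment = rng.gamma(2.0, 50.0, n)     # payment amount
frequency = rng.poisson(3, n) + 1     # purchase frequency
region = rng.integers(0, 4, n)        # hypothetical integer-coded region

# Hypothetical churn rule, for illustration only: lower payment and
# lower frequency raise the churn probability; region 0 raises it slightly.
logit = 1.0 - 0.01 * payment - 0.3 * frequency + 0.2 * (region == 0)
churn = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

# Stage 1: K-means on standardized behavioural features, with four
# clusters as reported in the abstract.
behaviour = np.column_stack([quantity, payment, frequency])
scaled = StandardScaler().fit_transform(behaviour)
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
segments = kmeans.fit_predict(scaled)

# Stage 2: random forest churn classifier with feature importances.
feature_names = ["quantity", "payment", "frequency", "region"]
X = np.column_stack([quantity, payment, frequency, region])
X_train, X_test, y_train, y_test = train_test_split(
    X, churn, test_size=0.25, random_state=0, stratify=churn
)
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

accuracy = forest.score(X_test, y_test)
importances = dict(zip(feature_names, forest.feature_importances_))

print("segment sizes:", np.bincount(segments, minlength=4))
print("held-out accuracy:", round(accuracy, 3))
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

Standardizing before K-means keeps the payment amount, which has the largest numeric range, from dominating the distance computation. Treating region as an integer code is a shortcut for the sketch; a real categorical region would more plausibly be one-hot encoded.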
KEYWORDS
K-means; Random forest; Customer segmentation; Churn prediction
CITE THIS PAPER
ZhuoRan Li. Customer segmentation and churn prediction based on K-means and random forest: a case study of e-commerce data. Eurasia Journal of Science and Technology. 2025, 7(2): 14-19. DOI: https://doi.org/10.61784/ejst3071.