Science, Technology, Engineering and Mathematics.
Open Access

CUSTOMER SEGMENTATION AND CHURN PREDICTION BASED ON K-MEANS AND RANDOM FOREST: A CASE STUDY OF E-COMMERCE DATA

Volume 7, Issue 2, Pp 14-19, 2025

DOI: https://doi.org/10.61784/ejst3071

Author(s)

ZhuoRan Li

Affiliation(s)

School of Economics, Nanjing University of Finance & Economics, Nanjing 210023, Jiangsu, China.

Corresponding Author

ZhuoRan Li

ABSTRACT

This study segments customers with the K-means clustering algorithm and predicts customer churn with a random forest model. The transactional data include order date, customer name, region, logistics company, purchase quantity, payment amount, and purchase frequency. K-means clustering identified four customer segments with distinct purchasing habits. The random forest model predicted customer churn and indicated that payment amount and region were the most important attributes in determining the probability of churn. The results suggest that combining K-means clustering and random forest for customer segmentation and churn prediction is effective and yields useful insights for precision marketing.
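
To make the segmentation step concrete, the following is a minimal sketch of K-means on customer features, not the author's implementation. The customer rows are invented for illustration, only two of the abstract's attributes (payment amount and purchase frequency) are used, and the min-max scaling and deterministic farthest-point seeding are assumptions chosen to keep the example reproducible. The value k=4 mirrors the four segments reported in the abstract.

```python
import math

# Hypothetical customers: (payment_amount, purchase_frequency). Illustrative only.
customers = [
    (20, 1), (30, 2), (25, 1),
    (900, 2), (950, 1), (880, 3),
    (40, 20), (35, 22), (50, 19),
    (920, 21), (980, 20), (890, 23),
]

def min_max_scale(rows):
    """Rescale every column to [0, 1] so no feature dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

def farthest_point_init(points, k):
    """Assumed seeding rule: start at points[0], then repeatedly add the
    point that is farthest from its nearest already-chosen centroid."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so stop
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

centroids, labels = kmeans(min_max_scale(customers), k=4)
for row, label in zip(customers, labels):
    print(row, "-> segment", label)
```

In a workflow like the one the abstract describes, the resulting segment labels could be examined alongside a separately trained churn classifier; that step is not shown here.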

KEYWORDS

K-means; Random forest; Customer segmentation; Churn prediction

CITE THIS PAPER

ZhuoRan Li. Customer segmentation and churn prediction based on K-means and random forest: a case study of e-commerce data. Eurasia Journal of Science and Technology. 2025, 7(2): 14-19. DOI: https://doi.org/10.61784/ejst3071.

All published work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright © 2017 - 2025 Science, Technology, Engineering and Mathematics. All Rights Reserved.