A SOFT VOTING MECHANISM-BASED RANDOM FOREST MULTI-MODEL ENSEMBLE APPROACH FOR EMPLOYMENT STATUS PREDICTION
Keywords:
Employment forecast, Random forest, Support vector machine, Long short-term memory, Soft voting ensemble strategy
Abstract
Employment is the cornerstone of people's livelihoods and the foundation of development. Structural employment contradictions remain prominent in China, making high-quality, full employment the foremost objective of current socio-economic development. Conventional single-model forecasting approaches show limitations on high-dimensional, complex employment data, including insufficient generalisation and a limited ability to capture feature interactions. To address this, this study constructs a random forest-based multi-model fusion model for employment status prediction, with the aim of improving predictive accuracy and stability. The study uses a dataset of 4,980 employment samples with 57 raw features. After data cleansing, Pearson correlation analysis, and feature selection based on XGBoost and AUC cross-validation, 12 key feature variables were retained. On this basis, a soft voting ensemble strategy was used to fuse three prediction models (random forest, SVM, and LSTM) into a new ensemble learning model. The predictive performance and stability of the three individual models and of the ensemble model were then evaluated. The results indicate that the ensemble model achieved an accuracy of 85.49%, with a recall of 97.34% and an F1 score of 93.21%, outperforming each individual model. It combines the strengths of the constituent models, improving adaptability to complex employment data and the robustness of predictions. This research provides a sound and effective ensemble learning approach for employment status prediction and offers practical value for supporting precise government policy-making and optimising employment services.
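The soft voting strategy named in the abstract can be illustrated with a minimal sketch. It assumes each base model (random forest, SVM, LSTM) has already produced class probabilities for every sample. The function name `soft_vote`, the equal default weights, and the probability values are hypothetical and are not taken from the study.

```python
def soft_vote(prob_sets, weights=None):
    """Fuse per-model class probabilities by weighted averaging.

    prob_sets[m][i][c] is the probability that model m assigns to class c
    for sample i. Returns (fused_probabilities, predicted_labels).
    Equal weights are an assumption; the study does not state its weights.
    """
    if weights is None:
        weights = [1.0] * len(prob_sets)
    total = sum(weights)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])

    fused, labels = [], []
    for i in range(n_samples):
        # Weighted mean of the models' probabilities for each class.
        avg = [
            sum(w * probs[i][c] for w, probs in zip(weights, prob_sets)) / total
            for c in range(n_classes)
        ]
        fused.append(avg)
        # The predicted label is the class with the highest fused probability.
        labels.append(max(range(n_classes), key=avg.__getitem__))
    return fused, labels


# Hypothetical probabilities [P(class 0), P(class 1)] for three samples.
rf_probs = [[0.30, 0.70], [0.60, 0.40], [0.45, 0.55]]
svm_probs = [[0.20, 0.80], [0.55, 0.45], [0.70, 0.30]]
lstm_probs = [[0.40, 0.60], [0.30, 0.70], [0.50, 0.50]]

fused, labels = soft_vote([rf_probs, svm_probs, lstm_probs])
print(labels)
```

For the second sample, two of the three models favour class 0, so a majority (hard) vote would return class 0. The averaged probabilities favour class 1 because the third model is much more confident. Soft voting is designed to exploit this kind of confidence information.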