AN INTELLIGENT PREDICTION METHOD FOR STUDENT DEPRESSION RISK INTEGRATING ENSEMBLE LEARNING AND FEATURE ENGINEERING-Upubscience Publisher

Volume 3, Issue 5, Pp 26-33, 2025

Author(s)

YuHao Yan¹*, LinLu Chen², JingNing Huang²

Affiliation(s)

¹School of Medical Informatics Engineering, Guangzhou University of Chinese Medicine, Guangzhou 510006, Guangdong, China.

²School of Public Health and Management, Guangzhou University of Chinese Medicine, Guangzhou 510006, Guangdong, China.

Corresponding Author

YuHao Yan

ABSTRACT

Depression has become a global public health issue, and depression risk among students shows a persistent upward trend. Traditional mental health screening relies primarily on manual interviews and questionnaire assessments, which are subjective, costly, and limited in coverage. To address this, this paper proposes an intelligent prediction method for student depression risk that combines ensemble learning with feature engineering. Using the Kaggle Open Mental Health Dataset as the experimental foundation, the study first constructs high-quality data samples through multi-strategy data cleaning, with missing values repaired by KNN and regression imputation. It then extracts key psychological and behavioral features using Random Forest feature importance evaluation and supervised dimensionality reduction with Linear Discriminant Analysis (LDA), improving model interpretability and training efficiency. For model construction, a multi-model framework of heterogeneous classifiers is designed, comprising Deep Neural Networks (DNN), Support Vector Machines (SVM), LightGBM, CatBoost, and Random Forest (RF). The models are fused through strategies including Blending, weighted averaging, and soft voting. Experimental results show that the proposed Blending ensemble model outperforms the individual models on AUC, accuracy, and recall, achieving a maximum AUC of 0.9189 with robust performance and generalization capability. These findings validate the effectiveness of jointly optimizing feature engineering and ensemble learning, and they provide a feasible algorithmic framework and practical pathway for building intelligent mental health screening systems for university students.
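The pipeline described in the abstract can be illustrated with a minimal Python sketch using scikit-learn. This is not the authors' implementation. It rests on several assumptions: the data are synthetic (generated with `make_classification`) rather than the Kaggle dataset; `GradientBoostingClassifier` and `MLPClassifier` stand in for LightGBM/CatBoost and the DNN; the top-10 feature cutoff, the 30% holdout fraction, and the choice to append the LDA projection as an extra feature (rather than replace the selected features) are all illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.impute import KNNImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the survey data (assumption: binary depression label).
X, y = make_classification(n_samples=1200, n_features=20, n_informative=8,
                           random_state=0)
X[rng.random(X.shape) < 0.05] = np.nan  # inject roughly 5% missing values

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# 1) KNN imputation, fitted on the training split only to avoid leakage.
imputer = KNNImputer(n_neighbors=5).fit(X_train)
X_train, X_test = imputer.transform(X_train), imputer.transform(X_test)

# 2) Random Forest feature importance: keep the 10 highest-ranked features.
rf_selector = RandomForestClassifier(n_estimators=200, random_state=0)
rf_selector.fit(X_train, y_train)
top_k = np.argsort(rf_selector.feature_importances_)[::-1][:10]
X_train, X_test = X_train[:, top_k], X_test[:, top_k]

# 3) LDA: a binary target yields one discriminant axis; it is appended
#    here as an extra feature (illustrative choice, not from the paper).
lda = LinearDiscriminantAnalysis(n_components=1).fit(X_train, y_train)
X_train = np.hstack([X_train, lda.transform(X_train)])
X_test = np.hstack([X_test, lda.transform(X_test)])

# 4) Blending: base models learn on one part of the training data, and the
#    meta-learner learns on their predictions for a disjoint holdout part.
X_base, X_hold, y_base, y_hold = train_test_split(
    X_train, y_train, test_size=0.3, stratify=y_train, random_state=0)

base_models = {
    "rf": RandomForestClassifier(n_estimators=200, random_state=0),
    "svm": make_pipeline(StandardScaler(),
                         SVC(probability=True, random_state=0)),
    "gbdt": GradientBoostingClassifier(random_state=0),  # stand-in for LightGBM/CatBoost
    "mlp": make_pipeline(StandardScaler(),
                         MLPClassifier(hidden_layer_sizes=(32, 16),
                                       max_iter=1000, random_state=0)),  # stand-in for DNN
}
for model in base_models.values():
    model.fit(X_base, y_base)


def stack_probs(models, features):
    """Return one column of positive-class probabilities per base model."""
    return np.column_stack(
        [m.predict_proba(features)[:, 1] for m in models.values()])


meta = LogisticRegression().fit(stack_probs(base_models, X_hold), y_hold)

test_probs = stack_probs(base_models, X_test)
blend_scores = meta.predict_proba(test_probs)[:, 1]  # Blending output
soft_vote_scores = test_probs.mean(axis=1)           # soft voting (equal weights)

auc_blend = roc_auc_score(y_test, blend_scores)
auc_vote = roc_auc_score(y_test, soft_vote_scores)
print(f"Blending AUC: {auc_blend:.4f}  Soft-voting AUC: {auc_vote:.4f}")
```

Replacing the equal weights in `test_probs.mean(axis=1)` with `np.average(test_probs, axis=1, weights=w)` gives the weighted-averaging variant, where `w` would typically be chosen from validation performance. The AUC values printed here come from synthetic data and are not comparable to the 0.9189 reported in the paper.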

KEYWORDS

Ensemble learning; Feature engineering; Student depression prediction; LDA dimensionality reduction; Intelligent psychological screening

CITE THIS PAPER

YuHao Yan, LinLu Chen, JingNing Huang. An intelligent prediction method for student depression risk integrating ensemble learning and feature engineering. World Journal of Information Technology. 2025, 3(5): 26-33. DOI: https://doi.org/10.61784/wjit3064.
