Open Access

A REVIEW OF THE APPLICATION OF BERT MODEL IN TEXT CATEGORIZATION


Volume 3, Issue 2, pp. 10-15, 2025

DOI: https://doi.org/10.61784/wjit3028

Author(s)

Min Zou¹*, ZhongPing Wang²

Affiliation(s)

¹School of Cyberspace Security, Hubei University, Wuhan 430062, Hubei, China.

²School of Computer Science, Hubei University, Wuhan 430062, Hubei, China.

Corresponding Author

Min Zou

ABSTRACT

With the explosive growth of information on the Internet, efficiently and accurately processing and categorizing large amounts of text data has become a key issue. The Transformer model currently shows excellent performance on natural language processing tasks and is widely used; the BERT model derived from it also achieves excellent results and has become an important tool in the field of natural language processing. This paper explores the application of deep models, namely RNN (Recurrent Neural Network), CNN (Convolutional Neural Network), AVG (Average Word Embedding), and BERT (Bidirectional Encoder Representations from Transformers), to Chinese news text categorization. It also surveys the state of research on text classification with deep models in recent years: first, it describes the BERT training process; second, it introduces the specific use of the BERT model for Chinese news classification; finally, it summarizes the paper and outlines future research and development trends for the BERT model in the Chinese news domain.
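
To make the fine-tuning step described in the abstract concrete, below is a minimal sketch of BERT-based Chinese news classification, assuming the Hugging Face transformers library, PyTorch, and the public bert-base-chinese checkpoint; the category labels and the sample headline are hypothetical illustrations, not the authors' experimental setup.

# Minimal fine-tuning sketch: BERT for Chinese news classification.
# Assumes: pip install torch transformers (checkpoint: bert-base-chinese).
import torch
from transformers import BertTokenizer, BertForSequenceClassification

labels = ["finance", "sports", "technology"]  # hypothetical category set
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-chinese", num_labels=len(labels)
)

# Tokenize one (hypothetical) headline; the tokenizer adds [CLS] and [SEP].
inputs = tokenizer("央行宣布下调存款准备金率", return_tensors="pt",
                   truncation=True, max_length=128)

# One supervised step: the [CLS] representation feeds a linear classifier.
target = torch.tensor([0])  # gold label index, e.g. "finance"
outputs = model(**inputs, labels=target)
outputs.loss.backward()  # in practice, run inside an optimizer loop over batches

# Inference: pick the highest-scoring category.
model.eval()
with torch.no_grad():
    logits = model(**inputs).logits
print(labels[logits.argmax(dim=-1).item()])

Only the small classification head on top of the pre-trained encoder is initialized from scratch; during fine-tuning its weights are updated jointly with the encoder's, which is what distinguishes this approach from feature-based methods that freeze the pre-trained representations.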

KEYWORDS

BERT model; Text categorization; Pre-training; Review

CITE THIS PAPER

Min Zou, ZhongPing Wang. A review of the application of BERT model in text categorization. World Journal of Information Technology. 2025, 3(2): 10-15. DOI: https://doi.org/10.61784/wjit3028.


All published work is licensed under a Creative Commons Attribution 4.0 International License.