Science, Technology, Engineering and Mathematics.
Open Access

ENHANCING NAMED ENTITY RECOGNITION VIA TEST-TIME SCALING MODEL


Volume 7, Issue 2, pp. 12-17, 2025

DOI: https://doi.org/10.61784/jcsee3042

Author(s)

JiaYi Ning*, YiLin Cai, AiLing Hou

Affiliation(s)

Faculty of Science and Technology, Beijing Normal University & Hong Kong Baptist University United International College, Zhuhai 519088, Guangdong, China.

Corresponding Author

JiaYi Ning

ABSTRACT

This paper addresses the challenge of Named Entity Recognition (NER) with large language models (LLMs) in zero-shot and few-shot settings. While LLMs demonstrate promising capabilities, they often generate hallucinations (spurious or inaccurate outputs) that hinder reliable performance. To overcome this limitation, we propose a chain-of-thought test-time scaling approach in which the model explicitly reasons through an inferred thought process before outputting final entity labels. We evaluate our method on the CoNLL-2003 and FewNERD benchmarks, demonstrating consistent gains over strong baseline models and improving zero-shot NER F1 on FewNERD from 0.45 to 0.55. Our findings suggest that explicitly structured reasoning significantly mitigates hallucinations and enhances label precision, even without extensive task-specific fine-tuning. This work provides a blueprint for scaling and refining NER in resource-constrained scenarios and paves the way for broader applications of reasoning-based LLM strategies to complex information extraction tasks.
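The chain-of-thought scheme described in the abstract can be sketched as a prompting-and-parsing pipeline: the model is asked to reason step by step about candidate spans, then emit its final entities on a designated line, and only that final line is parsed into labels. The sketch below is a minimal illustration under assumed conventions (the `Entities:` output format, `build_prompt`, and `parse_entities` are hypothetical names, not the paper's actual prompts or code):

```python
import re

# Hypothetical prompt template: the model must reason first, then emit
# its final answer on one line of the form "Entities: [span](TYPE), ...".
COT_NER_PROMPT = (
    "Identify the named entities in the sentence below.\n"
    "First think step by step about each candidate span, then output\n"
    "a final line of the form: Entities: [span1](TYPE1), [span2](TYPE2)\n\n"
    "Sentence: {sentence}\n"
)

def build_prompt(sentence: str) -> str:
    """Fill the chain-of-thought NER template with a target sentence."""
    return COT_NER_PROMPT.format(sentence=sentence)

def parse_entities(model_output: str) -> list[tuple[str, str]]:
    """Scan from the end for the final 'Entities:' line, discarding the
    reasoning trace, and extract (span, type) pairs. Separating the
    free-form reasoning from the structured answer is what keeps
    hallucinated intermediate text out of the predicted labels."""
    for line in reversed(model_output.strip().splitlines()):
        if line.startswith("Entities:"):
            return re.findall(r"\[([^\]]+)\]\(([A-Z\-]+)\)", line)
    return []

# Mock model response containing a reasoning trace before the answer:
mock_response = (
    "'U.N.' is an organization; 'Ekeus' looks like a person name.\n"
    "Entities: [U.N.](ORG), [Ekeus](PER)"
)
print(parse_entities(mock_response))  # [('U.N.', 'ORG'), ('Ekeus', 'PER')]
```

In a real evaluation, `build_prompt` would feed an LLM and the parsed pairs would be scored against gold spans with entity-level F1; the sketch only shows the structural separation between reasoning and final labels.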

KEYWORDS

Named entity recognition; Test-time scaling; Large language model; Zero-shot

CITE THIS PAPER

JiaYi Ning, YiLin Cai, AiLing Hou. Enhancing named entity recognition via test-time scaling model. Journal of Computer Science and Electrical Engineering. 2025, 7(2): 12-17. DOI: https://doi.org/10.61784/jcsee3042.


All published work is licensed under a Creative Commons Attribution 4.0 International License.