MACHINE LEARNING-ENHANCED TEXT ANALYTICS FOR EFFICIENT AUDIT DOCUMENTATION REVIEW
Volume 2, Issue 3, Pp 55-62, 2025
DOI: https://doi.org/10.61784/jtfe3056
Author(s)
Matthew Reid, Julia Stone, Paul Whittaker*
Affiliation(s)
School of Business, University of Central Lancashire, Lancashire, UK.
Corresponding Author
Paul Whittaker
ABSTRACT
Audit documentation review is a critical yet time-intensive component of financial auditing, requiring extensive manual analysis of textual evidence, supporting documents, and work papers. Traditional review methods rely heavily on manual examination and keyword-based searches, which leads to inconsistent coverage, potential oversight of critical issues, and heavy demands on audit resources.
This study proposes a machine learning-enhanced text analytics framework designed to automate and improve the efficiency of audit documentation review processes. The framework integrates Natural Language Processing (NLP) techniques with supervised learning algorithms to automatically classify, prioritize, and extract relevant information from audit documentation. Advanced text mining capabilities enable the identification of risk indicators, compliance issues, and anomalous patterns within large volumes of textual audit evidence.
Experimental validation using real-world audit documentation datasets demonstrates that the proposed framework achieves 91.4% accuracy in document classification and reduces manual review time by 68%. The system successfully identifies high-risk documentation requiring detailed examination while automating the processing of routine audit materials. Implementation results show significant improvements in audit efficiency, consistency, and coverage, supporting enhanced audit quality and regulatory compliance.
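As an illustration of the classify-and-prioritize idea described in the abstract, the following is a minimal sketch only, not the authors' framework. It assumes a multinomial Naive Bayes classifier with Laplace smoothing as a stand-in supervised learner; the class name, the two labels ("high_risk", "routine"), and all example texts are hypothetical and invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with Laplace smoothing (illustrative stand-in)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_doc_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_doc_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict_proba(self, text):
        """Return a dict mapping each label to its posterior probability."""
        total_docs = sum(self.class_doc_counts.values())
        tokens = [t for t in tokenize(text) if t in self.vocab]
        log_scores = {}
        for label, n_docs in self.class_doc_counts.items():
            denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + self.alpha) / denom)
            log_scores[label] = score
        # Normalize log scores into probabilities in a numerically stable way.
        peak = max(log_scores.values())
        exp_scores = {k: math.exp(v - peak) for k, v in log_scores.items()}
        norm = sum(exp_scores.values())
        return {k: v / norm for k, v in exp_scores.items()}


# Hypothetical labeled work-paper excerpts (invented for this sketch).
train_texts = [
    "unexplained variance in revenue recognition without supporting invoice",
    "management override of controls and missing approval signature",
    "unreconciled difference in cash balance and missing bank confirmation",
    "bank reconciliation agreed to statement with no exceptions noted",
    "invoice matched to purchase order and approval obtained",
    "fixed asset register agreed to ledger with no exceptions",
]
train_labels = ["high_risk"] * 3 + ["routine"] * 3

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)

# Unreviewed documents are ranked so the likeliest high-risk items come first.
queue = [
    "purchase order matched to invoice with no exceptions",
    "missing approval for unexplained revenue variance",
]
ranked = sorted(
    ((doc, model.predict_proba(doc)["high_risk"]) for doc in queue),
    key=lambda pair: pair[1],
    reverse=True,
)
for doc, p_high in ranked:
    print(f"{p_high:.2f}  {doc}")
```

In this sketch the ranking step is what separates documents needing detailed examination from routine material: a reviewer would work from the top of the list, and a threshold on the high-risk probability (a design choice not specified in the abstract) could decide which items are processed automatically.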
KEYWORDS
Audit documentation; Text analytics; Machine learning; Natural Language Processing (NLP); Audit efficiency; Risk assessment; Document classification; Audit automation
CITE THIS PAPER
Matthew Reid, Julia Stone, Paul Whittaker. Machine learning-enhanced text analytics for efficient audit documentation review. Journal of Trends in Financial and Economics. 2025, 2(3): 55-62. DOI: https://doi.org/10.61784/jtfe3056.