Science, Technology, Engineering and Mathematics.
Open Access

SCENARIO CLASSIFICATION DETECTION MODEL FOR SPATIO-TEMPORAL CONTEXTUAL INFORMATION PERCEPTION IN CLASSROOM SETTINGS

Volume 3, Issue 5, Pp 7-15, 2025

DOI: https://doi.org/10.61784/wjit3061

Author(s)

Jin Lu1, Ji Li2*

Affiliation(s)

1Guangdong Key Laboratory of Big Data Intelligence for Vocational Education, Shenzhen Polytechnic University, Shenzhen 518000, Guangdong, China.

2Research Management Office, Shenzhen Polytechnic University, Shenzhen 518000, Guangdong, China.

Corresponding Author

Ji Li

ABSTRACT

This paper proposes a spatio-temporal context-aware scene classification detection model tailored for classroom settings, aiming to address detection accuracy limitations arising from complex classroom environments characterised by fluctuating lighting, frequent occlusions, and the difficulty in capturing small-scale behaviours. By integrating cross-scale attention mechanisms in the spatial domain with long-term dependency modelling in the temporal domain, the model effectively captures subtle behavioural features and spatio-temporal contextual relationships between actions. Experimental results on the SCB-Dataset3 and Classroom-Actions public classroom datasets demonstrate that the proposed model achieves 85.4% scene classification accuracy and 83.2% action detection rate, representing significant improvements over mainstream methods such as YOLOv8m, CSSA-YOLO, and TACNet. Ablation studies further validate the effectiveness of each component: the spatial attention module yields a 2.1% mAP improvement, the temporal context module contributes a 4.5% mAP gain, while the scene context module delivers an additional 2.2% performance enhancement. Maintaining real-time processing speed (68.2 FPS), this model effectively addresses multi-scale detection and temporal dependency modelling challenges in classroom scenarios, providing robust technical support for smart education.
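To put the reported figures in perspective, the short sketch below converts the quoted throughput (68.2 FPS) into an average per-frame time budget and adds up the three ablation gains quoted above. It is a back-of-the-envelope illustration, not the authors' evaluation code. The function names and the 30 FPS camera rate are assumptions introduced here, as is the premise that the three mAP gains are cumulative.

```python
# Figures quoted in the abstract.
REPORTED_FPS = 68.2
ABLATION_GAINS_MAP = {
    "spatial attention": 2.1,
    "temporal context": 4.5,
    "scene context": 2.2,
}

# Assumption (not from the paper): a typical classroom camera rate.
ASSUMED_CAMERA_FPS = 30.0


def frame_latency_ms(fps: float) -> float:
    """Average processing time per frame, in milliseconds."""
    return 1000.0 / fps


def realtime_headroom(model_fps: float, camera_fps: float) -> float:
    """How many times faster the model runs than the camera delivers frames."""
    return model_fps / camera_fps


latency = frame_latency_ms(REPORTED_FPS)
headroom = realtime_headroom(REPORTED_FPS, ASSUMED_CAMERA_FPS)

# Assumption: the gains stack additively, which the abstract suggests
# ("an additional 2.2%") but does not state outright.
total_gain = round(sum(ABLATION_GAINS_MAP.values()), 1)

print(f"Per-frame budget: {latency:.2f} ms")
print(f"Headroom over a {ASSUMED_CAMERA_FPS:.0f} FPS camera: {headroom:.2f}x")
print(f"Combined ablation gain (if additive): {total_gain} mAP points")
```

Under these assumptions the model has roughly 14.7 ms per frame, a little over twice the rate a 30 FPS stream would require.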

KEYWORDS

Classroom behaviour recognition; Spatio-temporal context; Attention mechanisms; Scene classification; Deep learning

CITE THIS PAPER

Jin Lu, Ji Li. Scenario classification detection model for spatio-temporal contextual information perception in classroom settings. World Journal of Information Technology. 2025, 3(5): 7-15. DOI: https://doi.org/10.61784/wjit3061.

All published work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright © 2017 - 2025 Science, Technology, Engineering and Mathematics. All Rights Reserved.