APPLICATION OF REINFORCEMENT LEARNING IN AUTONOMOUS DRIVING SCENARIOS: PATH PLANNING USING POLICY GRADIENT METHODS-Upubscience Publisher

Volume 3, Issue 3, Pp 24-30, 2025

DOI: https://doi.org/10.61784/wjer3035

Author(s)

JunRan Wu

Affiliation(s)

School of Mathematical, Chengdu University of Technology, Yibin 644000, Sichuan, China.

Corresponding Author

JunRan Wu

ABSTRACT

This study explores reinforcement learning-based autonomous path planning in China's mixed traffic flow, using a policy gradient approach to optimize decision-making through interaction with the environment. China's traffic environment, which mixes vehicles, pedestrians, and non-motorized traffic, presents distinct challenges. To address them, the state space incorporates real-time traffic light status and the positions and orientations of surrounding vehicles. The action space covers going straight, turning left, turning right, and stopping. A "time-saving and violation-avoidance" reward function is designed, transforming the path planning task into a sequential decision optimization problem. The policy is parameterized by a three-layer neural network and learns through repeated trials. Rewards fluctuate early in training and then stabilize; the trained policy achieves a 100% success rate with an average of 18.2 steps, successfully realizing shortest-path selection. However, the negative average reward (-1.22) indicates that the current reward function may be biased toward excessive penalties. The results validate the policy gradient method for intelligent transportation while underscoring the need to refine the reward function to balance negative penalties and positive incentives. The framework offers methodological guidance for autonomous driving in complex scenarios, but further optimization of the reward mechanism remains essential for practical implementation.
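
The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the reward constants (STEP_PENALTY, VIOLATION_PENALTY, GOAL_REWARD), the discount factor, and all function names are assumptions chosen for illustration, and the three-layer policy network is replaced here by raw action logits so that the example stays self-contained. It shows a per-step "time-saving and violation-avoidance" reward, the discounted return used to weight each step, and the gradient of the log-probability of a chosen action under a softmax policy, which is the core term of a REINFORCE-style policy gradient update.

```python
import math
from enum import Enum


class Action(Enum):
    """The four discrete actions listed in the abstract."""
    STRAIGHT = 0
    LEFT = 1
    RIGHT = 2
    STOP = 3


# Hypothetical reward constants; the abstract does not give the actual values.
STEP_PENALTY = -0.1       # charged every step, so shorter paths score higher
VIOLATION_PENALTY = -1.0  # charged for running a red light or colliding
GOAL_REWARD = 1.0         # granted once when the destination is reached


def reward(reached_goal, ran_red_light, collided):
    """Per-step reward combining time saving and violation avoidance."""
    r = STEP_PENALTY
    if ran_red_light or collided:
        r += VIOLATION_PENALTY
    if reached_goal:
        r += GOAL_REWARD
    return r


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def softmax(logits):
    """Numerically stable softmax over a list of action logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def grad_log_softmax(logits, action):
    """Gradient of log pi(action) w.r.t. the logits: one_hot(action) - pi."""
    probs = softmax(logits)
    return [(1.0 if i == action.value else 0.0) - p
            for i, p in enumerate(probs)]


# A violation-free 18-step episode that reaches the goal on the last step.
episode = [reward(False, False, False)] * 17 + [reward(True, False, False)]
episode_total = sum(episode)
```

Under these assumed constants, a fully successful 18-step episode still totals about -0.8, because the accumulated step penalties outweigh the single goal bonus. This is one way a 100% success rate can coexist with a negative average reward, and it illustrates why rebalancing penalties against incentives may be needed.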

KEYWORDS

Autonomous driving; Policy gradient methods; Reward function; Policy network

CITE THIS PAPER

JunRan Wu. Application of reinforcement learning in autonomous driving scenarios: path planning using policy gradient methods. World Journal of Engineering Research. 2025, 3(3): 24-30. DOI: https://doi.org/10.61784/wjer3035
