Comparative Analysis of Loss Functions in TD3 for Autonomous Parking
Keywords:
Autonomous Parking, Reinforcement Learning

Abstract
Autonomous parking is a transformative technology in the automotive industry, driven by the rise of deep reinforcement learning and, in particular, the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. Nonetheless, the robustness of TD3 remains a significant challenge due to bias in its Q-value estimates, which measure how good an action A taken in a particular state S is. To investigate this gap, this paper analyzes different loss functions in TD3 to better approximate the true Q-value, which is necessary for optimal decision making. Three loss functions are evaluated: Mean Squared Error (MSE), Mean Absolute Error (MAE), and Huber loss, compared via a simulated autonomous parking experiment. The results show that TD3 with Huber loss achieves the fastest convergence of both the Actor and Critic losses. Huber loss is found to be more robust and efficient than either MSE or MAE used in isolation, making it a suitable replacement for the existing loss function in the TD3 algorithm. In future work, TD3 with Huber loss will serve as the base model for addressing the overestimation problem in TD3, in which the estimated Q-values, representing the expected rewards of taking an action in a particular state, are higher than their true values.
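As an illustration of the comparison the abstract describes, the sketch below computes the TD3 critic target (the clipped double-Q minimum over the twin critics) and evaluates the TD error under MSE, MAE, and Huber loss. All numbers are hypothetical, not taken from the paper's experiments, and the implementation is a minimal NumPy sketch rather than the authors' code.

```python
import numpy as np

# Hypothetical Q-value estimates for a small batch of (state, action)
# pairs; these values are illustrative only.
rewards = np.array([1.0, -0.5, 0.2])
q1_next = np.array([2.0, 1.5, 0.8])   # target critic 1 at next state
q2_next = np.array([1.8, 1.9, 0.6])   # target critic 2 at next state
q_pred = np.array([2.5, 0.9, 1.1])    # current critic's estimate
gamma = 0.99                          # discount factor

# TD3 target: clipped double-Q takes the minimum of the twin critics
# to curb overestimation of the expected return.
target = rewards + gamma * np.minimum(q1_next, q2_next)
td_error = q_pred - target

def mse(e):
    return np.mean(e ** 2)

def mae(e):
    return np.mean(np.abs(e))

def huber(e, delta=1.0):
    # Quadratic near zero, linear beyond delta: smooth gradients for
    # small errors (like MSE), robustness to outliers (like MAE).
    quad = np.minimum(np.abs(e), delta)
    lin = np.abs(e) - quad
    return np.mean(0.5 * quad ** 2 + delta * lin)

print(f"MSE:   {mse(td_error):.4f}")
print(f"MAE:   {mae(td_error):.4f}")
print(f"Huber: {huber(td_error):.4f}")
```

When every TD error is smaller than `delta`, Huber loss reduces to half the MSE; only large errors are penalized linearly, which is what makes it less sensitive to outlier transitions than MSE alone.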
License
Copyright (c) 2024 Journal of Soft Computing and Data Mining
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.