Adaptive Production Capacity Planning Under Variable Electricity Cost Using Deep Reinforcement Learning
Keywords:
Deep reinforcement learning, production capacity planning, variable electricity cost, asynchronous advantage actor-critic, proximal policy optimization
Abstract
Reinforcement learning is gaining traction for its ability to solve complex tasks that are intractable for other machine learning techniques. This paper proposes a novel approximation technique for production capacity and inventory planning using deep reinforcement learning (DRL). To address practical implementation challenges, we incorporate demand uncertainty and time-of-use electricity price-driven demand response patterns (PDDR) into the model. We compare the performance of two DRL techniques, asynchronous advantage actor-critic (A3C) and proximal policy optimization (PPO), in learning to optimize production planning over time so as to minimize total cost. A discrete-time mixed-integer linear program (MILP) with new changeover constraint equations was formulated, and its optimal solution serves as an upper benchmark. Our results show that PPO outperforms both A3C and expert heuristics, achieving an optimality gap of 4.03% relative to the MILP solution while running 2,502 times faster. Furthermore, our findings suggest that PPO is more robust to demand fluctuations than A3C because its objective clipping mechanism stabilizes policy updates. This makes our PPO-based production planning model a promising candidate for real-world applications where demand fluctuations are common.
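The robustness attributed to PPO in the abstract comes from its clipped surrogate objective, which limits how far a single update can move the policy. The following is a minimal illustrative sketch of that clipping mechanism, not the authors' implementation; the function name and the use of NumPy arrays for the probability ratios and advantage estimates are assumptions for the example.

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Clipped surrogate objective of PPO.

    ratio:     pi_new(a|s) / pi_old(a|s) for the sampled actions
    advantage: advantage estimates for those actions
    eps:       clip range; moving the ratio outside [1-eps, 1+eps]
               yields no additional objective value, which damps
               large policy updates under noisy signals such as
               fluctuating demand.
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Taking the elementwise minimum gives a pessimistic bound,
    # so only conservative updates are rewarded.
    return np.minimum(unclipped, clipped).mean()
```

For example, with a positive advantage of 1.0 and a ratio of 1.5, the objective is capped at 1.2 rather than 1.5, so the gradient incentive to push the ratio further vanishes beyond the clip boundary.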
License
Copyright (c) 2025 International Journal of Integrated Engineering

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.