A Deep Reinforcement Learning Based on Spatio-Temporal Model for Solving Weapon-Target Assignment
Zimo Zhu, Chuanqiang Yu, Junti Wang
Article
2026 / Volume 9 / Pages 4012-4033
Published 25 April 2026
Abstract
The weapon-target assignment (WTA) problem plays a pivotal role in modern combat command and control systems because it directly influences operational effectiveness. Conventional solution methods, such as exact algorithms and heuristic approaches, often perform poorly in dynamic battlefield environments due to their limited ability to capture evolving engagement states. Deep reinforcement learning (DRL) offers a promising paradigm for sequential decision-making under such conditions; however, existing DRL-based WTA methods frequently overlook the spatio-temporal dependencies among combat entities, thereby constraining their adaptability and decision accuracy. To address this issue, this paper proposes a deep reinforcement learning with spatio-temporal modeling (DRLSTM) framework for dynamic WTA. Specifically, the framework integrates a graph convolutional network (GCN) to encode inter-entity spatial dependencies and a gated recurrent unit (GRU) to model temporal state evolution. This spatio-temporal architecture enables the agent to learn context-aware assignment policies from dynamic battlefield information. The policy is optimized using an actor-critic learning scheme. Experimental results demonstrate that the proposed method outperforms conventional exact, heuristic, and representative DRL-based methods in both solution quality and computational efficiency. These results verify the effectiveness of the proposed framework for dynamic resource allocation and decision-making in complex battlefield environments.
Keywords
weapon-target assignment, deep reinforcement learning, spatio-temporal model, optimization methods
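Illustrative sketch
The abstract describes a pipeline in which a GCN encodes spatial dependencies among combat entities, a GRU tracks how each entity's state evolves over time, and an actor-critic scheme produces assignment decisions. The following NumPy sketch shows one possible forward pass through such a pipeline. It is not the authors' implementation: the function names, the layer sizes, the bipartite weapon-target graph, the bilinear actor head, and the mean-pooled critic are all assumptions made for illustration, and training (the actor-critic update itself) is omitted.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} of a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])           # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, x, w):
    """One graph convolution with ReLU: aggregates features of neighboring entities."""
    return np.maximum(a_norm @ x @ w, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_cell(x, h, p):
    """One GRU update per entity (biases omitted for brevity)."""
    z = sigmoid(x @ p["Wz"] + h @ p["Uz"])               # update gate
    r = sigmoid(x @ p["Wr"] + h @ p["Ur"])               # reset gate
    h_tilde = np.tanh(x @ p["Wh"] + (r * h) @ p["Uh"])   # candidate state
    return (1.0 - z) * h + z * h_tilde

def init_params(feat_dim, hid_dim, rng):
    """Random weights; all shapes are hypothetical choices for this sketch."""
    p = {"W_gcn": rng.normal(0, 0.1, (feat_dim, hid_dim))}
    for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh", "W_actor"):
        p[name] = rng.normal(0, 0.1, (hid_dim, hid_dim))
    p["w_critic"] = rng.normal(0, 0.1, hid_dim)
    return p

def step(adj, feats, hidden, params, n_weapons):
    """One decision step: spatial encoding, temporal update, actor and critic heads.

    Rows 0..n_weapons-1 of feats/hidden are weapons; the remaining rows are targets.
    """
    spatial = gcn_layer(normalized_adjacency(adj), feats, params["W_gcn"])
    hidden = gru_cell(spatial, hidden, params)
    h_w, h_t = hidden[:n_weapons], hidden[n_weapons:]
    # Actor (assumed bilinear scoring): one softmax over targets per weapon.
    logits = h_w @ params["W_actor"] @ h_t.T
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    # Critic (assumed mean pooling): scalar value estimate of the current state.
    value = float(hidden.mean(axis=0) @ params["w_critic"])
    return probs, value, hidden

def rollout(n_weapons=3, n_targets=4, feat_dim=5, hid_dim=8, steps=5, seed=0):
    """Run the sketch on random features over a few time steps."""
    rng = np.random.default_rng(seed)
    n = n_weapons + n_targets
    adj = np.zeros((n, n))
    adj[:n_weapons, n_weapons:] = 1.0   # every weapon can engage every target
    adj[n_weapons:, :n_weapons] = 1.0
    params = init_params(feat_dim, hid_dim, rng)
    hidden = np.zeros((n, hid_dim))
    for _ in range(steps):
        feats = rng.normal(size=(n, feat_dim))  # stand-in for battlefield observations
        probs, value, hidden = step(adj, feats, hidden, params, n_weapons)
    return probs, value, hidden
```

Calling rollout() returns a weapons-by-targets matrix of assignment probabilities, a scalar value estimate, and the final per-entity hidden states. In a trained agent the actor's probabilities would be sampled or maximized to pick assignments, and the critic's value would serve as the baseline in the policy-gradient update.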