Abstract

This paper considers a UAV swarm engagement scenario in which a defender swarm cooperatively intercepts an attacker swarm to prevent it from entering the target area. Unlike previous attack–defense strategies based on deep reinforcement learning, we remove the assumption that the swarms have perfect knowledge of the situation of both sides and of the environment. On this basis, we propose a double experience pool strategic interaction Q-learning (DEP-SIQ) swarm attack and defense algorithm, which reduces the dimension of the network input and lets UAVs with the same task share the same network. The algorithm maintains separate experience pools for the attacker and the defender. During training, each side samples from its own experience pool to train its own network. The estimated reward of each UAV is decomposed into the sum of its interaction reward values with other friendly UAVs, which makes the approach suitable for large-scale swarms. Simulation experiments show the feasibility of the proposed algorithm. Compared with other algorithms, the proposed DEP-SIQ algorithm has higher learning efficiency, a higher win rate and better applicability.
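The two structural ideas in the abstract, one experience pool per side and a per-UAV value written as a sum of pairwise interaction terms, can be illustrated with a minimal Python sketch. This is not the authors' implementation: `ExperiencePool`, `decomposed_value`, `greedy_action` and `toy_pair_q` are hypothetical names, observations are reduced to 1-D positions, and `toy_pair_q` is a hand-written stand-in for the shared network that the paper trains.

```python
import random
from collections import deque


class ExperiencePool:
    """Fixed-capacity replay buffer for ONE side (attacker or defender)."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are dropped first

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Sample without replacement, capped at the number of stored transitions.
        items = list(self.buffer)
        return random.sample(items, min(batch_size, len(items)))

    def __len__(self):
        return len(self.buffer)


def toy_pair_q(own_obs, other_obs, action):
    """Hypothetical pairwise interaction value (assumption, not from the paper).

    Rewards moving closer to a friendly UAV on a 1-D line. A trained network
    shared by all UAVs with the same task would take this place.
    """
    return -abs((own_obs + action) - other_obs)


def decomposed_value(own_obs, friendly_obs, action, pair_q):
    """Estimate a UAV's value as the sum of its pairwise interaction values.

    The input to pair_q has a fixed size regardless of how many friendly
    UAVs there are, which is what keeps the scheme usable for large swarms.
    """
    return sum(pair_q(own_obs, other, action) for other in friendly_obs)


def greedy_action(own_obs, friendly_obs, actions, pair_q):
    """Pick the action with the largest decomposed value."""
    return max(actions, key=lambda a: decomposed_value(own_obs, friendly_obs, a, pair_q))


# Separate pools: each side stores and samples only its own transitions.
pools = {"attacker": ExperiencePool(1000), "defender": ExperiencePool(1000)}
for step in range(5):
    pools["attacker"].add(("attacker", step))
    pools["defender"].add(("defender", step))

defender_batch = pools["defender"].sample(3)
best = greedy_action(0.0, [1.0, 3.0], (-1, 0, 1), toy_pair_q)
```

Because each pairwise term only sees one friendly UAV at a time, adding UAVs to the swarm lengthens the sum but does not change the input dimension of the value function, which is the property the abstract attributes to the decomposition.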

Data availability

All relevant data are within the paper.

Funding

This work was supported in part by the Aeronautical Science Foundation of China under Grant 2020Z023053001.

Author information

Authors and Affiliations

School of Electronics and Information, Northwestern Polytechnical University, Xi’an, China
Xiaowei Fu, Zhe Qiao & Zhe Xu

Corresponding author

Correspondence to Xiaowei Fu.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Fu, X., Qiao, Z. & Xu, Z. Attack–defense strategy of UAV swarm based on DEP-SIQ in the active target defense scenario. Soft Comput 28, 10463–10473 (2024). https://doi.org/10.1007/s00500-024-09826-5

Accepted: 12 March 2024
Published: 20 July 2024
Issue Date: September 2024
DOI: https://doi.org/10.1007/s00500-024-09826-5
