LAPSE:2023.9509
Published Article

Multiagent Reinforcement Learning Based on Fusion-Multiactor-Attention-Critic for Multiple-Unmanned-Aerial-Vehicle Navigation Control
February 27, 2023
Abstract
The proliferation of unmanned aerial vehicles (UAVs) has spawned a variety of intelligent services, in which efficient coordination plays a significant role in increasing the effectiveness of cooperative execution. However, due to the limited operational time and range of UAVs, achieving highly efficient coordinated actions is difficult, particularly in unknown dynamic environments. This paper proposes a multiagent deep reinforcement learning (MADRL)-based fusion-multiactor-attention-critic (F-MAAC) model for the energy-efficient cooperative navigation control of multiple UAVs. The proposed model is built on the multiactor-attention-critic (MAAC) model and offers two significant advances over it. The first is a sensor fusion layer, which enables the actor network to utilize all required sensor information effectively. The second is a layer that computes the dissimilarity weights of different agents, added to compensate for the information lost through the attention layer of the MAAC model. We use the UAV LDS (logistic delivery service) environment, created with the Unity engine, to train the proposed model and verify its energy efficiency. A feature that measures the total distance traveled by the UAVs is incorporated into the UAV LDS environment to validate energy efficiency. To demonstrate the performance of the proposed model, the F-MAAC model is compared with several conventional reinforcement learning models in two use cases. First, we compare the F-MAAC model to the DDPG, MADDPG, and MAAC models based on the mean episode rewards over 20k training episodes. The two top-performing models (F-MAAC and MAAC) are then chosen and retrained for 150k episodes. Our study uses the total number of deliveries completed within the same period, and within the same distance traveled, to represent energy efficiency. According to our simulation results, the F-MAAC model outperforms the MAAC model, making 38% more deliveries in 3000 time steps and 30% more deliveries per 1000 m of distance traveled.
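The dissimilarity-weighting idea in the abstract can be illustrated with a small numerical sketch: an attention critic weights other agents' encodings by similarity to a query, while a parallel set of weights favors the agents whose encodings differ most, so that information suppressed by attention is not lost entirely. The formulation below is a minimal illustration only, not the paper's actual F-MAAC architecture; the distance-based dissimilarity measure and the blending coefficient `alpha` are assumptions made for this sketch.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_weights(query, keys):
    # scaled dot-product attention: agents similar to the query get high weight
    scores = keys @ query / np.sqrt(len(query))
    return softmax(scores)

def dissimilarity_weights(query, keys):
    # hypothetical dissimilarity measure: normalized Euclidean distance,
    # so agents most unlike the query get high weight
    d = np.linalg.norm(keys - query, axis=1)
    return d / d.sum()

def combined_value(query, keys, values, alpha=0.5):
    # blend the two weightings, then aggregate the other agents' values
    w = alpha * attention_weights(query, keys) \
        + (1 - alpha) * dissimilarity_weights(query, keys)
    return w @ values

# toy example: one agent's query against three other agents' encodings
q = np.array([1.0, 0.0])
K = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
V = np.array([[1.0], [2.0], [3.0]])
print(combined_value(q, K, V))
```

Both weightings are convex combinations (each sums to 1), so their blend is as well; the dissimilarity term simply redistributes mass toward agents that pure attention would down-weight.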
Record ID
LAPSE:2023.9509
Keywords
actor-attention-critic, air logistics, multiagent reinforcement learning, multiple UAV, sensor fusion
Subject
Suggested Citation
Jeon S, Lee H, Kaliappan VK, Nguyen TA, Jo H, Cho H, Min D. Multiagent Reinforcement Learning Based on Fusion-Multiactor-Attention-Critic for Multiple-Unmanned-Aerial-Vehicle Navigation Control. (2023). LAPSE:2023.9509
Author Affiliations
Jeon S: Department of Computer Science and Engineering, Konkuk University, Seoul 05029, Korea [ORCID]
Lee H: Department of Computer Science and Engineering, Konkuk University, Seoul 05029, Korea [ORCID]
Kaliappan VK: Konkuk Aerospace Design-Airworthiness Research Institute, Konkuk University, Seoul 05029, Korea [ORCID]
Nguyen TA: Konkuk Aerospace Design-Airworthiness Research Institute, Konkuk University, Seoul 05029, Korea [ORCID]
Jo H: Department of Computer Science and Engineering, Konkuk University, Seoul 05029, Korea
Cho H: Department of Computer Science and Engineering, Konkuk University, Seoul 05029, Korea
Min D: Department of Computer Science and Engineering, Konkuk University, Seoul 05029, Korea
Journal Name
Energies
Volume
15
Issue
19
First Page
7426
Year
2022
Publication Date
2022-10-10
ISSN
1996-1073
Version Comments
Original Submission
Other Meta
PII: en15197426, Publication Type: Journal Article
Record Map
Published Article (This Record)
LAPSE:2023.9509
External Link (Publisher Version)
https://doi.org/10.3390/en15197426
Record Statistics
Record Views
181
Version History
[v1] (Original Submission)
Feb 27, 2023
Verified by curator on
Feb 27, 2023
This Version Number
v1
Citations
URL
https://psecommunity.org/LAPSE:2023.9509
Record Owner
Auto Uploader for LAPSE
Links to Related Works
