Movatterモバイル変換


[0]ホーム

URL:


CN116880186B - A data-driven adaptive dynamic programming method for air combat decision making - Google Patents

A data-driven adaptive dynamic programming method for air combat decision making
Download PDF

Info

Publication number
CN116880186B
CN116880186BCN202310861633.0ACN202310861633ACN116880186BCN 116880186 BCN116880186 BCN 116880186BCN 202310861633 ACN202310861633 ACN 202310861633ACN 116880186 BCN116880186 BCN 116880186B
Authority
CN
China
Prior art keywords
unmanned aerial
aerial vehicle
red
party
blue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310861633.0A
Other languages
Chinese (zh)
Other versions
CN116880186A (en
Inventor
李彬
宁召柯
史明明
李清亮
陶呈纲
孙绍山
李导
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan UniversityfiledCriticalSichuan University
Priority to CN202310861633.0ApriorityCriticalpatent/CN116880186B/en
Publication of CN116880186ApublicationCriticalpatent/CN116880186A/en
Application grantedgrantedCritical
Publication of CN116880186BpublicationCriticalpatent/CN116880186B/en
Activelegal-statusCriticalCurrent
Anticipated expirationlegal-statusCritical

Links

Classifications

Landscapes

Abstract

Translated fromChinese

本发明公开了一种数据驱动的自适应动态规划空战决策方法,包括以下步骤:S1,建立无人机追逃问题系统模型;S2,采用无模型自适应动态规划求解上述无人机追逃问题;采用离线神经网络模型训练算法获得红方无人机和蓝方无人机实时控制率,并实时收集红方控制率信息和红蓝双方状态信息;S4,通过在线模型训练算法在线更新神经网络,实现红方无人机和蓝方无人机在“追踪‑逃逸”问题中的自适应动态规划的空战决策。本发明结合了离线训练和在线训练的优点,提升了本发明的在线自适应调整策略的能力。且本发明不依赖于飞行器系统模型,具有很强的泛化能力,可以推广的多个应用场景。

The present invention discloses a data-driven adaptive dynamic programming air combat decision-making method, comprising the following steps: S1, establishing a system model for the UAV pursuit and escape problem; S2, using model-free adaptive dynamic programming to solve the above-mentioned UAV pursuit and escape problem; using an offline neural network model training algorithm to obtain the real-time control rate of the red party UAV and the blue party UAV, and collecting the red party control rate information and the red and blue party status information in real time; S4, updating the neural network online through an online model training algorithm to realize the adaptive dynamic programming air combat decision of the red party UAV and the blue party UAV in the "pursuit-escape" problem. The present invention combines the advantages of offline training and online training, and improves the ability of the online adaptive adjustment strategy of the present invention. In addition, the present invention does not rely on the aircraft system model, has a strong generalization ability, and can be promoted to multiple application scenarios.

Description

Data-driven self-adaptive dynamic programming air combat decision method
Technical Field
The invention belongs to the technical field of unmanned aerial vehicles, and particularly relates to a data-driven self-adaptive dynamic planning air combat decision method.
Background
The unmanned air fighter decision is aimed at making it take advantage of or turn down in fight, and the key of research is to design an efficient autonomous decision mechanism. The autonomous decision of the unmanned flight fighter is a mechanism for making a tactical plan or selecting a flight action in real time according to the actual combat environment in the air combat, and the degree of the decision mechanism reflects the intelligent level of the unmanned flight fighter in the modern air combat. The input of the autonomous decision mechanism is various parameters related to air combat, such as flight parameters of an aircraft, weapon parameters, three-dimensional space scene parameters and the relative relationship of two parties of a friend and foe, the decision process is information processing and calculation decision process inside a system, and the output is a tactical plan or some specific flight actions of decision making.
The self-adaptive dynamic programming integrates the ideas of dynamic programming and reinforcement learning, not only inherits the advantages of a dynamic programming method, but also can overcome the problem of dimension disaster generated by dynamic programming. The principle of the self-adaptive dynamic programming is to approximate the performance function and the control strategy in the traditional dynamic programming by adopting a function approximation structure, and obtain an optimal value function and the control strategy by means of the idea of reinforcement learning so as to meet the principle of the Belman optimality. The idea of adaptive dynamic programming can be represented by fig. 1.
Air combat decision making is a complex task involving a large amount of information and variables, making conventional manually-made decision rules difficult to adapt to changing battlefield environments. Therefore, the existing air combat decision method often has the following problems:
1. The static planning method cannot cope with the dynamic environment: conventional decision methods are generally based on preset rules or models, and are difficult to adapt to the battlefield environment and dynamic enemy conditions which change in real time.
2. The manual decision requires a lot of time and effort: the decision process needs to process a large amount of information and variables, consumes a large amount of time and effort, and is also prone to omission and misjudgment.
3. Lack of comprehensive consideration and flexible strain capacity: conventional decision methods typically make decisions based on a single factor or a small number of factors, and it is difficult to comprehensively consider and flexibly strain multiple factors, which may lead to decision bias or inaccuracy.
4. Cannot accommodate the needs of informationized warfare: the modern air combat environment has large information quantity and rapid change, and the traditional method for manually making decision rules cannot adapt to the requirements of informationized war.
Disclosure of Invention
The invention aims to provide a data-driven self-adaptive dynamic planning air combat decision-making method, which mainly solves the problem that the traditional manual decision-making rule is difficult to adapt to continuously-changing battlefield environments.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
a data-driven self-adaptive dynamic programming air combat decision method comprises the following steps:
S1, assuming that the fighter unmanned aerial vehicle is a red unmanned aerial vehicle and a blue unmanned aerial vehicle; respectively establishing an unmanned aerial vehicle escape problem system model according to red side pursuit-blue Fang Taoyi and red side escape-blue side pursuit problems;
S2, solving the unmanned aerial vehicle escape problem by adopting model-free self-adaptive dynamic programming, and improving a strategy by adopting a bounded exploration signal;
s3, acquiring real-time control rates of the red unmanned aerial vehicle and the blue unmanned aerial vehicle by adopting an offline neural network model training algorithm, and collecting state information of the red unmanned aerial vehicle and the blue unmanned aerial vehicle in real time;
And S4, updating the neural network on line through an on-line model training algorithm to realize the air combat decision of the self-adaptive dynamic programming of the red unmanned aerial vehicle and the blue unmanned aerial vehicle in the tracking-escaping problem.
Further, in the invention, the method for establishing the red party pursuit-blue party escape problem model is as follows:
The real-time position of the red-side unmanned aerial vehicle is Xr (t), the position of the blue-side unmanned aerial vehicle is Xb (t), and then the position difference of the two sides is as follows:
e=Xb(t)-Xr(t) (1)
the tracking error system is:
Wherein represents the differential of the tracking error with respect to time of the two-party position difference e,/> represents the differential of the real-time position Xb (t) of the blue-party unmanned aerial vehicle with respect to time, and/> represents the differential of the real-time position Xr (t) of the red-party unmanned aerial vehicle with respect to time;
assuming that the red party chaser can only measure the three-dimensional movement speed of the blue party unmanned aerial vehicle, the expression (2) can be expressed specifically as:
The system model of the red party pursuing the blue party is expressed as:
Vr is the speed of the red unmanned aerial vehicle, mach is the unit, is the differential of the speed Vr of the red unmanned aerial vehicle with respect to time; χr is the heading angle of the red unmanned aerial vehicle, the unit is radian, and/() is the differentiation of the heading angle χr of the red unmanned aerial vehicle with respect to time; gammar is the track inclination angle of the red unmanned aerial vehicle, the unit is radian, and/() is the derivative of the track inclination angle of the red unmanned aerial vehicle with respect to time; distance error ex,ey,ez in km,/> is the derivative of distance error with respect to ex,ey,ez time; g is gravity acceleration; vc is sound velocity, and nx,ny,nz is overload control quantity of the red unmanned aerial vehicle.
Further, in the invention, the method for establishing the red party escape-blue party pursuit problem model is as follows:
The virtual displacement method is adopted to minimize the distance between the local reverse displacement and the enemy aircraft, namely, the effect of maximizing the local position and the enemy aircraft position is achieved, wherein the virtual displacement is the displacement quantity generated by the virtual displacement speed V', namely:
The system model of red-side escape-blue-side chase is expressed as:
Wherein the sign and meaning are the same as the pursuit problem.
Further, in the invention, the red party pursuit-blue party escape pursuit problem system model is processed, which comprises the following steps:
s11, the nonlinear continuous state space equation of the unmanned aerial vehicle is abbreviated as:
Wherein x= [ Vrrr,ex,ey,ez]T ] represents a red aircraft state vector, represents a differential of the red aircraft state vector x with respect to time, u= [ nx,ny,nz]T ] represents a red aircraft control vector, F (x), G (x) are respectively
S12, defining a performance index function as:
Wherein Q (x, t) is an index function related to the state, and R (u, t) is an index function related to the control amount;
s13, establishing an angle dominance function of the unmanned aerial vehicle, and setting a speed vector of the unmanned aerial vehicle of the red party as follows:
V=[cosγrcosχr,cosγrsinχr,sinγr]T
the blue-side unmanned aerial vehicle speed vector is:
Vb=[cosγbcosχb,cosγbsinχb,sinγb]T
The distance vector of the red unmanned aerial vehicle to the blue Fang Lanfang unmanned aerial vehicle is erb=[ex,ey,ez]T, and the geometric relationship is that
Obtaining an angle dominance function:
Qα=cαr+(1-c)αb (9)
wherein c= (αrb)/2pi;
s14, defining a distance dominance function as follows:
Qd=eTQ1e (10)
wherein e= [ ex,ey,ez]T, is a positive definite matrix;
The state index function of the red party can be expressed as:
Wherein Q2 is a weight coefficient;
S15, defining a controller index function as:
R(u,t)=(u-u0)TR(u-u0) (12)
Wherein is a control quantity weight coefficient, and u0=[sinγr,0,cosγr]T is a control quantity of the unmanned aerial vehicle under stable flight.
Further, the specific implementation method of the step S2 in the present invention is as follows:
Defining a bounded exploration signal ue, and rewrites a system model type (5) of the red unmanned aerial vehicle as follows:
The performance index function is:
The derivative of the performance index function (7) with respect to time is expressed as:
When the performance index is the minimum value calculated by the functional formula (7), the following Belman equation is satisfied:
Wherein R(j) = Q (x, t) +r (u, t); by combining the formula (17) and the formula (18), it is possible to obtain:
The optimal control quantity of the real system is as follows:
The formula (20) is used to solve G, and the formula (19) is used to obtain:
integrating the two ends of the formula (21) from t0 to t to obtain:
a neural network is employed to approximate the cost function and control inputs, namely:
Wherein is the ideal neural network weight for the evaluation network and the execution network, respectively; l1,L2 is the number of hidden layer neurons of the evaluation network and the execution network, respectively; the/> is the neural network activation function of the evaluation network and the execution network, respectively; the/> is the reconstruction error of the evaluation network and the execution network, respectively;
let the evaluation network and the evaluation network execute the estimated values of the network as follows:
Wherein is the estimated value of the ideal neural network weight Wc,Wa, respectively; substituting equation (24) into equation (22) yields the following residual term errors:
wherein is the control amount obtained by improving the strategy, and the expression is:
Wherein Ω is the exploration set of control quantities, obtained by adding a bounded random exploration signal, and/> is optimized/> by least squares algorithm, namely:
optimizing by the least squares algorithm is:
Further, in step S3 in the present invention, the offline neural network model training algorithm includes the steps of:
S31: by giving different initial states, a data set { sk(t0) } can be obtained, and initialized
S32: obtaining control quantity corresponding to the state, namely data set according to the formula (26)
S33: update get/> update get/> according to equation (28) using dataset update according to equation (27)
S34: terminating the algorithm if or/>; otherwise j=j+1, step S32 is skipped, where ea、∈c is convergence accuracy.
Further, in step S4, the step of online updating the neural network by the online model training algorithm is as follows:
S41: the current neural network separation weight is Wc,Wa, the online learning rate is alpha, a real-time data set { x (t), u (t) } is obtained by sampling at a fixed time interval δt, and a step S42 is carried out after a plurality of groups of data are acquired;
S42: obtaining control quantity corresponding to the state, namely data set according to the formula (26)
S43: calculation according to equation (27) using dataset/> calculation according to equation (28)
S44: the online updating of the neural network weights,, jumps to step S41.
Drawings
Fig. 1 is a diagram of a prior art adaptive dynamic programming architecture.
FIG. 2 is a flow chart of the present invention.
Fig. 3 is a schematic view of the angular advantage of the unmanned aerial vehicle according to the embodiment of the present invention.
Fig. 4 is a schematic diagram of virtual displacement principle in an embodiment of the present invention.
Detailed Description
The invention will be further illustrated by the following description and examples, which include but are not limited to the following examples.
As shown in fig. 2, in the data-driven adaptive dynamic programming air combat decision method disclosed by the invention, in a tracking-escaping scene, one tracker and one escaper exist, and in this embodiment, both sides of the tracking and escaping are represented by both red and blue sides. This problem is described herein in terms of red party chase, blue party escape, and red party unmanned aerial vehicle reduces the distance with blue party unmanned aerial vehicle through maneuver, avoids simultaneously being caught by blue party unmanned aerial vehicle, avoids being aimed at by blue party unmanned aerial vehicle aircraft nose promptly to be in the inferior situation.
In the embodiment, an unmanned aerial vehicle escape problem system model is built according to the red party pursuit-blue Fang Taoyi and the red party escape-blue party pursuit problems respectively.
First, note red square unmanned aerial vehicle real-time position is Xr (t), and blue square unmanned aerial vehicle position is Xb (t), and then both sides position difference is:
e=Xb(t)-Xr(t) (1)
the tracking error system is:
Wherein represents the differential of the tracking error with respect to time of the two-party position difference e,/> represents the differential of the real-time position Xb (t) of the blue-party unmanned aerial vehicle with respect to time, and/> represents the differential of the real-time position Xr (t) of the red-party unmanned aerial vehicle with respect to time;
assuming that the red party chaser can only measure the three-dimensional movement speed of the blue party unmanned aerial vehicle, the expression (2) can be expressed specifically as:
The system model of the red party pursuing the blue party is expressed as:
Vr is the speed of the red unmanned aerial vehicle, mach is the unit, is the differential of the speed Vr of the red unmanned aerial vehicle with respect to time; χr is the heading angle of the red unmanned aerial vehicle, the unit is radian, and/() is the differentiation of the heading angle χr of the red unmanned aerial vehicle with respect to time; gammar is the track inclination angle of the red unmanned aerial vehicle, the unit is radian, and/() is the derivative of the track inclination angle of the red unmanned aerial vehicle with respect to time; distance error ex,ey,ez in km,/> is the derivative of distance error with respect to ex,ey,ez time; g is gravity acceleration; vc is the speed of sound, nx,ny,nz is the overload control amount of the red unmanned aerial vehicle, and the overload is normally saturated.
For convenience of description, the nonlinear continuous state space equation of the unmanned aerial vehicle is abbreviated as
Where x= [ Vγγγ,ex,ey,ez ] T represents a red aircraft state vector, represents a differential of the red aircraft state vector x with respect to time, u= [ nx,ny,nz ] represents a red aircraft control vector, F (x), G (x) are respectively:
Because the unmanned aerial vehicle chases after the escape problem is a nonlinear optimal control problem with a saturated actuator, the performance index function is defined as follows:
Wherein Q (x, t) is an index function related to the state, and R (ut) is an index function related to the control amount;
s13, establishing an angle dominance function of the unmanned aerial vehicle, and setting a speed vector of the unmanned aerial vehicle of the red party as follows:
Vr=[cosγγcosχγ,cosγγ,sinχγ,sinγγ]T
the blue-side unmanned aerial vehicle speed vector is:
Vb=[cosγbcosχb,cosγbsinχb,sinγb]T
the distance vector of the red unmanned aerial vehicle to the blue Fang Lanfang unmanned aerial vehicle is erb=[ex,ey,ez]T, as shown in fig. 3, and the geometric relationship is as follows:
During air combat, it is desirable that αγb be as small as possible to achieve a red square angle situation predominance. Taking a red square as an example, when alphar-(π-αb) is less than 0, namely alpharb is less than pi, the attack angle of the red square is dominant; conversely, if αrb > pi, the red attack angle is at a disadvantage; when αrb =pi, the red attack angle is at the equilibrium. Setting an angle dominance function:
Qa=cαr+(1-c)αb (9)
Wherein c= (αrb)/2pi; the optimization level relation of the angle alpharb can be dynamically adjusted through the weight c, when c is smaller than 0.5, the attack angle of the red party is dominant, alphab is optimized in a key way, and the blue party is prevented from obtaining the dominant angle situation; when c > 0.5, the attack angle of the red party is at a disadvantage, and alphar should be optimized with emphasis so that the red party obtains a dominant angle situation.
In the tracking problem, the goal of the red party is to shorten the distance to the blue party, thus defining a distance dominance function as:
Qd=eTQle (10)
wherein e= [ ex,ey,ez]T, is a positive definite matrix;
The state index function of the red party can be expressed as:
Q(x,t)=Qd+Q2Qα (11)
Wherein Q2 is a weight coefficient;
In order to meet the control limitation requirement, the unmanned aerial vehicle is stable in a stable flight state, and the controller index function is defined as follows:
R(u,t)=(u-u0)TR(u-u0) (12)
Wherein is a control quantity weight coefficient, and u0=[sinγr,0,cosγr]T is a control quantity of the unmanned aerial vehicle under stable flight.
For the red-side escape-blue-side pursuit problem model establishment, the escape problem is different from the pursuit problem in that the objective function is opposite to the pursuit problem, so as to maximize the double-machine distance. Meanwhile, in order to avoid the missile, when the distance between the unmanned aerial vehicle and the missile is smaller, the unmanned aerial vehicle needs to change the course and the climbing angle in a large maneuver, so as to avoid the missile. In order to solve the problem of maximizing the distance between the two aircraft, a virtual displacement method is adopted to minimize the distance between the reverse displacement of the aircraft and the enemy aircraft, namely, the effect of maximizing the position of the aircraft and the position of the enemy aircraft is achieved.
As shown in fig. 4, the host is chased by the enemy machine, and the distance between the host and the enemy machine is intended to be maximized, and the distance between the virtual displacement and the enemy machine is minimized for a virtual displacement speed V' with the opposite direction of the host speed vector V. The virtual displacement is the displacement amount generated by V', namely:
The system model of red-side escape-blue-side chase is expressed as:
Wherein the sign and meaning are the same as the pursuit problem.
Generally, an accurate unmanned aerial vehicle system model cannot be obtained in actual operation, but the existing model-free adaptive dynamic programming based on data is seriously dependent on the data, and policy improvement cannot be performed on the basis of the existing data. Therefore, the embodiment adopts model-free self-adaptive dynamic programming to solve the escape problem of the unmanned aerial vehicle, and adopts bounded exploration signals to improve strategies.
Defining a bounded exploration signal ue, and rewrites a system model type (5) of the red unmanned aerial vehicle as follows:
The performance index function is:
The derivative of the performance index function (7) with respect to time is expressed as:
When the performance index function (16) is a minimum value, the following bellman equation is satisfied:
Wherein R(j) = Q (x, t) +r (u, t); by combining the formula (17) and the formula (18), it is possible to obtain:
The optimal control quantity of the real system is as follows:
The formula (20) is used to solve G, and the formula (19) is used to obtain:
integrating the two ends of the formula (21) from t0 to t to obtain:
a neural network is employed to approximate the cost function and control inputs, namely:
Wherein is the ideal neural network weight for the evaluation network and the execution network, respectively; l1,L2 is the number of hidden layer neurons of the evaluation network and the execution network, respectively; the/> is the neural network activation function of the evaluation network and the execution network, respectively; and/> is the reconstruction error of the evaluation network and the execution network, respectively.
Let the evaluation network and the evaluation network execute the estimated values of the network as follows:
wherein is the estimated value of the ideal neural network weight Wc,Wa, respectively; substituting equation (24) into equation (22) yields the following residual term errors:
Wherein is the control amount obtained by improving the strategy, and the expression is:
Wherein Ω is the exploration set of control quantities, obtained by adding a bounded random exploration signal, and/> is optimized/> by least squares algorithm, namely:
Optimizing by the least squares algorithm is:
In the embodiment, an offline neural network model training algorithm is adopted to obtain the real-time control rate of the red unmanned aerial vehicle and the blue unmanned aerial vehicle, and the information of the red control rate and the state information of the red and blue unmanned aerial vehicles are collected in real time. The method specifically comprises the following steps:
S31: by giving different initial states, a data set { xk(t0) } can be obtained, and initialized
S32: obtaining control quantity corresponding to the state, namely data set according to the formula (26)
S33: update get/> update get/> according to equation (28) using dataset update according to equation (27)
S34: terminating the algorithm if or/>; otherwise j=j+1, step S32 is skipped, where ea、∈c is convergence accuracy.
In the embodiment, the neural network is updated on line through an on-line model training algorithm at intervals, so that the air combat decision of self-adaptive dynamic programming of the red unmanned aerial vehicle and the blue unmanned aerial vehicle in the tracking-escaping problem is realized. The method specifically comprises the following steps:
S41: the current neural network separation weight is Wc,Wa, the online learning rate is alpha, a real-time data set { x (t), u (t) } is obtained by sampling at a fixed time interval δt, and a step S42 is carried out after a plurality of groups of data are acquired;
s42: obtaining control quantity corresponding to the state, namely data set according to the formula (26)
S43: calculation according to equation (27) using dataset/> calculation according to equation (28)
S44: the online updating of the neural network weights,, jumps to step S41.
Through the method, the capacity of the online self-adaptive adjustment strategy is improved, and the adaptability of the unmanned aerial vehicle air combat decision in different scenes is improved. The method does not depend on an aircraft system model, has strong generalization capability, and can be popularized to the control technology of other equipment, such as a plurality of application scenes of unmanned vehicles, mechanical arms and the like. Thus, the present invention provides a significant and substantial improvement over the prior art.
The above embodiment is only one of the preferred embodiments of the present invention, and should not be used to limit the scope of the present invention, but all the insubstantial modifications or color changes made in the main design concept and spirit of the present invention are still consistent with the present invention, and all the technical problems to be solved are included in the scope of the present invention.

Claims (4)

CN202310861633.0A2023-07-132023-07-13 A data-driven adaptive dynamic programming method for air combat decision makingActiveCN116880186B (en)

Priority Applications (1)

Application NumberPriority DateFiling DateTitle
CN202310861633.0ACN116880186B (en)2023-07-132023-07-13 A data-driven adaptive dynamic programming method for air combat decision making

Applications Claiming Priority (1)

Application NumberPriority DateFiling DateTitle
CN202310861633.0ACN116880186B (en)2023-07-132023-07-13 A data-driven adaptive dynamic programming method for air combat decision making

Publications (2)

Publication NumberPublication Date
CN116880186A CN116880186A (en)2023-10-13
CN116880186Btrue CN116880186B (en)2024-04-16

Family

ID=88265747

Family Applications (1)

Application NumberTitlePriority DateFiling Date
CN202310861633.0AActiveCN116880186B (en)2023-07-132023-07-13 A data-driven adaptive dynamic programming method for air combat decision making

Country Status (1)

CountryLink
CN (1)CN116880186B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN118171742B (en)*2024-05-152024-09-24南京理工大学Knowledge-data driven air combat target intention reasoning method and system based on residual estimation
CN118657170A (en)*2024-08-092024-09-17西北工业大学 A strategy iteration algorithm and device

Citations (10)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN109085754A (en)*2018-07-252018-12-25西北工业大学A kind of spacecraft neural network based is pursued and captured an escaped prisoner game method
CN112215283A (en)*2020-10-122021-01-12中国人民解放军海军航空大学Close-range air combat intelligent decision method based on manned/unmanned aerial vehicle system
CN113050686A (en)*2021-03-192021-06-29北京航空航天大学Combat strategy optimization method and system based on deep reinforcement learning
CN113095481A (en)*2021-04-032021-07-09西北工业大学Air combat maneuver method based on parallel self-game
CN113791634A (en)*2021-08-222021-12-14西北工业大学 A decision-making method for multi-aircraft air combat based on multi-agent reinforcement learning
CN114330115A (en)*2021-10-272022-04-12中国空气动力研究与发展中心计算空气动力研究所Neural network air combat maneuver decision method based on particle swarm search
CN115951709A (en)*2023-01-092023-04-11中国人民解放军国防科技大学Multi-unmanned aerial vehicle air combat strategy generation method based on TD3
CN116185059A (en)*2022-08-172023-05-30西北工业大学Unmanned aerial vehicle air combat autonomous evasion maneuver decision-making method based on deep reinforcement learning
CN116187777A (en)*2022-12-282023-05-30中国航空研究院Unmanned aerial vehicle air combat autonomous decision-making method based on SAC algorithm and alliance training
CN116400718A (en)*2023-04-062023-07-07中国人民解放军空军航空大学Unmanned aerial vehicle short-distance air combat maneuver autonomous decision-making method, system, equipment and terminal

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN109085754A (en)*2018-07-252018-12-25西北工业大学A kind of spacecraft neural network based is pursued and captured an escaped prisoner game method
CN112215283A (en)*2020-10-122021-01-12中国人民解放军海军航空大学Close-range air combat intelligent decision method based on manned/unmanned aerial vehicle system
CN113050686A (en)*2021-03-192021-06-29北京航空航天大学Combat strategy optimization method and system based on deep reinforcement learning
CN113095481A (en)*2021-04-032021-07-09西北工业大学Air combat maneuver method based on parallel self-game
CN113791634A (en)*2021-08-222021-12-14西北工业大学 A decision-making method for multi-aircraft air combat based on multi-agent reinforcement learning
CN114330115A (en)*2021-10-272022-04-12中国空气动力研究与发展中心计算空气动力研究所Neural network air combat maneuver decision method based on particle swarm search
CN116185059A (en)*2022-08-172023-05-30西北工业大学Unmanned aerial vehicle air combat autonomous evasion maneuver decision-making method based on deep reinforcement learning
CN116187777A (en)*2022-12-282023-05-30中国航空研究院Unmanned aerial vehicle air combat autonomous decision-making method based on SAC algorithm and alliance training
CN115951709A (en)*2023-01-092023-04-11中国人民解放军国防科技大学Multi-unmanned aerial vehicle air combat strategy generation method based on TD3
CN116400718A (en)*2023-04-062023-07-07中国人民解放军空军航空大学Unmanned aerial vehicle short-distance air combat maneuver autonomous decision-making method, system, equipment and terminal

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Deep Neural Network-Based Footprint Prediction and Attack Intention Inference of Hypersonic Glide Vehicles;Jingjing Xu, et al.;《Mathematics 》;第11卷(第1期);第1-24页*

Also Published As

Publication numberPublication date
CN116880186A (en)2023-10-13

Similar Documents

PublicationPublication DateTitle
CN116880186B (en) A data-driven adaptive dynamic programming method for air combat decision making
CN113625740B (en)Unmanned aerial vehicle air combat game method based on transfer learning pigeon swarm optimization
CN110879599A (en)Fixed time formation control method based on finite time disturbance observer
CN106647287B (en) An Input-Restricted Differential Game Guidance Method Based on Adaptive Dynamic Programming
CN114330115B (en)Neural network air combat maneuver decision-making method based on particle swarm search
CN108319286A (en)A kind of unmanned plane Air Combat Maneuvering Decision Method based on intensified learning
CN112198892B (en) A multi-UAV intelligent cooperative penetration countermeasure method
CN112906233B (en)Distributed near-end strategy optimization method based on cognitive behavior knowledge and application thereof
CN111752280A (en) A fixed-time control method for multi-unmanned ship formation based on finite-time uncertain observer
CN116501086B (en) A method for aircraft autonomous avoidance decision-making based on reinforcement learning
CN109188909A (en)Self-adaptive fuzzy optimal control method and system for ship course nonlinear discrete system
CN114003059B (en)UAV path planning method based on deep reinforcement learning under kinematic constraint condition
CN114063644B (en) Autonomous decision-making method for unmanned combat aircraft in air combat based on pigeon group reverse confrontation learning
CN117824441B (en) Intelligent collaborative guidance method and system based on BP neural network with time and space constraints
CN111461294B (en)Intelligent aircraft brain cognitive learning method facing dynamic game
CN113297506B (en)Brain-like relative navigation method based on social position cells/grid cells
CN108549210A (en)Multiple no-manned plane based on BP neural network PID control cooperates with flying method
CN110673488A (en)Double DQN unmanned aerial vehicle concealed access method based on priority random sampling strategy
CN116991074B (en)Close-range air combat maneuver decision optimization method under intelligent weight
CN116432438A (en) Track Prediction Method, Equipment and Medium of High Maneuvering Target Based on Dynamic Model
CN118938676A (en) A reinforcement learning guidance and control integrated method for intercepting three-dimensional maneuvering targets
CN117130379A (en) A UAV air combat attack method based on LQR short-range visual range
CN114371729A (en)Unmanned aerial vehicle air combat maneuver decision method based on distance-first experience playback
CN114995129B (en) A distributed optimal event-triggered collaborative guidance method
CN117452963A (en) UAV path planning method and system based on improved particle swarm algorithm

Legal Events

DateCodeTitleDescription
PB01Publication
PB01Publication
SE01Entry into force of request for substantive examination
SE01Entry into force of request for substantive examination
GR01Patent grant
GR01Patent grant
CB03Change of inventor or designer information

Inventor after:Li Bin

Inventor after:Ning Zhaoke

Inventor after:Shi Mingming

Inventor after:Li Qingliang

Inventor before:Li Bin

Inventor before:Ning Zhaoke

Inventor before:Shi Mingming

Inventor before:Li Qingliang

Inventor before:Tao Chenggang

Inventor before:Sun Shaoshan

Inventor before:Li Dao

CB03Change of inventor or designer information

[8]ページ先頭

©2009-2025 Movatter.jp