CN114340016B - Power grid edge computing offloading allocation method and system - Google Patents


Info

Publication number: CN114340016B
Application number: CN202210255851.5A
Authority: CN (China)
Prior art keywords: power terminal, task, edge computing, unloading
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Other versions: CN114340016A
Inventors: 丰雷, 周凡钦, 杨洋, 杨志祥, 喻鹏, 李文璟, 李阳阳
Current Assignee: Beijing University of Posts and Telecommunications (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Original Assignee: Beijing University of Posts and Telecommunications
Application filed by: Beijing University of Posts and Telecommunications
Priority: CN202210255851.5A
Publications: CN114340016A (application), CN114340016B (grant)


Abstract

The invention provides a power grid edge computing offloading allocation method and system. The method comprises: obtaining the network state information of each power terminal in the smart grid at the current moment; inputting the network state information corresponding to a target power terminal into a power grid edge computing offloading allocation model to obtain an edge computing offloading allocation strategy for the computing task to be processed in the target power terminal; and, according to the strategy, partitioning the task to be processed and caching the partitioned parts to the corresponding power terminals and/or the mobile edge computing server, so as to perform edge computing offloading on the task. The invention uses a multi-agent reinforcement learning algorithm to make the offloading allocation decision, makes full use of the cache and computing resources of the power terminal devices, obtains a more accurate and efficient offloading allocation scheme, and can effectively reduce the transmission delay to the remote mobile edge computing server.

Description

Translated from Chinese

A Method and System for Power Grid Edge Computing Offloading Allocation

Technical Field

The present invention relates to the technical field of mobile edge computing, and in particular to a power grid edge computing offloading allocation method and system.

Background

With the rapid development of 5G-integrated power grids, the number of power service terminals and the traffic they generate keep growing, which poses a great challenge to the existing grid architecture. As a new computing paradigm, edge computing allows data to be processed promptly and effectively near its source, providing a new solution for handling the massive data in the power grid.

To guarantee grid service quality, most current optimization schemes offload computing tasks to Mobile Edge Computing (MEC) servers, which significantly relieves the load on the core network and greatly shortens the transmission distance of user requests. However, when facing massive requests from power user terminals, relying on the MEC server alone for task processing introduces additional queuing delay caused by insufficient computing and cache resources, and the competition among terminals for communication resources still exposes MEC content transmission to network congestion.

With the improvement of power terminal capabilities, terminal-assisted computing has become a promising solution. Terminals are close to one another, which markedly reduces long-distance transmission delay, and parallel computation across multiple nodes significantly improves computing efficiency. However, existing edge computing offloading allocation schemes for nearby terminals remain immature and cannot yet produce a sufficiently accurate allocation. A power grid edge computing offloading allocation method and system are therefore urgently needed.

Summary of the Invention

In view of the problems in the prior art, the present invention provides a power grid edge computing offloading allocation method and system.

The present invention provides a power grid edge computing offloading allocation method, comprising:

obtaining the network state information of each power terminal in the smart grid at the current moment;

inputting the network state information corresponding to a target power terminal into a power grid edge computing offloading allocation model to obtain an edge computing offloading allocation strategy for the computing task to be processed in the target power terminal;

according to the edge computing offloading allocation strategy, partitioning the computing task to be processed, and caching the partitioned task to the corresponding power terminal(s) and/or the mobile edge computing server, so as to perform edge computing offloading on the task;

wherein the power grid edge computing offloading allocation model is obtained by training a multi-agent reinforcement learning network with sample network state information and with the task cache ratio and task offloading position corresponding to the sample network state information.
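The three claimed steps can be sketched end to end as follows. This is an illustrative skeleton only: the names (`OffloadDecision`, `split_task`, `stub_policy`) and the three-node layout are assumptions, and the stub stands in for the trained multi-agent model.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass
class OffloadDecision:
    """Edge computing offloading allocation strategy for one pending task."""
    cache_ratios: Dict[str, float]  # node id -> fraction of the task cached/computed there

def split_task(task_size: float, decision: OffloadDecision) -> Dict[str, float]:
    """Step 3: partition the pending task according to the decided cache ratios."""
    assert abs(sum(decision.cache_ratios.values()) - 1.0) < 1e-9, "ratios must cover the whole task"
    return {node: task_size * r for node, r in decision.cache_ratios.items()}

def stub_policy(network_state: dict) -> OffloadDecision:
    """Stand-in for the trained offloading allocation model (step 2)."""
    return OffloadDecision(cache_ratios={"local": 0.5, "terminal_j": 0.25, "mec_server": 0.25})

# Step 1 would collect real network state; here it is a placeholder dict.
state = {"connectivity": {}, "cpu_freq": {}, "cache_capacity": {}}
shards = split_task(100.0, stub_policy(state))
print(shards)  # {'local': 50.0, 'terminal_j': 25.0, 'mec_server': 25.0}
```

The partitioned amounts would then be dispatched to the corresponding nodes for caching and offloaded computation.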

According to the power grid edge computing offloading allocation method provided by the present invention, the model is trained through the following steps:

constructing, based on the historical network state information of each power terminal, the sample network state information of the agent corresponding to each power terminal, and constructing a first sample observation state from the sample network state information;

obtaining the task cache ratio and task offloading position corresponding to the sample network state information, and constructing the action of each agent from the task cache ratio and the task offloading position;
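Each agent's action thus pairs a continuous cache ratio with a discrete offloading position. One minimal way to encode such a mixed action as a flat vector is sketched below; the vector layout and node count are assumptions, not the patent's representation.

```python
from typing import List, Tuple

NUM_NODES = 4  # assumed candidate nodes: local terminal, two neighbours, MEC server

def encode_action(cache_ratio: float, offload_position: int) -> List[float]:
    """Flatten (cache ratio, offloading position) into one action vector:
    the ratio first, then a one-hot over the candidate nodes."""
    one_hot = [1.0 if i == offload_position else 0.0 for i in range(NUM_NODES)]
    return [cache_ratio] + one_hot

def decode_action(action: List[float]) -> Tuple[float, int]:
    """Recover (cache ratio, offloading position) from the flat vector."""
    ratio, one_hot = action[0], action[1:]
    return ratio, max(range(len(one_hot)), key=one_hot.__getitem__)

a = encode_action(0.4, 2)
print(decode_action(a))  # (0.4, 2)
```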

constructing the agent reward based on the energy consumption and delay of each power terminal during edge computing offloading, with the minimization of each power terminal's energy consumption as the optimization objective;

constructing a training sample set from the first sample observation state, the action and the reward;

training the multi-agent reinforcement learning network with the training sample set to obtain the power grid edge computing offloading allocation model.
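The five steps above amount to building (observation, action, reward) tuples and running a multi-agent training loop over them. A dependency-free skeleton follows; the buffer capacity, batch size, placeholder experiences and omitted network updates are all illustrative, not the patent's MADDPG hyper-parameters.

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores (obs, action, reward, next_obs) tuples for training."""
    def __init__(self, capacity: int = 10000):
        self.buf = deque(maxlen=capacity)

    def add(self, obs, action, reward, next_obs):
        self.buf.append((obs, action, reward, next_obs))

    def sample(self, batch_size: int):
        return random.sample(self.buf, batch_size)

def train(num_agents: int, episodes: int, batch_size: int = 4) -> ReplayBuffer:
    buffer = ReplayBuffer()
    for _ in range(episodes):
        for agent in range(num_agents):
            # Placeholder experience: real observations come from the grid
            # environment, actions from each agent's actor network.
            obs, action, reward, next_obs = [agent], [0.5, agent], -1.0, [agent]
            buffer.add(obs, action, reward, next_obs)
        if len(buffer.buf) >= batch_size:
            batch = buffer.sample(batch_size)
            # A real multi-agent update would refresh each actor and the
            # centralized critic from this batch; omitted in this sketch.
            assert len(batch) == batch_size
    return buffer

buf = train(num_agents=3, episodes=5)
print(len(buf.buf))  # 15
```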

According to the power grid edge computing offloading allocation method provided by the present invention, constructing the agent reward based on the energy consumption and delay of each power terminal during edge computing offloading, with the minimization of each power terminal's energy consumption as the optimization objective, comprises:

obtaining each power terminal's energy consumption during edge computing offloading from its computing energy consumption and transmission energy consumption;

obtaining each power terminal's delay during edge computing offloading from its transmission delay and computing delay;

taking each power terminal's delay during edge computing offloading as a constraint, and the minimization of its energy consumption as the optimization objective, constructing a power terminal edge computing offloading energy consumption optimization model;

based on the power terminal edge computing offloading energy consumption optimization model, taking the negative of each power terminal's energy consumption in each training round as the reward of the corresponding agent.
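A minimal sketch of this reward construction: energy is the sum of computing and transmission energy, delay the sum of transmission and computing delay, and the reward is the negative energy. Note that the patent states only the negative-energy reward with delay as a constraint; the additive penalty for a delay violation below is a common RL device and an assumption here, as is the penalty value.

```python
def total_energy(e_compute: float, e_transmit: float) -> float:
    """Per-terminal energy during edge computing offloading."""
    return e_compute + e_transmit

def total_delay(t_transmit: float, t_compute: float) -> float:
    """Per-terminal delay during edge computing offloading."""
    return t_transmit + t_compute

def agent_reward(e_compute, e_transmit, t_transmit, t_compute,
                 delay_threshold: float, penalty: float = 100.0) -> float:
    r = -total_energy(e_compute, e_transmit)      # negative energy, per the patent
    if total_delay(t_transmit, t_compute) > delay_threshold:
        r -= penalty                              # assumed way to enforce the delay constraint
    return r

print(agent_reward(2.0, 1.5, 0.2, 0.3, delay_threshold=1.0))  # -3.5
print(agent_reward(2.0, 1.5, 0.8, 0.5, delay_threshold=1.0))  # -103.5
```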

According to the power grid edge computing offloading allocation method provided by the present invention, before training the multi-agent reinforcement learning network with the training sample set to obtain the model, the method further comprises:

inputting the sample network state information into a generative adversarial network and outputting a second sample observation state;

updating the training sample set according to the second sample observation state to obtain an updated training sample set;

and training the multi-agent reinforcement learning network with the training sample set to obtain the power grid edge computing offloading allocation model then comprises:

training the multi-agent reinforcement learning network with the updated training sample set to obtain the power grid edge computing offloading allocation model.
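The augmentation step above feeds sample network states through a generative adversarial network and adds the generated observation states to the training set. The stub below only illustrates that data flow: `fake_generator` is a placeholder that perturbs real states, standing in for a trained GAN generator.

```python
import random

def fake_generator(real_states, n_synthetic: int, noise: float = 0.1):
    """Placeholder for a trained GAN generator: perturb real states slightly
    to imitate generated (second) observation states."""
    synthetic = []
    for _ in range(n_synthetic):
        base = random.choice(real_states)
        synthetic.append([v + random.uniform(-noise, noise) for v in base])
    return synthetic

def augment_training_set(real_states, n_synthetic: int):
    """Update the training sample set with generated observation states."""
    return real_states + fake_generator(real_states, n_synthetic)

random.seed(0)
real = [[0.5, 1.0], [0.3, 0.7]]
updated = augment_training_set(real, n_synthetic=3)
print(len(updated))  # 5
```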

According to the power grid edge computing offloading allocation method provided by the present invention, the power terminal edge computing offloading energy consumption optimization model is formulated as follows:

[Objective formula rendered as an image in the original publication.]

The objective minimizes the energy consumption of the i-th power terminal when performing edge computing offloading at time t, optimized over the cache ratio of the i-th terminal's pending task at the j-th terminal and over the task offloading action, i.e., the computing action of the i-th terminal's task at the j-th terminal.

[Constraint formulas rendered as images in the original publication.]

The constraints involve: the network connection state between the i-th and j-th power terminals; the delay of completing the edge computing offloading and transmission of the i-th terminal's pending task, which must not exceed a preset delay threshold; the amount of the i-th terminal's pending task that needs to be cached; the total cache capacity of the mobile edge computing server; and the total cache capacity of any power terminal.
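The equations of this optimization model survive only as image placeholders in this extraction. Under assumed symbol names (every symbol below is introduced for illustration and matched to the variable descriptions above, not taken from the patent), the model can be written as:

```latex
% Assumed notation: E_i(t) energy of terminal i at time t; \lambda_{ij}(t) cache
% ratio of terminal i's task at node j; x_{ij}(t) binary offload/compute action;
% c_{ij} connectivity between i and j; T_i completion delay; T^{max} delay
% threshold; d_i task amount to cache; C^{MEC}, C^{UE} cache capacities.
\min_{\{\lambda_{ij}(t),\, x_{ij}(t)\}} \; \sum_{i} E_i(t)
\quad \text{s.t.} \quad
\begin{aligned}
& \textstyle\sum_{j} \lambda_{ij}(t) = 1, \qquad 0 \le \lambda_{ij}(t) \le 1, \\
& x_{ij}(t) \in \{0, 1\}, \qquad x_{ij}(t) \le c_{ij}, \\
& T_i \le T^{max}, \\
& \textstyle\sum_{i} \lambda_{i,\mathrm{MEC}}(t)\, d_i \le C^{MEC}, \qquad
  \textstyle\sum_{i} \lambda_{ij}(t)\, d_i \le C^{UE} \;\; \forall j.
\end{aligned}
```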

According to the power grid edge computing offloading allocation method provided by the present invention, the sample network state information includes the network connection state, the computing capability, the caching capability, the computation amount of the tasks to be cached and offloaded, and the task transmission amount after offloading.

The present invention also provides a power grid edge computing offloading allocation system, comprising:

a power terminal network state acquisition module, configured to obtain the network state information of each power terminal in the smart grid at the current moment;

an offloading allocation strategy generation module, configured to input the network state information corresponding to a target power terminal into the power grid edge computing offloading allocation model to obtain the edge computing offloading allocation strategy for the computing task to be processed in the target power terminal;

an edge computing offloading module, configured to partition the computing task to be processed according to the edge computing offloading allocation strategy, and to cache the partitioned task to the corresponding power terminal(s) and/or the mobile edge computing server, so as to perform edge computing offloading on the task;

wherein the power grid edge computing offloading allocation model is obtained by training a multi-agent reinforcement learning network with sample network state information and with the task cache ratio and task offloading position corresponding to the sample network state information.

According to the power grid edge computing offloading allocation system provided by the present invention, the system further comprises:

a sample construction module, configured to construct, based on the historical network state information of each power terminal, the sample network state information of the agent corresponding to each power terminal, and to construct a first sample observation state from the sample network state information;

an action construction module, configured to obtain the task cache ratio and task offloading position corresponding to the sample network state information, and to construct the action of each agent from them;

an agent reward construction module, configured to construct the agent reward based on the energy consumption and delay of each power terminal during edge computing offloading, with the minimization of each power terminal's energy consumption as the optimization objective;

a training set generation module, configured to construct a training sample set from the first sample observation state, the action and the reward;

a training module, configured to train the multi-agent reinforcement learning network with the training sample set to obtain the power grid edge computing offloading allocation model.

The present invention also provides an electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of any of the power grid edge computing offloading allocation methods described above.

The present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any of the power grid edge computing offloading allocation methods described above.

The power grid edge computing offloading allocation method and system provided by the present invention build a hybrid caching and offloading framework in which the mobile edge computing server and the power terminals cooperate, and use a multi-agent reinforcement learning algorithm to make edge computing offloading allocation decisions. By making full use of the cache and computing resources of the power terminal devices, a more accurate and efficient offloading allocation scheme is obtained, which solves the resource shortage and network congestion problems that arise when multi-task requests rely solely on the mobile edge computing server, while the short-range cooperation between terminals effectively reduces the transmission delay to the remote mobile edge computing server.

Brief Description of the Drawings

To explain the present invention or the technical solutions in the prior art more clearly, the drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below illustrate some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from them without creative effort.

FIG. 1 is a schematic flowchart of the power grid edge computing offloading allocation method provided by the present invention;

FIG. 2 is a schematic structural diagram of the power grid edge computing offloading allocation system provided by the present invention;

FIG. 3 is a schematic structural diagram of the electronic device provided by the present invention.

Detailed Description

To make the objectives, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on these embodiments, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

With the improvement of power terminal capabilities, nearby-terminal-assisted computing has become a promising solution. Because the distance between power terminals is short, long-distance transmission delay can be markedly reduced; in addition, with multiple power terminals computing in parallel as nodes, computing efficiency improves significantly, turning power terminal devices from resource consumers into resource providers and improving system resource utilization. Meanwhile, to balance computing and communication resource consumption, the power terminal devices and the MEC server can cache the relevant resources in advance; during subsequent edge computing offloading, these resources can be processed directly and the computation results transmitted to the target power terminal that initiated the offloading task.

Existing schemes mainly introduce established new techniques into edge computing task offloading allocation, but they do not consider the computing and storage resources of grid devices or the dynamically changing service-request environment. Most of their decision processes target the minimum energy consumption of the entire grid, so actual offloading decisions distribute the load unevenly: some nodes may be assigned many offloaded computing tasks while others are assigned few. The convergence speed and performance of the algorithms in existing schemes therefore still need improvement.

To address the problems in the prior art, the present invention provides a joint computing and caching optimization mechanism in which multi-agent terminals cooperate with the MEC in the smart grid. According to power terminal user requests and network state information, the content of a pending computing task in any power terminal device can be cached, split, transmitted (both the pending computation content and the completed computation results) and processed at three kinds of locations: the MEC server, nearby user terminals (i.e., nearby power terminals with a network connection to the terminal holding the task) and the local node (i.e., the power terminal holding the task). The cache-and-compute strategies are: local node (cache) with local node (compute); MEC server (cache) with MEC server or local node (compute); and nearby terminal (cache) with nearby terminal or local node (compute). The present invention adopts multi-agent reinforcement learning to solve for the optimal caching and offloading strategy of any power terminal's computing task in the smart grid; the problem reduces to a joint optimization of resource allocation and task offloading, which is a mixed cooperation-and-competition problem. In addition, to let the multi-agent framework cover comprehensive and extreme situations and handle policy making efficiently under different environment states, the present invention proposes an experienced-agent training method based on a Generative Adversarial Network (GAN), which further improves the convergence speed and performance of the algorithm.

FIG. 1 is a schematic flowchart of the power grid edge computing offloading allocation method provided by the present invention. As shown in FIG. 1, the method includes:

Step 101: obtain the network state information of each power terminal in the smart grid at the current moment.

In the present invention, the network state information of the power terminals is acquired in real time. It includes: the network connection state between power terminals; each terminal's CPU frequency (i.e., its computing capability); each terminal's cache capacity; the computation amount of the tasks to be cached and offloaded (i.e., the size of the computation content of the terminal's pending tasks that needs to be cached); and the post-offloading task transmission amount (i.e., for the tasks that terminal i has cached at other nodes, the total size of the computation results transmitted back to terminal i after those nodes finish computing). It should be noted that, in addition to the network state information between power terminals, the present invention can also obtain the network state information between a power terminal and the MEC server, i.e., the network connection state between them, the CPU frequency of the MEC server, and so on.
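The per-terminal state enumerated above maps naturally onto a record type. A sketch follows; all field names and units are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass
class TerminalNetworkState:
    """Network state information collected for one power terminal."""
    terminal_id: str
    connectivity: Dict[str, int]    # neighbour/MEC id -> link state (1 connected, 0 not)
    cpu_freq_hz: float              # computing capability
    cache_capacity_bits: float      # caching capability
    pending_cache_load_bits: float  # computation content of pending tasks to be cached
    result_return_bits: float       # results transmitted back after remote computation

s = TerminalNetworkState(
    terminal_id="terminal_1",
    connectivity={"terminal_2": 1, "mec_server": 1},
    cpu_freq_hz=1.2e9,
    cache_capacity_bits=8e6,
    pending_cache_load_bits=2e6,
    result_return_bits=1e5,
)
print(s.connectivity["mec_server"])  # 1
```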

Since the power grid edge computing offloading allocation model provided by the present invention also takes the network state of the MEC server into account when making allocation decisions, the model is trained on the network state information of both the power terminals and the MEC server. Based on a power terminal's current network state information, the model therefore considers both whether to offload a computing task to nearby power terminals (or process it on the local terminal) and whether to offload it to the MEC server for processing.

Step 102: input the network state information corresponding to the target power terminal into the power grid edge computing offloading allocation model to obtain the edge computing offloading allocation strategy for the computing tasks to be processed in the target power terminal.

In the present invention, the power terminal holding a computing task is taken as the target power terminal. After the network state information corresponding to the target power terminal is obtained as described above, it is input into the model. Here, the network state information corresponding to the target power terminal includes, besides the target terminal's own state, the network state information of other power terminals near it (mainly their network connection state, CPU frequency, etc.). The model is used to realize the optimal task caching and offloading strategy in the grid, which involves the rational allocation of caching and computing resources. From the perspectives of Quality of Experience (QoE) and limited device power, the model selects delay and energy consumption as optimization targets; accordingly, in practical applications, each power terminal only cares about its own service quality (i.e., delay) and energy consumption when performing edge computing offloading.

进一步地,电网边缘计算卸载模型基于目标电力终端的网络状态信息,决策出最优的边缘计算卸载分配策略,该分配策略涉及到目标电力终端中待进行计算任务的任务分割比例,以及待缓存的节点位置(即分割后的任务需要缓存到哪些临近电力终端进行任务处理)。Further, the grid edge computing offloading model decides the optimal edge computing offloading allocation strategy based on the network state information of the target power terminal, and the allocation strategy involves the task division ratio of the computing tasks to be performed in the target power terminal, and the amount of the data to be cached. Node location (that is, to which adjacent power terminals the split task needs to be cached for task processing).

步骤103,根据所述边缘计算卸载分配策略,将所述待处理计算任务进行分割,并将分割后的待处理计算任务缓存到对应的电力终端和/或移动边缘计算服务器,以对所述待处理计算任务进行边缘计算卸载;Step 103: According to the edge computing offloading allocation strategy, the to-be-processed computing tasks are divided, and the divided to-be-processed computing tasks are cached in the corresponding power terminal and/or mobile edge computing server, so that the to-be-processed computing tasks are cached. Processing computing tasks for edge computing offloading;

wherein the power grid edge computing offloading allocation model is obtained by training a multi-agent reinforcement learning network with sample network state information and the task cache ratios and task offloading positions corresponding to the sample network state information.

In the present invention, according to information such as the network connection status, computing capability, caching capability and task characteristics of the target power terminal and the other nodes (including the MEC server and at least one adjacent power terminal), efficient resource allocation and task offloading decisions generated by a Multi-Agent Deep Deterministic Policy Gradient (MADDPG) framework are adopted to minimize the energy consumption of each power terminal during edge computing offloading and transmission, thereby determining the cache ratios of the computing task to be processed in the target power terminal at the corresponding adjacent power terminals and/or the MEC server, so as to perform task caching and offloaded computation.

The power grid edge computing offloading allocation method provided by the present invention constructs a hybrid caching and offloading framework in which the mobile edge computing server cooperates with the power terminals, uses a multi-agent reinforcement learning algorithm to make edge computing offloading allocation decisions, and makes full use of the caching and computing resources of the power terminal devices to obtain a more accurate and efficient edge computing offloading allocation scheme. This solves the problems of insufficient resources and network congestion previously encountered when multiple task requests relied solely on the mobile edge computing server for edge computing, and the short-range cooperation between terminals can effectively reduce the transmission delay of the remote mobile edge computing server.

On the basis of the above embodiment, the power grid edge computing offloading allocation model is obtained by training through the following steps:

based on the historical network state information of each power terminal, constructing the sample network state information of the agent corresponding to each power terminal, and constructing a first sample observation state according to the sample network state information, the sample network state information including the network connection status, the computing capability, the caching capability, the computation amount of the task to be cached and offloaded, and the transmission amount of the task after caching and offloading;

obtaining the task cache ratio and the task offloading position corresponding to the sample network state information, and constructing the action of each agent according to the task cache ratio and the task offloading position;

based on the energy consumption and delay of each power terminal when performing edge computing offloading, constructing the reward of each agent with the minimization of the energy consumption of each power terminal as the optimization objective;

constructing a training sample set according to the first sample observation state, the action and the reward;

training the multi-agent reinforcement learning network with the training sample set to obtain the power grid edge computing offloading allocation model.

In the present invention, the MADDPG network works in a centralized-training, decentralized-execution manner: each agent obtains the action to execute in the current state according to its own policy and interacts with the environment, and the resulting experience is stored in its own experience replay pool. After all agents have interacted with the environment, each agent randomly samples experiences from its experience pool (i.e., the training sample set) to train its own neural network. In the present invention, the multiple agents, states, actions and reward function of the MADDPG network are designed as follows:
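The per-agent experience replay described above can be sketched as follows; the buffer capacity and batch size are illustrative assumptions, not values taken from the patent:

```python
import random
from collections import deque

class ReplayBuffer:
    """Per-agent experience replay pool storing (state, action, reward, next_state)."""
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)  # oldest experiences are evicted first

    def store(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Each agent samples uniformly at random from its own pool for training.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

buf = ReplayBuffer()
for t in range(100):
    buf.store(state=t, action=t % 3, reward=-float(t), next_state=t + 1)
batch = buf.sample(32)
```

Each agent keeps its own buffer, so sampling stays independent across agents even though training is centralized.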

Agents: all user terminals, i.e., the power terminals.

Action: the action of each agent is composed of the cache ratio x_{i,j} of the power terminal's task (i.e., the proportion of the computing task to be processed that is allocated, after division, to the cache of each node) and the offloading position y_{i,j} (i.e., whether the task is computed at the corresponding node), that is, a_i = {x_{i,j}, y_{i,j}}. Because MADDPG is designed for solving over continuous variables, while the offloading position y_{i,j} in the model is a discrete variable, the present invention converts the discrete variable into a continuous one (for example, the policy network outputs a continuous value in [0, 1] from which the binary decision y_{i,j} is recovered). Therefore, the action a_i of agent i consists of the cache ratios and the relaxed offloading positions over all candidate nodes.
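One common continuous relaxation of the discrete offloading decision can be sketched as follows; the 0.5 threshold and the clipping are illustrative assumptions rather than the patent's exact conversion:

```python
def continuous_to_offload_decision(y_cont, threshold=0.5):
    """Map a continuous policy output in [0, 1] to the binary offloading
    decision y_{i,j}: 1 means the task is computed at node j, 0 means it is not."""
    y_cont = min(max(y_cont, 0.0), 1.0)  # clip the actor output into [0, 1]
    return 1 if y_cont >= threshold else 0

# Example: continuous actor outputs for three candidate nodes.
decisions = [continuous_to_offload_decision(v) for v in (0.12, 0.77, 0.5)]
```

The actor then trains on the continuous value while the environment executes the thresholded binary decision.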

State: the local state (i.e., the network state information) of each agent includes the network connection status c_{i,j} between terminals, the computing capability f_j, the caching capability, the cache content size s_i (the computation amount of the offloaded task in the target power terminal that is to be cached at other nodes for computation), and the computed content size o_i (the offloaded sub-tasks cached by the target power terminal at other nodes are, after their computation is completed, returned to the target power terminal for integration; o_i is the size of the result data transmitted back after edge computing offloading). That is, the local observation of agent i collects these quantities into one state vector.

The joint state of all agents at time slot t (i.e., the t-th time instant) is the concatenation of the local states of all N agents, S^t = {s_1^t, s_2^t, ..., s_N^t}.

Reward: in order to minimize energy consumption and achieve the goal of side-end cooperative optimization of task caching and offloading, the reward r of each agent is set to the negative of the energy consumption of its corresponding terminal user, i.e., r_i = -E_i.

Further, during the training process, in order to accelerate the learning of each agent, in the present invention the input of the Critic network also includes the observation states and the actions taken by the other agents; the Critic network parameters are updated by minimizing a loss, and the parameters of the actor network are then updated by gradient descent.

Specifically, in the MADDPG algorithm, the continuous policy μ_i of agent i (with parameters θ_i) is optimized through the gradient of the objective function with respect to θ_i:

∇_{θ_i} J(μ_i) = E_{s,a~D} [ ∇_{θ_i} μ_i(a_i | o_i) · ∇_{a_i} Q_i^μ(s, a_1, ..., a_N) |_{a_i = μ_i(o_i)} ];

where Q_i^μ(s, a_1, ..., a_N) is the centralized action-value (critic) function; a_i is an action and r_i a reward; s' is the new state of all agents, i.e., the observation states corresponding to the agents in the next round of training; the tuple (s, a, r, s') represents the experience storage, and this tuple is stored in the experience replay pool, i.e., it constitutes the sample set used for training; μ = {μ_1, ..., μ_N} denotes the set of policies of the N agents; θ = {θ_1, ..., θ_N} denotes the parameters of the N agents' policies; and o = {o_1, ..., o_N} denotes the observation states of all agents. Each agent can make an independent decision based on its local observation state, i.e., a_i = μ_i(o_i).

Therefore, each Critic network can obtain the states and actions of all agents. Then the centralized action-value function of agent i is updated according to the loss function; that is, the Critic network is trained with the following loss:

L(θ_i) = E_{s,a,r,s'} [ (Q_i^μ(s, a_1, ..., a_N) - y)^2 ];

y = r_i + γ Q_i^{μ'}(s', a_1', ..., a_N') |_{a_j' = μ_j'(o_j')},

where γ is the discount factor and μ' denotes the target policies.
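The critic update above can be sketched numerically; the discount factor and the toy Q-values are illustrative assumptions:

```python
import numpy as np

def critic_targets(rewards, next_q_values, gamma=0.95):
    """TD target y = r_i + gamma * Q'_i(s', a'_1..a'_N) for each sampled transition."""
    return rewards + gamma * next_q_values

def critic_loss(q_values, targets):
    """Mean squared error between the centralized critic output and the TD target."""
    return float(np.mean((q_values - targets) ** 2))

rewards = np.array([-1.0, -0.5])   # r_i = negative terminal energy consumption
next_q = np.array([-2.0, -1.0])    # target critic evaluated at the next joint state/action
y = critic_targets(rewards, next_q)
loss = critic_loss(np.array([-2.0, -1.2]), y)
```

Minimizing this loss by gradient descent on θ_i gives the Critic update described above.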

On the basis of the above embodiment, constructing the reward of the agent based on the energy consumption and delay of each power terminal when performing edge computing offloading, with the minimization of the energy consumption of each power terminal as the optimization objective, includes:

obtaining the energy consumption of each power terminal when performing edge computing offloading according to the computing energy consumption and the transmission energy consumption of each power terminal;

obtaining the delay of each power terminal when performing edge computing offloading according to the transmission delay and the computing delay of each power terminal;

taking the delay of each power terminal when performing edge computing offloading as a constraint and the minimization of the energy consumption of each power terminal as the optimization objective, constructing a power terminal edge computing offloading energy consumption optimization model;

based on the power terminal edge computing offloading energy consumption optimization model, taking the negative of the energy consumption of each power terminal in each round of training as the reward of the corresponding agent.

In the present invention, the scenario in which the multi-agent reinforcement learning network is trained contains one MEC server and multiple power terminals. Specifically, the present invention designs three caching/offloading modes: mode 1, local caching/offloading, in which the computing task is processed at the local terminal; mode 2, adjacent-terminal caching/offloading, in which the computing task is processed at one or more adjacent terminals; and mode 3, MEC caching/offloading, in which the computing task is processed at the MEC server to which the terminal belongs. In the present invention, in order to realize the rational utilization of resources and reduce the task processing delay, the computing task of each power terminal is dynamically divided into different proportions for optimal caching and offloading mode selection.

Specifically, in the application scenario of the power grid edge computing offloading allocation model, x_{i,j} denotes the cache ratio of the task of the i-th power terminal (for convenience of description, the i-th power terminal serves as the target power terminal) at the j-th node (a node may be the local power terminal, another adjacent power terminal, or the MEC server). The cache ratios of the computing task of the i-th power terminal over the nodes can therefore be expressed as the vector

X_i = {x_{i,1}, x_{i,2}, ..., x_{i,J}}, with 0 ≤ x_{i,j} ≤ 1.

Further, y_{i,j} denotes the computing action of the task of the i-th power terminal at node j:

y_{i,j} ∈ {0, 1},

Y_i = {y_{i,1}, y_{i,2}, ..., y_{i,J}};

where y_{i,j} = 1 indicates that the computing task of the i-th power terminal is computed at node j; otherwise, y_{i,j} = 0.

c_{i,j} denotes the network connection status between nodes: c_{i,j} = 1 indicates that the i-th power terminal is connected to node j, so that task caching and offloaded computation can be performed; otherwise, c_{i,j} = 0.

Further, the caching model in the training scenario is constructed:

Assume that the content of the task of the i-th power terminal that needs to be cached is s_i, and the output content formed after the cached content is computed is o_i. For the content s_i to be cached, the specific constraints are:

x_{i,j} · s_i ≤ C^{MEC}, when node j is the MEC server;

x_{i,j} · s_i ≤ C^{UE}, when node j is a power terminal;

where C^{MEC} denotes the maximum cache capacity of the MEC server and C^{UE} denotes the maximum cache capacity of a power terminal. These formulas express that the computing content cached by the target power terminal at other nodes cannot exceed the maximum cache capacities of the MEC server and the power terminals.

Further, Σ_j x_{i,j} = 1; this formula expresses that the computing content corresponding to the task of the i-th power terminal is completely cached by the system (i.e., by the nodes used for offloaded computation, including the local terminal, adjacent terminals and the MEC server).

Further, the computation model of each node in the training scenario is constructed:

f^{MEC} denotes the CPU frequency of the MEC server (unit: cycles/s), and f^{UE} denotes the CPU frequency of a power terminal. The task of node i (i.e., the i-th power terminal) is completed cooperatively by the MEC server and the adjacent terminals, and its computing energy consumption is composed of the computing energy consumption of the corresponding parts:

E_i^{comp} = Σ_j y_{i,j} · e_j · ρ · x_{i,j} · s_i;

where e_j = k · f_j^2 denotes the energy consumed per CPU cycle, k is a CPU-related constant, and ρ is a constant denoting how many cycles are required to compute each bit.

For the task of node i, when edge computing offloading is performed, the computing delay at node j is expressed as:

T_{i,j}^{comp} = ρ · x_{i,j} · s_i / f_j;

Further, the communication model of the nodes in the training scenario is constructed:

The task cache content of the i-th power terminal, or the content whose computation has been completed, is transmitted by the MEC server and the adjacent terminals. The transmission rate between node j and node i is calculated as:

r_{j,i} = B_{j,i} · log2(1 + SINR_{j,i});

SINR_{j,i} = p_j · h_{j,i} / σ^2;

where B_{j,i} denotes the transmission bandwidth between node j and node i, SINR_{j,i} denotes the signal-to-interference-plus-noise ratio between node j and node i, p_j denotes the transmit power of node j, h_{j,i} denotes the channel gain between node j and node i, and σ^2 is the white noise power.
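The rate computation above follows the Shannon capacity formula; the bandwidth, transmit power, channel gain and noise figures below are illustrative assumptions:

```python
import math

def transmission_rate(bandwidth_hz, tx_power_w, channel_gain, noise_w):
    """r_{j,i} = B_{j,i} * log2(1 + SINR), with SINR = p_j * h_{j,i} / sigma^2."""
    sinr = tx_power_w * channel_gain / noise_w
    return bandwidth_hz * math.log2(1.0 + sinr)

# 1 MHz bandwidth with an SINR of 3 gives 2 bit/s/Hz, i.e. 2 Mbit/s.
rate = transmission_rate(bandwidth_hz=1e6, tx_power_w=0.3, channel_gain=1e-5, noise_w=1e-6)
```

This rate r_{j,i} is what the transmission energy and delay formulas below divide the transmitted content size by.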

If x_{i,j} = 0 and y_{i,j} = 0, node j neither caches nor computes, so no content transmission occurs and no transmission energy is consumed. If x_{i,j} > 0 and y_{i,j} = 0, node j caches but does not compute, so the cached content is transmitted to the local node i for computation; in that case, for the task offloaded by node i, the transmission energy consumption of node j is:

E_{j,i}^{tx} = p_j · x_{i,j} · s_i / r_{j,i};

If y_{i,j} = 1, whatever the value of x_{i,j}, the content computed at node j is transmitted to the local node i for integration, and the transmission energy consumption is:

E_{j,i}^{tx} = p_j · o_{i,j} / r_{j,i},

where o_{i,j} denotes the size of the result content formed by the computation at node j.

In summary, when node i performs edge computing offloading, its transmission energy consumption is:

E_i^{tx} = Σ_j E_{j,i}^{tx};

The transmission delay of node i depends on the longest delay in the parallel transmission process:

T_i^{tx} = max_j { T_{j,i}^{tx} },

where T_{j,i}^{tx} is the delay of transmitting the corresponding content between node j and node i at rate r_{j,i};

Further, the total energy consumption of node i is composed of the computing energy consumption and the transmission energy consumption:

E_i = E_i^{comp} + E_i^{tx};

The total delay of node i is the maximum, over all nodes j that process its task, of the sum of the transmission delay and the computing delay:

T_i = max_j { T_{j,i}^{tx} + T_{i,j}^{comp} };
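The per-node energy and delay bookkeeping above can be sketched as follows; all numeric inputs are illustrative assumptions:

```python
def node_cost(tx_energies, comp_energies, tx_delays, comp_delays):
    """Total energy E_i = E_i^comp + E_i^tx; total delay T_i is the maximum,
    over the serving nodes j, of per-node transmission delay + computing delay."""
    energy = sum(comp_energies) + sum(tx_energies)
    delay = max(t + c for t, c in zip(tx_delays, comp_delays))
    return energy, delay

# Two serving nodes j: one fast but costly, one slow but cheap (values in J and s).
energy, delay = node_cost(
    tx_energies=[0.2, 0.1],
    comp_energies=[0.5, 0.3],
    tx_delays=[0.01, 0.04],
    comp_delays=[0.02, 0.05],
)
```

Note that energies add across serving nodes while delays do not: the parallel branches finish at different times, so only the slowest branch determines T_i.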

Further, in order to realize the caching and offloading optimization of computing tasks and the energy saving of terminals in the distributed network, the present invention takes the minimization of the energy consumption of each service terminal as the objective and, on the basis of the above embodiment, constructs the power terminal edge computing offloading energy consumption optimization model:

min over {x_{i,j}, y_{i,j}} of E_i^t,

where E_i^t denotes the energy consumption of the i-th power terminal when performing edge computing offloading at time t, x_{i,j} denotes the cache ratio of the task to be computed of the i-th power terminal at the j-th node, and y_{i,j} is the task offloading action, denoting whether the task to be computed of the i-th power terminal is computed at the j-th node;

The constraints are:

Σ_j x_{i,j} = 1;  formula (1)

x_{i,j} ≤ c_{i,j};  formula (2)

Σ_j y_{i,j} ≥ 1;  formula (3)

y_{i,j} = 0 when x_{i,j} = 0;  formula (4)

Σ_i y_{i,j} ≤ 1;  formula (5)

T_i ≤ T^{max};  formula (6)

x_{i,j} ∈ [0, 1], y_{i,j} ∈ {0, 1}, c_{i,j} ∈ {0, 1};  formula (7)

Σ_i x_{i,0} · s_i ≤ C^{MEC}, where node j = 0 denotes the MEC server;  formula (8)

Σ_i x_{i,j} · s_i ≤ C^{UE}, for every power terminal node j;  formula (9)

where c_{i,j} denotes the network connection status between the i-th power terminal and the j-th node, T_i denotes the delay for completing the edge computing offloading and transmission of the task to be computed of the i-th power terminal, T^{max} denotes the preset delay threshold, s_i denotes the amount of the task to be computed of the i-th power terminal that needs to be cached, C^{MEC} denotes the total cache capacity of the mobile edge computing server, and C^{UE} denotes the total cache capacity of any power terminal. Specifically, among the above constraints, formula (1) expresses that the task to be computed of the i-th power terminal is completely cached; formula (2) expresses that the task can only be cached and offloaded to nodes that have a network connection with the local node; formula (3) expresses that any power terminal i has at least one node performing task computation; formula (4) expresses that node j performs no computation when it has no cached content; formula (5) guarantees that each node processes at most one task; formula (6) expresses that the transmission delay plus processing delay of each task cannot exceed the preset delay threshold; formula (7) gives the value ranges of the constrained variables; and formulas (8) and (9) express that the cached content of all tasks cannot exceed the total cache capacities.
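A feasibility check of a candidate allocation against constraints (1)-(5), (8) and (9) can be sketched as follows; the capacities and the toy allocation are illustrative assumptions (node index 0 is taken as the MEC server):

```python
def feasible(x, y, c, s, cap_mec, cap_ue, eps=1e-9):
    """Check an allocation: x[i][j] cache ratios, y[i][j] offload decisions,
    c[i][j] connectivity, s[i] cached content sizes. Node j = 0 is the MEC server."""
    n_terminals, n_nodes = len(x), len(x[0])
    for i in range(n_terminals):
        if abs(sum(x[i]) - 1.0) > eps:                        # (1) task fully cached
            return False
        for j in range(n_nodes):
            if x[i][j] > c[i][j] + eps:                       # (2) cache only over live links
                return False
            if x[i][j] <= eps and y[i][j] == 1:               # (4) no compute without cache
                return False
        if sum(y[i]) < 1:                                     # (3) at least one compute node
            return False
    for j in range(n_nodes):
        if sum(y[i][j] for i in range(n_terminals)) > 1:      # (5) one task per node
            return False
        load = sum(x[i][j] * s[i] for i in range(n_terminals))
        if load > (cap_mec if j == 0 else cap_ue) + eps:      # (8)/(9) cache capacity
            return False
    return True

x = [[0.6, 0.4], [0.0, 1.0]]
y = [[1, 0], [0, 1]]
c = [[1, 1], [1, 1]]
ok = feasible(x, y, c, s=[10.0, 5.0], cap_mec=20.0, cap_ue=10.0)
```

The delay constraint (6) is omitted here because it depends on the transmission and computation models above rather than on the allocation variables alone.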

Finally, in order to minimize the energy consumption of each power terminal in the side-end cooperative optimization of task caching and offloading, the reward of each agent is set to the negative of the energy consumption of its corresponding terminal user, i.e., r_i = -E_i.

On the basis of the above embodiment, before the multi-agent reinforcement learning network is trained with the training sample set to obtain the power grid edge computing offloading allocation model, the method further includes:

inputting the sample network state information into a generative adversarial network and outputting a second sample observation state;

updating the training sample set according to the second sample observation state to obtain an updated training sample set;

accordingly, training the multi-agent reinforcement learning network with the training sample set to obtain the power grid edge computing offloading allocation model includes:

training the multi-agent reinforcement learning network with the updated training sample set to obtain the power grid edge computing offloading allocation model.

The main structure of a GAN comprises a generator G and a discriminator D, wherein the generator G is used to generate data whose distribution approximates the real data distribution Z, and the discriminator D attempts to distinguish whether a sample comes from the data generated by the generator G or from the real data distribution Z. To reduce the imbalance of experience learning in practical applications, each agent in the MADDPG algorithm should be able to fully learn comprehensive experience under different states, i.e., terminal task caching and offloading decisions under different network connection states, computing capabilities, and so on. Therefore, the present invention proposes a framework based on distributed GAN-MADDPG: the GAN network uses the observation states in the MADDPG experience pool (a real data set containing power network connection states and network resource information) to generate synthetic states that include extreme states; then the synthetic experience corresponding to the synthetic states (i.e., the second observation state) is input, together with the real experience (i.e., the first observation state), into the MADDPG agents for training. By using the GAN to learn extreme events and eliminate data set bias, the observation states of the agents are augmented so as to train more experienced agents, creating a multi-agent system with comprehensive experience that can efficiently formulate strategies under different environment states, with the advantages of fast convergence speed and good performance.

Specifically, each agent adds a GAN network on top of the Actor-Critic architecture of MADDPG, which is used to generate its observation state (i.e., the second sample observation state). The generated observation state, together with the corresponding action generated by the Actor network of the MADDPG network, the reward, and the observation state of the next time slot, forms a complete experience that is stored in the experience replay pool, making the stored experience more comprehensive for agent training. Therefore, the goal of the GAN, namely jointly optimizing the generator G and the discriminator D, is formulated as:

min_G max_D V(D, G) = E_{x~p_data(x)} [log D(x)] + E_{z~p_z(z)} [log(1 - D(G(z)))];
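The min-max objective above can be evaluated numerically for a batch of discriminator outputs; the sample probabilities below are illustrative assumptions:

```python
import math

def gan_value(d_real, d_fake):
    """Monte-Carlo estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]
    from discriminator outputs on real samples and on generated samples."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A well-trained D scores real network states high and synthetic states low,
# which drives V(D, G) toward its maximum over D for the current G.
v = gan_value(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
```

The discriminator ascends this value while the generator descends it, which is the min-max game described in the text.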

where V(D, G) measures the degree of difference between real samples and generated samples; the max over D means that, with the generator G fixed, the discriminator is made to distinguish as well as possible whether a sample comes from the real data or the generated data; and the min over G means that, with the discriminator D fixed, a generator G is obtained that minimizes the difference between real samples and generated samples. Through this min-max game, the generated distribution converges to and fits the real distribution, so that during agent training the GAN network simulates the environment states of the power network and augments the agents' experience with experience from different states, thereby effectively guaranteeing the optimal caching and offloading strategy in the power grid and achieving the optimization objective of terminal energy saving.

The present invention proposes an experienced-agent training mechanism based on GAN-MADDPG: each power terminal acts as an agent with the objective of minimizing its own energy consumption, and the MADDPG multi-agent reinforcement learning algorithm is used to solve for the optimal task caching and offloading decisions; then a GAN network is used to generate the observation states of the agents, and the corresponding synthetic experience effectively compensates for the uneven distribution of real experience, so that the trained, more experienced agents can efficiently and accurately give optimization strategies even when facing network and resource states never encountered before, with faster convergence and higher sample efficiency.

The grid edge computing offloading distribution system provided by the present invention is described below. The system described below and the grid edge computing offloading distribution method described above may be referred to in correspondence with each other.

FIG. 2 is a schematic structural diagram of the grid edge computing offloading distribution system provided by the present invention. As shown in FIG. 2, the system includes a power terminal network state acquisition module 201, a grid edge computing offloading distribution strategy generation module 202, and an edge computing offloading module 203. The power terminal network state acquisition module 201 is configured to acquire the network state information of each power terminal in the smart grid at the current moment. The grid edge computing offloading distribution strategy generation module 202 is configured to input the network state information corresponding to a target power terminal into the grid edge computing offloading distribution model to obtain the edge computing offloading distribution strategy for the to-be-processed computing task in the target power terminal. The edge computing offloading module 203 is configured to split the to-be-processed computing task according to the edge computing offloading distribution strategy, and to cache the split task to the corresponding power terminal and/or mobile edge computing server, so as to perform edge computing offloading on the to-be-processed computing task.
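As a sketch of what the edge computing offloading module's task-splitting step could look like in code (the data-class fields, node identifiers, and function signature are hypothetical, not taken from the disclosure):

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class OffloadDecision:
    """Output of the offloading distribution strategy for one pending task."""
    cache_ratio: Dict[str, float]  # node id -> fraction of the task cached/executed there

def split_task(task_size_mb: float, decision: OffloadDecision) -> List[Tuple[str, float]]:
    """Split a pending computing task among power terminals and/or the
    mobile edge computing (MEC) server according to the decided ratios."""
    total = sum(decision.cache_ratio.values())
    assert abs(total - 1.0) < 1e-9, "cache ratios must cover the whole task"
    return [(node, task_size_mb * ratio)
            for node, ratio in decision.cache_ratio.items()
            if ratio > 0.0]

decision = OffloadDecision(cache_ratio={"mec_server": 0.6, "terminal_2": 0.4})
parts = split_task(10.0, decision)
assert abs(dict(parts)["mec_server"] - 6.0) < 1e-9
assert abs(dict(parts)["terminal_2"] - 4.0) < 1e-9
```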

The grid edge computing offloading distribution model is obtained by training a multi-agent reinforcement learning network on sample network state information together with the task cache ratio and task offloading position corresponding to that sample network state information.

By constructing a hybrid caching and offloading framework in which mobile edge computing servers and power terminals cooperate, the grid edge computing offloading distribution system provided by the present invention uses a multi-agent reinforcement learning algorithm to make edge computing offloading distribution decisions and makes full use of the caching and computing resources of the power terminal devices, yielding a more accurate and efficient offloading distribution scheme. This resolves the resource shortages and network congestion that previously arose when multi-task requests relied solely on a mobile edge computing server for edge computing, and the short-range cooperation between terminals effectively reduces the transmission delay to distant mobile edge computing servers.

On the basis of the above embodiment, the system further includes a sample construction module, an action label marking module, an agent reward construction module, a training set generation module, and a training module. The sample construction module is configured to construct, based on the historical network state information of each power terminal, the sample network state information of the agent corresponding to each power terminal, and to construct a first sample observation state from the sample network state information. The action label marking module is configured to acquire the task cache ratio and task offloading position corresponding to the sample network state information, and to construct the action of each agent from the task cache ratio and the task offloading position. The agent reward construction module is configured to construct the reward of each agent based on the energy consumption and delay of each power terminal during edge computing offloading, with minimization of each power terminal's energy consumption as the optimization objective. The training set generation module is configured to construct a training sample set from the first sample observation state, the action, and the reward. The training module is configured to train the multi-agent reinforcement learning network with the training sample set to obtain the grid edge computing offloading distribution model.
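A minimal sketch of the agent reward described above, taking the opposite number of the terminal's energy consumption as the reward and treating the delay bound as a feasibility condition; the penalty value for infeasible decisions is an illustrative assumption:

```python
def agent_reward(compute_energy: float, transmit_energy: float,
                 delay: float, delay_threshold: float,
                 infeasible_penalty: float = -100.0) -> float:
    """Reward of one power-terminal agent for one training round.

    The reward is the negative of the terminal's total energy consumption
    (computation + transmission), so maximising reward minimises energy.
    A decision whose delay exceeds the preset threshold is penalised.
    """
    if delay > delay_threshold:
        return infeasible_penalty  # violates the delay constraint
    return -(compute_energy + transmit_energy)

assert agent_reward(1.0, 2.0, delay=0.5, delay_threshold=1.0) == -3.0
assert agent_reward(1.0, 2.0, delay=1.5, delay_threshold=1.0) == -100.0
```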

The system provided by the present invention is used to execute the above method embodiments. For the specific procedures and details, refer to those embodiments; they are not repeated here.

FIG. 3 is a schematic structural diagram of the electronic device provided by the present invention. As shown in FIG. 3, the electronic device may include a processor 301, a communications interface 302, a memory 303, and a communication bus 304, where the processor 301, the communications interface 302, and the memory 303 communicate with one another through the communication bus 304. The processor 301 can call logic instructions in the memory 303 to execute the grid edge computing offloading distribution method, which includes: acquiring the network state information of each power terminal in the smart grid at the current moment; inputting the network state information corresponding to a target power terminal into the grid edge computing offloading distribution model to obtain the edge computing offloading distribution strategy for the to-be-processed computing task in the target power terminal; and, according to the edge computing offloading distribution strategy, splitting the to-be-processed computing task and caching the split task to the corresponding power terminal and/or mobile edge computing server, so as to perform edge computing offloading on the to-be-processed computing task; wherein the grid edge computing offloading distribution model is obtained by training a multi-agent reinforcement learning network on sample network state information together with the corresponding task cache ratio and task offloading position.

In addition, the logic instructions in the memory 303 may be implemented in the form of software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions that, when executed by a computer, enable the computer to execute the grid edge computing offloading distribution method provided by the above methods, which includes: acquiring the network state information of each power terminal in the smart grid at the current moment; inputting the network state information corresponding to a target power terminal into the grid edge computing offloading distribution model to obtain the edge computing offloading distribution strategy for the to-be-processed computing task in the target power terminal; and, according to the edge computing offloading distribution strategy, splitting the to-be-processed computing task and caching the split task to the corresponding power terminal and/or mobile edge computing server, so as to perform edge computing offloading on the to-be-processed computing task; wherein the grid edge computing offloading distribution model is obtained by training a multi-agent reinforcement learning network on sample network state information together with the corresponding task cache ratio and task offloading position.

In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, it implements the grid edge computing offloading distribution method provided by the above embodiments, which includes: acquiring the network state information of each power terminal in the smart grid at the current moment; inputting the network state information corresponding to a target power terminal into the grid edge computing offloading distribution model to obtain the edge computing offloading distribution strategy for the to-be-processed computing task in the target power terminal; and, according to the edge computing offloading distribution strategy, splitting the to-be-processed computing task and caching the split task to the corresponding power terminal and/or mobile edge computing server, so as to perform edge computing offloading on the to-be-processed computing task; wherein the grid edge computing offloading distribution model is obtained by training a multi-agent reinforcement learning network on sample network state information together with the corresponding task cache ratio and task offloading position.

The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.

From the description of the above embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the above technical solutions, in essence, or the parts that contribute to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform the methods described in the various embodiments or in certain parts of the embodiments.

Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (7)

1. A grid edge computing offload distribution method is characterized by comprising the following steps:
acquiring network state information of each power terminal in the smart grid at the current moment;
inputting network state information corresponding to a target power terminal into a power grid edge calculation unloading distribution model to obtain an edge calculation unloading distribution strategy of a to-be-processed calculation task in the target power terminal;
according to the edge computing unloading distribution strategy, the computing tasks to be processed are segmented, and the segmented computing tasks to be processed are cached to the corresponding power terminal and/or the mobile edge computing server so as to carry out edge computing unloading on the computing tasks to be processed;
the power grid edge calculation unloading distribution model is obtained by training a multi-agent reinforcement learning network according to sample network state information and a task cache proportion and a task unloading position corresponding to the sample network state information;
the power grid edge calculation unloading distribution model is obtained by training through the following steps:
based on historical network state information of each power terminal, constructing sample network state information of an agent corresponding to each power terminal, and constructing a first sample observation state according to the sample network state information;
acquiring a task cache proportion and a task unloading position corresponding to the sample network state information, and constructing the action of each intelligent agent according to the task cache proportion and the task unloading position;
constructing rewards of intelligent agents by taking minimization of the energy consumption of each power terminal as an optimization target based on the energy consumption and time delay of each power terminal during edge calculation unloading;
constructing a training sample set according to the first sample observation state, the action and the reward;
training the multi-agent reinforcement learning network through the training sample set to obtain a power grid edge calculation unloading distribution model;
before the training the multi-agent reinforcement learning network through the training sample set to obtain a power grid edge computing unloading distribution model, the method further includes:
inputting the sample network state information into a generation countermeasure network, and outputting a second sample observation state;
updating the training sample set according to the observation state of the second sample to obtain an updated training sample set;
the training of the multi-agent reinforcement learning network through the training sample set to obtain the power grid edge calculation unloading distribution model comprises the following steps:
and training the multi-agent reinforcement learning network through the updated training sample set to obtain a power grid edge calculation unloading distribution model.
2. The grid edge computing offload distribution method according to claim 1, wherein constructing the reward of the agent with the energy consumption of each power terminal minimized as an optimization objective based on the energy consumption and time delay of each power terminal when performing the edge computing offload comprises:
acquiring the energy consumption of each power terminal when performing edge calculation unloading according to the calculation energy consumption and transmission energy consumption of each power terminal;
acquiring the time delay of each power terminal when performing edge calculation unloading according to the transmission time delay and the calculation time delay of each power terminal;
taking the time delay of each power terminal when performing edge calculation unloading as a constraint condition, and constructing an optimization model of the edge calculation unloading energy consumption of the power terminal by taking the energy consumption minimization of each power terminal as an optimization target;
and based on the power terminal edge computing offloading energy consumption optimization model, taking the opposite number of the energy consumption of the power terminal in each round of training as the reward of the corresponding agent.
3. The grid edge computing offloading distribution method of claim 2, wherein the power terminal edge computing offloading energy consumption optimization model is formulated as:

min over {alpha_ij^t, beta_ij^t} of E_i^t

wherein E_i^t represents the energy consumption of the ith power terminal when edge computing offloading is performed at time t; alpha_ij^t represents the cache proportion of the to-be-calculated task of the ith power terminal at the jth power terminal; and beta_ij^t is the task offloading action, representing the computation action of the to-be-calculated task of the ith power terminal on the jth power terminal;

subject to the constraint conditions:

h_ij^t ∈ {0, 1},  alpha_ij^t ∈ [0, 1],  beta_ij^t ∈ {0, 1},
T_i^t ≤ T_th,
Σ_i alpha_ij^t D_i^t ≤ C_mec,  Σ_i alpha_ij^t D_i^t ≤ C_ue,

wherein h_ij^t indicates the network connection state between the ith power terminal and the jth power terminal; T_i^t denotes the edge computing offloading and transmission delay of the to-be-calculated task of the ith power terminal; T_th denotes a preset delay threshold; D_i^t denotes the task amount that needs to be cached among the to-be-calculated tasks of the ith power terminal; C_mec denotes the total cache capacity of the mobile edge computing server; and C_ue denotes the total cache capacity of any power terminal.
4. The power grid edge computing offload distribution method according to any of claims 1 to 3, wherein the sample network state information includes a network connection state, a computing capacity, a buffering capacity, a task computation amount to be buffered and offloaded, and a task transmission amount after buffering and offloading.
5. A grid edge computing offload distribution system, comprising:
the power terminal network state acquisition module is used for acquiring network state information of each power terminal in the intelligent power grid at the current moment;
the power grid edge calculation unloading distribution strategy generation module is used for inputting network state information corresponding to a target power terminal into a power grid edge calculation unloading distribution model to obtain an edge calculation unloading distribution strategy of a calculation task to be processed in the target power terminal;
the edge computing unloading module is used for segmenting the computing tasks to be processed according to the edge computing unloading distribution strategy and caching the segmented computing tasks to be processed to the corresponding power terminal and/or the mobile edge computing server so as to perform edge computing unloading on the computing tasks to be processed;
the power grid edge calculation unloading distribution model is obtained by training a multi-agent reinforcement learning network according to sample network state information and a task cache proportion and a task unloading position corresponding to the sample network state information;
the system further comprises:
the sample construction module is used for constructing sample network state information of an intelligent agent corresponding to each power terminal based on historical network state information of each power terminal, and constructing a first sample observation state according to the sample network state information;
the action construction module is used for acquiring a task cache proportion and a task unloading position corresponding to the sample network state information and constructing the action of each intelligent agent according to the task cache proportion and the task unloading position;
the intelligent agent reward building module is used for building the reward of the intelligent agent based on the energy consumption and time delay of each power terminal during the edge calculation unloading and with the energy consumption minimization of each power terminal as an optimization target;
the training set generating module is used for constructing a training sample set according to the first sample observation state, the action and the reward;
the training module is used for training the multi-agent reinforcement learning network through the training sample set to obtain a power grid edge calculation unloading distribution model;
the system is further configured to:
inputting the sample network state information into a generation countermeasure network, and outputting a second sample observation state;
updating the training sample set according to the observation state of the second sample to obtain an updated training sample set;
the training module is further configured to:
and training the multi-agent reinforcement learning network through the updated training sample set to obtain a power grid edge calculation unloading distribution model.
6. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the computer program implements the steps of the grid edge computing offload distribution method according to any of claims 1 to 4.
7. A non-transitory computer readable storage medium, having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the steps of the grid edge computing offload distribution method according to any of claims 1 to 4.
CN202210255851.5A | 2022-03-16 (priority) | 2022-03-16 (filed) | Power grid edge calculation unloading distribution method and system | Active | CN114340016B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202210255851.5A | 2022-03-16 | 2022-03-16 | Power grid edge calculation unloading distribution method and system


Publications (2)

Publication Number | Publication Date
CN114340016A (en) | 2022-04-12
CN114340016B | 2022-07-26

Family

Family ID: 81033893

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202210255851.5A (Active; granted as CN114340016B (en)) | Power grid edge calculation unloading distribution method and system | 2022-03-16 | 2022-03-16

Country Status (1)

Country | Link
CN (1) | CN114340016B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN114928653B (en)* | 2022-04-19 | 2024-02-06 | 西北工业大学 | Data processing methods and devices for crowd intelligence sensing
CN114860416B (en)* | 2022-06-06 | 2024-04-09 | 清华大学 | Distributed multi-agent detection task allocation method and device in countermeasure scene
CN115226127B (en)* | 2022-06-13 | 2025-09-16 | 北京邮电大学 | Emergency disaster condition detection method and device
CN115396955A (en)* | 2022-08-24 | 2022-11-25 | 广西电网有限责任公司 | A resource allocation method and device based on deep reinforcement learning algorithm
CN115551105B (en)* | 2022-09-15 | 2023-08-25 | 公诚管理咨询有限公司 | Task scheduling method, device and storage medium based on 5G network edge calculation
CN116634388B (en)* | 2023-07-26 | 2023-10-13 | 国网冀北电力有限公司 | Big data edge caching and resource scheduling method and system for power convergence network
CN116647880B (en)* | 2023-07-26 | 2023-10-13 | 国网冀北电力有限公司 | Base station collaborative edge computing offloading method and device for differentiated power services
CN117499491B (en)* | 2023-12-27 | 2024-03-26 | 杭州海康威视数字技术股份有限公司 | Internet of things service arrangement method and device based on double-agent deep reinforcement learning

Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN110971706A (en)* | 2019-12-17 | 2020-04-07 | 大连理工大学 | Approximate optimization and reinforcement learning-based task unloading method in MEC
CN111124647A (en)* | 2019-12-25 | 2020-05-08 | 大连理工大学 | Intelligent edge calculation method in Internet of vehicles
CN113225377A (en)* | 2021-03-30 | 2021-08-06 | 北京中电飞华通信有限公司 | Internet of things edge task unloading method and device
CN113612843A (en)* | 2021-08-02 | 2021-11-05 | 吉林大学 | A task offloading and resource allocation method for MEC based on deep reinforcement learning
CN113950066A (en)* | 2021-09-10 | 2022-01-18 | 西安电子科技大学 | Method, system and device for offloading partial computing on a single server in a mobile edge environment

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Zhipeng Cheng et al., "Joint Task Offloading and Resource Allocation for Mobile Edge Computing in Ultra-Dense Network," GLOBECOM 2020 - 2020 IEEE Global Communications Conference, 2021-01-25.*
Zhipeng Cheng et al., "Multiagent DDPG-Based Joint Task Partitioning and Power Control in Fog Computing Networks," IEEE Internet of Things Journal, vol. 9, no. 1, 2022-01-01.*
Zhao Runhui et al., "MADDPG-based task offloading and resource management in edge networks" (基于MADDPG的边缘网络任务卸载与资源管理), Communications Technology (通信技术), vol. 54, no. 4, April 2021.*
Liu Baoju et al., "Load-balancing-oriented routing reconstruction in power SDN communication networks" (电力SDN通信网中面向负载均衡的路由重构), Journal of Beijing University of Posts and Telecommunications (北京邮电大学学报), vol. 43, no. 2, April 2020.*

Also Published As

Publication number | Publication date
CN114340016A (en) | 2022-04-12

Similar Documents

Publication | Title
CN114340016B (en) | Power grid edge calculation unloading distribution method and system
CN113950066B (en) | Method, system, and device for offloading part of computing from single server in mobile edge environment
CN112860350B (en) | A computing offload method based on task cache in edge computing
Lu et al. | Optimization of lightweight task offloading strategy for mobile edge computing based on deep reinforcement learning
CN113950103A (en) | Multi-server complete computing unloading method and system under mobile edge environment
WO2024174426A1 (en) | Task offloading and resource allocation method based on mobile edge computing
CN111835827A (en) | IoT edge computing task offloading method and system
CN113010282B (en) | Edge cloud collaborative serial task unloading method based on deep reinforcement learning
CN111405569A (en) | Method and device for computing offloading and resource allocation based on deep reinforcement learning
CN111405568A (en) | Method and device for computing offloading and resource allocation based on Q-learning
CN113626104B (en) | Multi-objective optimization offloading strategy based on deep reinforcement learning under edge cloud architecture
CN111565380B (en) | Hybrid offloading method based on NOMA-MEC in the Internet of Vehicles
CN116321293A (en) | Edge Computing Offloading and Resource Allocation Method Based on Multi-agent Reinforcement Learning
CN114205353B (en) | A Computational Offloading Method Based on Hybrid Action Space Reinforcement Learning Algorithm
CN116541106B (en) | Calculation task offloading method, computing device and storage medium
CN117857559B (en) | Metropolitan area optical network task unloading method based on average field game and edge server
CN112689296A (en) | Edge calculation and cache method and system in heterogeneous IoT network
Hu et al. | Dynamic task offloading in MEC-enabled IoT networks: A hybrid DDPG-D3QN approach
CN116600343A (en) | A Quality of Service Optimization Method for Allocating Spectrum Resources in Mobile Edge Computing
CN119127414A (en) | A new energy power optimization dispatching method and system based on digital twin platform
CN113900779A (en) | Task execution method, device, electronic device and storage medium
CN116467005A (en) | Distributed task offloading method, device and storage medium based on reinforcement learning
Zhang et al. | Federated deep reinforcement learning for multimedia task offloading and resource allocation in MEC networks
CN113452625A (en) | Deep reinforcement learning-based unloading scheduling and resource allocation method
CN118175158A (en) | Multi-layer collaborative task unloading optimization method crossing local edge and cloud resource

Legal Events

Date | Code | Title | Description
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |
