Disclosure of Invention
In view of the foregoing, it is necessary to provide a satellite-ground fusion network multi-task offloading method, device, storage medium and computer device.
The invention adopts the following technical scheme:
The invention provides a satellite-ground fusion network multi-task offloading method, applied to a satellite-ground fusion network comprising a plurality of ground users, a plurality of low-orbit satellites and a satellite cloud platform, wherein each ground user at least partially offloads locally generated multi-modal tasks to the low-orbit satellites, the satellite cloud platform assists task offloading between the ground users and the low-orbit satellites, and the multi-modal tasks comprise computation-intensive tasks, delay-sensitive tasks and privacy-preserving tasks, the method comprising the following steps:
determining, according to the path loss between each ground user and each low-orbit satellite and the transmit power of each ground user, the time-varying channel gain and the uplink transmission rate expression for each ground user offloading to a low-orbit satellite under the dynamic low-orbit satellite orbit;
determining, with the task offloading ratio of each ground user as a variable, the data transmission rate expression of the multi-modal tasks of each ground user according to the CPU cycle frequency at which each ground user processes tasks locally and the uplink transmission rate expression for offloading to the low-orbit satellite;
for computation-intensive tasks, adopting a Markov decision process through the satellite cloud platform, determining in a centralized manner the actions taken in each network state according to the network state of the satellite-ground fusion network, and optimizing the actions with the objective of maximizing the data transmission rate corresponding to the actions, wherein the actions comprise the CPU cycle frequency, the transmit power and the task offloading ratio of the ground users;
for delay-sensitive tasks, adopting a distributed multi-agent deep deterministic policy gradient (MADDPG) based on the agents deployed on each ground user, taking the local network environment faced by each ground user as that user's network state, determining rewards according to the data transmission rate after the actions are taken, and optimizing the actions of each ground user in a distributed manner; and
for privacy-preserving tasks, quantizing the local model with which each ground user processes the privacy-preserving tasks, uploading the quantized local model parameters to the satellite cloud platform through the low-orbit satellites, and having the satellite cloud platform aggregate the quantized local model parameters through federated learning and update the local model of each ground user.
The invention provides a satellite-ground fusion network multi-task offloading device, wherein:
the device is applied to a satellite-ground fusion network formed by a plurality of ground users, a plurality of low-orbit satellites and a satellite cloud platform, each ground user at least partially offloads locally generated multi-modal tasks to the low-orbit satellites, the satellite cloud platform assists task offloading between the ground users and the low-orbit satellites, and the multi-modal tasks comprise computation-intensive tasks, delay-sensitive tasks and privacy-preserving tasks;
the offloading rate determining module is configured to determine, according to the path loss between each ground user and each low-orbit satellite and the transmit power of each ground user, the time-varying channel gain and the uplink transmission rate expression for each ground user offloading to a low-orbit satellite under the dynamic low-orbit satellite orbit;
the transmission rate determining module is configured to determine, with the task offloading ratio of each ground user as a variable, the data transmission rate expression of the multi-modal tasks of each ground user according to the CPU cycle frequency at which each ground user processes tasks locally and the uplink transmission rate expression for offloading to the low-orbit satellite;
the intensive task processing module is configured to, for computation-intensive tasks, adopt a Markov decision process through the satellite cloud platform, determine in a centralized manner the actions taken in each network state according to the network state of the satellite-ground fusion network, and optimize the actions with the objective of maximizing the data transmission rate corresponding to the actions, wherein the actions comprise the CPU cycle frequency, the transmit power and the task offloading ratio of the ground users;
the delay task processing module is configured to, for delay-sensitive tasks, adopt a distributed multi-agent deep deterministic policy gradient based on the agents deployed on each ground user, take the local network environment faced by each ground user as that user's network state, determine rewards according to the data transmission rate after the actions are taken, and optimize the actions of each ground user in a distributed manner;
the privacy task processing module is configured to, for privacy-preserving tasks, quantize the local model with which each ground user processes the privacy-preserving tasks, upload the quantized local model parameters to the satellite cloud platform through the low-orbit satellites, and have the satellite cloud platform aggregate the quantized local model parameters through federated learning and update the local model of each ground user.
The invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above satellite-ground fusion network multi-task offloading method.
The invention provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the above satellite-ground fusion network multi-task offloading method when executing the program.
The at least one technical scheme adopted by the invention can achieve the following beneficial effects:
According to the method, the time-varying channel gain and the uplink transmission rate of each ground user offloading to a low-orbit satellite under the dynamic low-orbit satellite orbit are first determined from the path loss between each ground user and each low-orbit satellite and the transmit power of each ground user. Then, with the task offloading ratio of each ground user as a variable, the data transmission rate expression of each ground user's multi-modal tasks is constructed. Based on the differing requirements of the multi-modal tasks, the data transmission rate is optimized in different ways by three multi-modal learning methods: computation-intensive tasks are processed by a centralized Actor-Critic algorithm to improve the data transmission rate globally; delay-sensitive tasks are processed by a distributed multi-agent deep deterministic policy gradient, which reduces network delay through distributed local optimization; and privacy-preserving tasks are processed by a quantized federated learning algorithm to protect private data. The differing requirements of the multi-modal tasks are thereby met and better resource scheduling is achieved.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to specific embodiments of the present invention and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
At present, the satellite-ground fusion network is a promising network architecture that can help relieve the load on terrestrial networks and provide massive access capability and intensive task offloading. However, because of its hierarchical, heterogeneous and dynamic three-dimensional characteristics, conventional resource management methods are difficult to apply directly. Moreover, the large volume of multi-modal, multi-task information degrades the quality of service of the satellite-ground network.
Therefore, the invention designs a multi-task fusion computation offloading model to process multi-modal network information, including the time-varying channel gain and the dynamic low-orbit satellite orbit, which can effectively improve the data transmission rate and the level of privacy protection. Further, the invention proposes three multi-modal learning methods, namely a centralized Actor-Critic algorithm, a distributed multi-agent deep deterministic policy gradient, and a quantized federated learning algorithm, to handle computation-intensive, delay-sensitive and privacy-preserving tasks respectively, and thereby further optimize the local-execution or low-orbit (LEO) satellite offloading ratio, the CPU cycle frequency and the transmit power.
The following describes in detail the technical solutions provided by the embodiments of the present invention with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of the satellite-ground fusion network multi-task offloading method of the present invention, which specifically includes the following steps:
S101, according to the path loss between each ground user and each low-orbit satellite and the transmit power of each ground user, determining the time-varying channel gain and the uplink transmission rate expression for each ground user offloading to a low-orbit satellite under the dynamic low-orbit satellite orbit.
S102, with the task offloading ratio of each ground user as a variable, determining the data transmission rate expression of the multi-modal tasks of each ground user according to the CPU cycle frequency at which each ground user processes tasks locally and the uplink transmission rate expression for offloading to the low-orbit satellite.
S103, for computation-intensive tasks, adopting a Markov decision process through the satellite cloud platform, determining in a centralized manner the actions taken in each network state according to the network state of the satellite-ground fusion network, and optimizing the actions with the objective of maximizing the data transmission rate corresponding to the actions, wherein the actions comprise the CPU cycle frequency, the transmit power and the task offloading ratio of the ground users.
S104, for delay-sensitive tasks, adopting a distributed multi-agent deep deterministic policy gradient based on the agents deployed on each ground user, taking the local network environment faced by each ground user as that user's network state, determining rewards according to the data transmission rate after the actions are taken, and optimizing the actions of each ground user in a distributed manner.
S105, for privacy-preserving tasks, quantizing the local model with which each ground user processes the privacy-preserving tasks, uploading the quantized local model parameters to the satellite cloud platform through the low-orbit satellites, and having the satellite cloud platform aggregate the quantized local model parameters through federated learning and update the local model of each ground user.
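By way of illustration only, the correspondence between the three task types and steps S103 to S105 can be sketched as a simple dispatch table. The names `TaskType`, `HANDLERS` and `route` are hypothetical and are introduced solely for this example; they are not part of the described method.

```python
from enum import Enum


class TaskType(Enum):
    """The three multi-modal task types named in the method."""
    COMPUTE_INTENSIVE = "compute_intensive"
    DELAY_SENSITIVE = "delay_sensitive"
    PRIVACY_PRESERVING = "privacy_preserving"


# Hypothetical mapping from task type to the step that handles it.
HANDLERS = {
    TaskType.COMPUTE_INTENSIVE: "S103: centralized Actor-Critic on the satellite cloud platform",
    TaskType.DELAY_SENSITIVE: "S104: distributed multi-agent DDPG on the ground users",
    TaskType.PRIVACY_PRESERVING: "S105: quantized federated learning",
}


def route(task_type: TaskType) -> str:
    """Return a description of the step that processes the given task type."""
    return HANDLERS[task_type]
```

Steps S101 and S102 are shared by all three branches, since each branch optimizes the same data transmission rate expression.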
Fig. 2 is a schematic diagram of the multi-modal, multi-task scenario of the satellite-ground fusion network in the present invention. As can be seen from Fig. 2, a plurality of low-orbit satellites, the satellite cloud platform and the ground users form a multi-task processing system that supports effective task arrival, offloading and processing. Each ground user may at least partially offload its locally generated multi-modal tasks, which may include computation-intensive tasks, delay-sensitive tasks and privacy-preserving tasks, to the low-orbit satellites, and the satellite cloud platform assists task offloading between the ground users and the low-orbit satellites.
Specifically, the low-orbit satellites provide global coverage for the ground users, and each ground user's tasks are offloaded proportionally to multiple low-orbit satellites to meet given performance metrics. The low-orbit satellites form a defined satellite set. Further, the corresponding satellite cloud platform helps handle the ground-user demands received by the low-orbit satellite platform and then trains the neural network parameters to accelerate model convergence. The total duration is divided into time slots, which form a defined time-slot set. Meanwhile, the multi-modal network information is represented by the channel gains between the satellites and the ground users, the dynamic low-orbit satellite orbital positions, and the complex task types. According to the corresponding task requirements, the task types are classified as computation-intensive, delay-sensitive and privacy-preserving, and the ground users likewise form a defined ground-user set.
In the satellite-ground fusion network, each ground user i is assumed to generate a computation-intensive task, a delay-sensitive task or a privacy-preserving task, depending on the value of its task-generation variable. Based on this task-generation information, the path loss between low-orbit satellite j and ground user i is defined as:
where the symbols denote, in order: the path loss between ground user i and low-orbit satellite j; the carrier frequency; the speed of light; the horizontal distance between ground user i and low-orbit satellite j; the vertical distance between ground user i and low-orbit satellite j; the additive path loss of the line-of-sight link between ground user i and low-orbit satellite j; and the additive path loss of the non-line-of-sight link between ground user i and low-orbit satellite j. The remaining term relates to the line-of-sight link and is expressed as:
where the two coefficients are constant parameters determined by the dynamic low-orbit satellite. Thus, the uplink transmission rate between ground user i and low-orbit satellite j is expressed as:
where the symbols denote, in order: the transmit power from ground user i to low-orbit satellite j, the transmission bandwidth, and the channel noise power.
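The sketch below illustrates how the quantities defined above could be combined numerically. It assumes the widely used probabilistic line-of-sight path-loss model and a Shannon-type rate in which the channel gain is the inverse of the path loss; these functional forms, as well as all function and parameter names, are assumptions made for illustration and may differ from the expressions of the invention.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def los_probability(d_h, d_v, a, b):
    """Assumed sigmoid line-of-sight probability in the elevation angle (degrees).

    d_h, d_v: horizontal and vertical user-satellite distances in metres.
    a, b: the two constant parameters of the model.
    """
    theta = math.degrees(math.atan2(d_v, d_h))
    return 1.0 / (1.0 + a * math.exp(-b * (theta - a)))


def path_loss_db(d_h, d_v, f_c, eta_los, eta_nlos, a, b):
    """Free-space loss plus probability-weighted LoS/NLoS additive losses, in dB."""
    dist = math.hypot(d_h, d_v)
    fspl = 20.0 * math.log10(4.0 * math.pi * f_c * dist / C_LIGHT)
    p_los = los_probability(d_h, d_v, a, b)
    return fspl + p_los * eta_los + (1.0 - p_los) * eta_nlos


def uplink_rate(p_tx, bandwidth, noise_power, loss_db):
    """Shannon-type uplink rate in bit/s for transmit power p_tx (W)."""
    gain = 10.0 ** (-loss_db / 10.0)  # assumed: channel gain = inverse path loss
    return bandwidth * math.log2(1.0 + p_tx * gain / noise_power)
```

Because the satellite position changes from slot to slot, `d_h` and `d_v` would be re-evaluated in each time slot, which is what makes the resulting channel gain time-varying.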
The number of bits processed locally by a ground user is expressed as:
where the symbols denote, in order: the CPU cycle frequency at which each ground user processes tasks locally, the interval between adjacent time slots, and the number of CPU cycles each ground user requires to process 1 bit of a task.
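As a minimal numerical sketch of the local-processing quantity, the function below assumes that the number of locally processed bits equals the CPU cycle frequency multiplied by the slot interval and divided by the cycles required per bit, which is the form suggested by the units of the three quantities. The function and parameter names are hypothetical.

```python
def local_bits(cpu_freq_hz, slot_s, cycles_per_bit):
    """Bits a ground user can process locally in one time slot.

    Assumed form: (cycles/s * s) / (cycles/bit) = bits.
    """
    if cycles_per_bit <= 0:
        raise ValueError("cycles_per_bit must be positive")
    return cpu_freq_hz * slot_s / cycles_per_bit
```

For example, a 1 GHz CPU, a 0.1 s slot and 1000 cycles per bit give about 100,000 bits per slot under this assumption.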
For each ground user i, the invention aims to achieve optimal task scheduling in the satellite-ground fusion network so as to maximize the number of bits processed. Furthermore, the invention defines a task offloading proportion, whose two components represent the local-execution ratio and the remote low-orbit satellite offloading ratio, respectively. Computation-intensive tasks require high-speed data transmission because of their larger task volume. Delay-sensitive tasks are highly sensitive to network delay and therefore require the low-orbit satellites to cooperate in offloading tasks and reducing network delay. Privacy-preserving tasks come from users unwilling to share private data and therefore require an offloading method based on federated learning to protect that data. Thus, the data transmission rate for the above tasks can be expressed as:
The joint optimization problem for the three tasks described above is expressed as:
where the local-execution ratio and the remote low-orbit satellite offloading ratio should each be less than 1, the local CPU cycle frequency should be smaller than its upper limit, the transmit power should be smaller than its upper limit, and I is the total number of ground users.
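The constraints of the joint optimization problem can be illustrated with a small feasibility check. This is a sketch under stated assumptions: the local-execution ratio is taken to be one minus the offloading ratio, the bounds are treated as inclusive, and the names `Decision`, `is_feasible`, `project`, `f_max` and `p_max` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """Decision variables of one ground user in one time slot."""
    offload_ratio: float  # share of the task offloaded to the satellite
    cpu_freq_hz: float    # local CPU cycle frequency
    tx_power_w: float     # uplink transmit power


def is_feasible(d, f_max, p_max):
    """Check the ratio, CPU-frequency and transmit-power bounds."""
    return (0.0 <= d.offload_ratio <= 1.0
            and 0.0 <= d.cpu_freq_hz <= f_max
            and 0.0 <= d.tx_power_w <= p_max)


def project(d, f_max, p_max):
    """Clip an arbitrary decision onto the feasible box."""
    def clip(x, lo, hi):
        return min(max(x, lo), hi)
    return Decision(clip(d.offload_ratio, 0.0, 1.0),
                    clip(d.cpu_freq_hz, 0.0, f_max),
                    clip(d.tx_power_w, 0.0, p_max))
```

A projection of this kind is one common way to keep the continuous outputs of a learned policy inside the feasible region; the invention does not state which mechanism it uses.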
FIG. 3 is a schematic diagram of the centralized Actor-Critic network architecture of the present invention. As shown in FIG. 3, because computation-intensive tasks are insensitive to network delay, the invention proposes a centralized closed-loop network architecture that adapts to dynamic network conditions such as the channel gain, the dynamic low-orbit satellite orbital position and the complex task types. Since the decision set includes both discrete and continuous variables, the proposed learning framework must handle multiple task types. However, the conventional DQN (deep Q-network) method can only handle discrete decisions, which makes it difficult to optimize continuous task policies. Therefore, the invention introduces policy-based offloading decision and resource scheduling methods. Specifically, the invention models the closed-loop network architecture as a Markov decision process, in which the network state includes the channel gain, the dynamic low-orbit satellite position and the complex task set. The global state set is expressed in terms of these quantities, where the task-related symbol denotes the task-generation variable of the i-th ground user.
Next, the action set contains a plurality of execution actions, namely the CPU cycle frequency, the transmit power, and the local-execution and low-orbit satellite offloading ratios, and is expressed as:
where the action set can be divided into an offloading-ratio part and a resource-scheduling part. When the satellite cloud platform interacts with the network environment, a corresponding reward is generated, which is expressed as:
Based on the predefined state, action and reward functions described above, a state-action pair is input to the critic neural network, which then generates the corresponding Q function. The temporal-difference error is thus expressed as:
where the coefficient in the expression is a discount factor. Next, the invention updates the critic network parameters with a mean-squared loss function, which is expressed as:
where the step size is the learning rate of the critic network. The invention likewise updates the actor neural network parameters, which are expressed as:
where the step size is the learning rate of the actor neural network and m is the number of training samples selected.
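The temporal-difference error and the two parameter updates can be sketched for a single training sample as follows. To keep the example self-contained it replaces the neural networks with a linear critic and a Gaussian policy with a linear mean; these simplifications, the standard one-step temporal-difference form, and all names are assumptions made for illustration, not the networks of the invention.

```python
import numpy as np


def td_error(reward, q_sa, q_next, gamma):
    """One-step temporal-difference error: r + gamma * Q(s', a') - Q(s, a)."""
    return reward + gamma * q_next - q_sa


def critic_step(w, features, reward, next_features, gamma, lr):
    """Semi-gradient step on 0.5 * delta**2 for a linear critic Q = w . phi."""
    delta = td_error(reward, w @ features, w @ next_features, gamma)
    return w + lr * delta * features, delta


def actor_step(theta, features, action, sigma, delta, lr):
    """Policy-gradient step for a Gaussian policy with mean theta . phi.

    The TD error delta is used as the advantage estimate.
    """
    mu = theta @ features
    grad_log_pi = (action - mu) / sigma ** 2 * features
    return theta + lr * delta * grad_log_pi
```

With m selected samples, the same updates would be averaged over the batch before being applied.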
Fig. 4 is a schematic diagram of the distributed multi-agent deep deterministic policy gradient framework of the present invention. As shown in Fig. 4, for delay-sensitive tasks the framework includes a current actor network, a target actor network, a current critic network and a target critic network. First, each agent interacts with the dynamic network environment, collects its private state, and inputs that state to the current actor network, whose policy generates a corresponding action. The agent then applies the action to the dynamic network environment, obtains a reward, and transitions to the next state. Meanwhile, based on the storage queue obtained by each agent, the invention collects a global storage queue containing the network states of, and the actions taken by, all agents up to the I-th agent. Because the agents exchange information through the reward function, a corresponding relation holds for their rewards. Thus, the temporal-difference update of the current critic network is expressed as:
where the first pair of parameters are the model parameters of the target critic network and the target actor network, and the second pair are the model parameters of the current critic network and the current actor network. Further, the remaining coefficient is a discount factor, and the gradient update of the current critic network is expressed as:
where the step size is the learning rate of the gradient update of the current critic network. Since the current actor network follows a deterministic policy, its loss gradient is expressed as:
To simplify the parameter update, the invention treats a larger Q value as corresponding to a smaller loss. The loss is therefore obtained simply by negating the Q value, which is expressed as:
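The critic target, the critic loss and the negated-Q actor loss can be sketched on plain arrays as below. The sketch assumes the standard deep deterministic policy gradient forms (target equals reward plus discounted target-critic value, mean-squared critic loss, actor loss equal to the negative mean Q value); the Q values are passed in as numbers rather than computed by networks, and all names are hypothetical.

```python
import numpy as np


def critic_target(rewards, next_q_target, gamma):
    """Target value: r + gamma * Q_target(s', mu_target(s'))."""
    return rewards + gamma * next_q_target


def critic_loss(q_current, targets):
    """Mean-squared temporal-difference loss of the current critic."""
    return float(np.mean((targets - q_current) ** 2))


def actor_loss(q_of_policy_actions):
    """Negated mean Q value: a larger Q gives a smaller loss."""
    return float(-np.mean(q_of_policy_actions))
```

Minimizing `actor_loss` by gradient descent is equivalent to ascending the Q value with respect to the actor parameters, which is why the negation suffices.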
Finally, based on the current actor network policy and the current critic network parameters described above, the target critic network parameters and the target actor network parameters are defined as:
where the two coefficients are the corresponding parameter update rates and the remaining symbol denotes the target actor network policy.
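A minimal sketch of the target-network update follows, assuming the common soft (Polyak) form in which each target parameter moves a fraction `tau` of the way toward the corresponding current parameter. The function name and the list-of-arrays parameter layout are hypothetical.

```python
import numpy as np


def soft_update(target_params, current_params, tau):
    """Return new target parameters: tau * current + (1 - tau) * target."""
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    return [tau * c + (1.0 - tau) * t
            for t, c in zip(target_params, current_params)]
```

The same function would be called once for the critic with its update rate and once for the actor with its own rate; a small rate keeps the targets slowly varying, which stabilizes the temporal-difference update.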
FIG. 5 is a schematic diagram of the quantized federated learning framework of the present invention. For privacy-preserving tasks, multiple ground users are reluctant to share private data, so a new learning framework is needed to protect user privacy. As shown in FIG. 5, the invention proposes a quantized federated learning framework that transmits model parameters while protecting data privacy. When each ground user receives its tasks, its task set is defined in terms of the total number of task blocks of that ground user. The global task set of all ground users is defined accordingly. Based on the task sets, the model loss of federated learning is:
where the model parameter is a d-dimensional vector and the per-sample term is the model's computed loss. Further, the invention divides the learning process into T rounds, in each of which every ground user receives the global weight parameters from the remote satellite cloud platform. Thus, for any iteration round, the transmitted weight parameters can be expressed as:
where the symbols denote, in order: the initial parameters of the remote server on the satellite cloud platform, the model parameters after the given number of local update steps, and the learning rate of the gradient update. Each ground user then transmits its updated weight parameters to the satellite cloud platform, which is expressed as:
where the symbols denote the corresponding task weight ratio and the weight parameters at the given round.
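One round of the federated procedure can be sketched as below, assuming plain gradient-descent local steps and an aggregation weighted by each ground user's share of task blocks (the usual federated-averaging form). The gradient function, the step counts and all names are hypothetical placeholders.

```python
import numpy as np


def local_update(global_w, grad_fn, lr, steps):
    """Run a few gradient-descent steps starting from the global weights."""
    w = np.array(global_w, dtype=float)
    for _ in range(steps):
        w = w - lr * grad_fn(w)
    return w


def aggregate(local_ws, task_counts):
    """Weighted average of local weights; weights are task-block shares."""
    counts = np.asarray(task_counts, dtype=float)
    ratios = counts / counts.sum()
    return np.tensordot(ratios, np.stack(local_ws), axes=1)
```

Only the weight vectors cross the uplink in this scheme; the task data used inside `grad_fn` never leaves the ground user, which is the privacy property the framework relies on.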
Transmitting a large number of model parameters to the satellite cloud platform consumes considerable storage space and energy, and floating-point operations incur longer computation delays than integer operations. The invention therefore introduces a quantization strategy to accelerate model convergence. Specifically, the weight parameters transmitted by each ground user are:
where the invention designs a quantization function that is applied to the parameters sent to the remote server. First, the invention sorts the weight parameters and determines their minimum and maximum values. Given the number of quantization bits, the quantization interval is:
where the parameter sequence is defined accordingly. Based on this parameter sequence, each sequence index is expressed as:
When a given weight parameter falls within an interval of the parameter sequence, the quantization function is expressed as:
where sgn is the sign function and w.p. denotes "with probability". The quantized model parameters are expressed accordingly. Since each weight parameter is quantized with the given number of bits, the total number of quantized bits follows from the number of parameters, and when the model parameters are transmitted to the remote server, the conversion is calculated as:
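The quantization step can be illustrated with the following sketch. It assumes a uniform grid between the minimum and maximum weight magnitudes with 2^bits - 1 intervals, stochastic rounding to the neighbouring grid point with probability proportional to proximity, and restoration of the sign by the sign function; it also assumes the payload is simply the number of parameters times the number of bits. These choices and all names are illustrative assumptions rather than the exact functions of the invention.

```python
import numpy as np


def quantize(weights, bits, rng):
    """Stochastically round |w| onto a uniform grid and restore the sign."""
    w = np.asarray(weights, dtype=float)
    mag = np.abs(w)
    lo, hi = mag.min(), mag.max()
    levels = 2 ** bits - 1          # number of quantization intervals
    if hi == lo:                    # degenerate case: nothing to quantize
        return w.copy()
    step = (hi - lo) / levels       # quantization interval
    pos = (mag - lo) / step         # fractional grid position
    lower = np.floor(pos)
    prob_up = pos - lower           # probability of rounding up
    idx = lower + (rng.random(w.shape) < prob_up)
    idx = np.minimum(idx, levels)   # guard against floating-point overshoot
    return np.sign(w) * (lo + idx * step)


def payload_bits(num_params, bits):
    """Assumed payload size, ignoring any overhead for the min/max values."""
    return num_params * bits
```

Rounding up with probability equal to the fractional position makes the quantized value equal to the original in expectation, so the aggregation on the satellite cloud platform is not systematically biased by the compression. A generator such as `np.random.default_rng(0)` can be passed as `rng` for reproducible results.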
In the satellite-ground fusion network multi-task offloading method shown in Fig. 1, the uplink transmission rate of each ground user offloading to a low-orbit satellite is first determined from the path loss between each ground user and each low-orbit satellite and the transmit power of each ground user. Then, with the task offloading ratio of each ground user as a variable, a communication model of the data transmission rate of each ground user's multi-modal tasks is constructed. Based on this communication model, three multi-modal learning methods are provided: computation-intensive tasks are processed by a centralized Actor-Critic algorithm to improve the data transmission rate, delay-sensitive tasks are processed by a distributed multi-agent deep deterministic policy gradient to reduce network delay, and privacy-preserving tasks are processed by a quantized federated learning algorithm to protect private data. The differing requirements of the multi-modal tasks are thereby met and better resource scheduling is achieved.
The invention designs a multi-task fusion computation offloading model to process multi-modal network information, including the time-varying channel gain and the dynamic low-orbit satellite orbit, which can effectively improve the data transmission rate and the level of privacy protection. Three multi-modal learning methods are proposed, namely a centralized Actor-Critic algorithm, a distributed multi-agent deep deterministic policy gradient, and a quantized federated learning algorithm, to handle computation-intensive, delay-sensitive and privacy-preserving tasks respectively, which can further optimize the local-execution or low-orbit (LEO) satellite offloading ratio, the CPU cycle frequency and the transmit power.
When the satellite-ground fusion network multi-task offloading method provided by the invention is applied, the steps need not be executed in the order shown in Fig. 1; the specific execution order may be determined as needed, and the invention is not limited in this respect.
The above is the satellite-ground fusion network multi-task offloading method provided by one or more embodiments of the present invention. Based on the same idea, the invention further provides a corresponding satellite-ground fusion network multi-task offloading device, wherein:
the device is applied to a satellite-ground fusion network formed by a plurality of ground users, a plurality of low-orbit satellites and a satellite cloud platform, each ground user at least partially offloads locally generated multi-modal tasks to the low-orbit satellites, the satellite cloud platform assists task offloading between the ground users and the low-orbit satellites, and the multi-modal tasks comprise computation-intensive tasks, delay-sensitive tasks and privacy-preserving tasks;
the offloading rate determining module is configured to determine, according to the path loss between each ground user and each low-orbit satellite and the transmit power of each ground user, the time-varying channel gain and the uplink transmission rate expression for each ground user offloading to a low-orbit satellite under the dynamic low-orbit satellite orbit;
the transmission rate determining module is configured to determine, with the task offloading ratio of each ground user as a variable, the data transmission rate expression of the multi-modal tasks of each ground user according to the CPU cycle frequency at which each ground user processes tasks locally and the uplink transmission rate expression for offloading to the low-orbit satellite;
the intensive task processing module is configured to, for computation-intensive tasks, adopt a Markov decision process through the satellite cloud platform, determine in a centralized manner the actions taken in each network state according to the network state of the satellite-ground fusion network, and optimize the actions with the objective of maximizing the data transmission rate corresponding to the actions, wherein the actions comprise the CPU cycle frequency, the transmit power and the task offloading ratio of the ground users;
the delay task processing module is configured to, for delay-sensitive tasks, adopt a distributed multi-agent deep deterministic policy gradient based on the agents deployed on each ground user, take the local network environment faced by each ground user as that user's network state, determine rewards according to the data transmission rate after the actions are taken, and optimize the actions of each ground user in a distributed manner;
the privacy task processing module is configured to, for privacy-preserving tasks, quantize the local model with which each ground user processes the privacy-preserving tasks, upload the quantized local model parameters to the satellite cloud platform through the low-orbit satellites, and have the satellite cloud platform aggregate the quantized local model parameters through federated learning and update the local model of each ground user.
For the specific limitations of the satellite-ground fusion network multi-task offloading device, reference may be made to the limitations of the satellite-ground fusion network multi-task offloading method above, which are not repeated here. Each module of the device may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.
The present invention also provides a computer-readable storage medium storing a computer program operable to perform the satellite-ground fusion network multi-task offloading method shown in Fig. 1.
The invention also provides a computer device, which comprises a processor, an internal bus, a network interface, a memory and a nonvolatile memory, and may also comprise hardware required by other services. The processor reads the corresponding computer program from the nonvolatile memory into the memory and then runs it to implement the satellite-ground fusion network multi-task offloading method shown in Fig. 1.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in embodiments provided herein may include at least one of non-volatile and volatile memory. The nonvolatile Memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash Memory, optical Memory, or the like. Volatile memory can include random access memory (Random Access Memory, RAM) or external cache memory. By way of illustration, and not limitation, RAM can be in various forms such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM), etc.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the present invention.