Detailed Description
The embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of the application.
For ease of understanding, brief explanations of some terms are given first:
1. Blockchain: in a narrow sense, a blockchain is a chained data structure that takes blocks as its basic unit; each block uses a digital digest to verify the transaction history acquired before it, which suits the tamper-resistance and extensibility requirements of distributed ledger scenarios. In a broad sense, blockchain also refers to the distributed ledger technology realized by the blockchain structure, including distributed consensus, privacy and security protection, peer-to-peer communication, network protocols, smart contracts, and the like. The goal of a blockchain is to implement a distributed data ledger that allows only additions, not deletions. The basic structure underlying the ledger is a linear linked list: blocks are connected in series, each block records the hash value of the preceding block, and whether each block (and the transactions in it) is legal can be checked quickly by computing hash values. If a node in the network proposes to add a new block, the block must be acknowledged via a consensus mechanism.
2. Block: a block is a data packet carrying transaction data on a blockchain network; it is a data structure marked with a timestamp and the hash value of the preceding block, and the transactions in it are verified and confirmed through the network's consensus mechanism. A block includes a block header and a block body. The block header records meta information of the current block, such as the current version number, the hash value of the previous block, a timestamp, a random number, and the hash value of the Merkle root. The block body records the detailed data generated over a period of time, including all transaction records or other information produced during the creation of the block that the current block verifies; it can be understood as a representation of the ledger.
3. Hash value: also known as an eigenvalue or information eigenvalue, a hash value is generated by a hash algorithm that converts input data of arbitrary length into a fixed-length output; the original input data cannot be recovered from the hash value, so hashing is a one-way function. In a blockchain, each block (except the initial block) contains the hash value of its preceding block, which is referred to as the parent block of the current block. The hash value is the core foundation of blockchain technology: it preserves the authenticity of the recorded data and the integrity of the blockchain as a whole.
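For ease of understanding, a minimal Python sketch of the fixed-length, one-way property of a hash value is given below; the choice of SHA-256 and the sample inputs are illustrative assumptions, not a limitation of the present application.

```python
import hashlib

# SHA-256 maps input of any length to a fixed 256-bit (64 hex character) digest.
block_data = b"transaction records of the current block"
digest = hashlib.sha256(block_data).hexdigest()
print(digest)  # 64 hex characters, regardless of input length

# Any change to the input yields a completely different digest,
# which is how tampering with a block is detected.
tampered = hashlib.sha256(b"Transaction records of the current block").hexdigest()
print(digest == tampered)  # False
```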
4. Blockchain nodes: a blockchain network differentiates its nodes into consensus nodes (which may also be referred to as core nodes or full nodes) and synchronization nodes (which may include data nodes as well as light nodes). A synchronization node is responsible for synchronizing the ledger information of the consensus nodes, i.e., synchronizing the latest block data.
5. Asymmetric signature: the algorithm involves two keys, a public key and a private key, which form a pair; if data is signed with the private key, the signature can only be verified with the corresponding public key. Because the signing process and the verification process use two different keys, such an algorithm is called an asymmetric signature. A basic flow for exchanging secret information with an asymmetric signature is as follows: a first party generates a key pair and discloses the public key; when the first party needs to send information to another role (a second party), it signs the secret information with its own private key and then sends it to the second party; the second party verifies the signed information using the first party's public key.
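A minimal sketch of the above signing and verification flow is given below, using the Python `cryptography` package; the choice of elliptic curve (SECP256K1) and hash (SHA-256) is an illustrative assumption rather than a requirement of the embodiment.

```python
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives import hashes
from cryptography.exceptions import InvalidSignature

# First party generates a key pair and publishes the public key.
private_key = ec.generate_private_key(ec.SECP256K1())
public_key = private_key.public_key()

# First party signs the secret information with its private key.
message = b"secret information for the second party"
signature = private_key.sign(message, ec.ECDSA(hashes.SHA256()))

# Second party verifies the signature with the first party's public key.
try:
    public_key.verify(signature, message, ec.ECDSA(hashes.SHA256()))
    print("signature valid: message really came from the first party")
except InvalidSignature:
    print("signature invalid: message was tampered with or forged")
```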
6. Federated machine learning: also known as federated learning or joint learning. Federated machine learning is a machine learning framework that can effectively help multiple institutions use data and build machine learning models while meeting the requirements of user privacy protection, data security, and government regulations. As a distributed machine learning paradigm, federated learning can effectively solve the data-silo problem: participants model jointly without sharing data, so data silos are broken technically while user privacy is protected as much as possible.
7. Artificial intelligence (AI): the theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the capabilities of perception, reasoning, and decision-making.
Artificial intelligence is a comprehensive discipline that spans a wide range of fields, involving both hardware-level and software-level technologies. Artificial intelligence infrastructure technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
8. Deep reinforcement learning: combines the perception capability of deep learning with the decision-making capability of reinforcement learning, can produce control directly from the input data, and is an artificial intelligence method closer to the human way of thinking.
9. Edge computing: a novel computing paradigm that uses computing resources near internet-of-things devices to provide services in a timely manner. An edge system typically connects the internet of things with a cloud center and is complementary to traditional cloud computing. Edge computing can also be understood as a distributed computing architecture in which the computation of applications, data, and services is moved from central hub nodes to nodes at the logical edge of the network; in other words, edge computing breaks up large services that would otherwise be handled entirely by hub nodes into smaller, more manageable parts and distributes them to edge nodes for processing.
10. Task offloading: in an edge environment comprising a cloud (cloud server), edge servers, and user terminals, task offloading refers to transmitting a task of a user terminal to an edge server or a cloud server close to the user terminal for execution, so as to meet users' requirements for performance, latency, and energy consumption.
The scheme provided by the embodiments of the present application is described in detail by the following embodiments, in combination with artificial intelligence techniques such as machine learning and deep learning, and with blockchain technology.
Referring to fig. 1a, fig. 1a is a schematic diagram of a system architecture according to an embodiment of the application. As shown in fig. 1a, the system architecture may include a cloud server cluster 10a, an edge server cluster 10b, and a user terminal cluster. The cloud server cluster 10a may include cloud server 101a, ..., cloud server 102a. The edge server cluster 10b may include edge server 101b, edge server 102b, and edge server 103b; it will be appreciated that the edge server cluster 10b may include one or more edge servers, and the number of edge servers is not limited here. The user terminal cluster may include user terminal 101c, user terminal 102c, and user terminal 103c; it will be appreciated that the user terminal cluster may include one or more user terminals, and the number of user terminals is not limited here.
A communication connection may exist among the user terminals in the user terminal cluster; for example, a communication connection exists between user terminal 101c and user terminal 102c, and between user terminal 102c and user terminal 103c. Any user terminal in the user terminal cluster may have a communication connection with any edge server in the edge server cluster 10b; for example, user terminal 101c is connected to edge server 101b, user terminal 102c to edge server 102b, and user terminal 103c to edge server 102b. Any user terminal in the user terminal cluster may also have a communication connection with any cloud server in the cloud server cluster 10a; for example, user terminal 101c is connected to cloud server 101a, user terminal 102c to cloud server 101a, and user terminal 103c to cloud server 102a.
Likewise, a communication connection may exist among the edge servers in the edge server cluster 10b, for example between edge server 101b and edge server 102b, and between edge server 101b and edge server 103b. Any edge server in the edge server cluster 10b may have a communication connection with any cloud server in the cloud server cluster 10a, for example between edge server 101b and cloud server 101a, between edge server 103b and cloud server 102a, and between edge server 102b and cloud server 101a.
A communication connection may also exist among the cloud servers in the cloud server cluster 10a, for example between cloud server 101a and cloud server 102a.
It should be understood that the communication connections above are not limited to any particular connection manner: they may be made directly or indirectly by wired communication, directly or indirectly by wireless communication, or by other connection manners, which are not limited here.
It should be noted that the cloud servers in the cloud server cluster 10a (such as cloud server 101a, ..., cloud server 102a), the edge servers in the edge server cluster 10b (such as edge server 101b, edge server 102b, and edge server 103b), and the user terminals in the user terminal cluster (such as user terminal 101c, user terminal 102c, and user terminal 103c) may all be blockchain nodes in a blockchain network. Referring to fig. 1b, fig. 1b is a schematic diagram of a network architecture provided by an embodiment of the present application; the system architecture in fig. 1a may include the blockchain network 10 in fig. 1b. As shown in fig. 1b, the blockchain network 10 may include a synchronization network 10e and a consensus network 10d (which may be understood as a blockchain consensus network). Nodes in the synchronization network 10e may be called light nodes: they possess partial data, mainly perform service execution, do not participate in accounting consensus, and obtain block header data and partially authorized visible block data from the consensus network 10d by means of identity authentication. In the embodiment of the present application, the edge servers in the edge server cluster 10b in fig. 1a and the user terminals in the user terminal cluster are regarded as light nodes, such as edge server 101b and edge server 102b, and user terminal 101c and user terminal 103c shown in fig. 1b. The consensus network 10d may also be referred to as a core network, and its nodes may be referred to as full nodes, which hold the full data. In the embodiment of the present application, the cloud servers in the cloud server cluster serve as full nodes, such as cloud server 101a, cloud server 102a, cloud server 103a, and cloud server 104a shown in fig. 1b. The synchronization network 10e and the consensus network 10d are in different network environments; typically the synchronization network 10e is in a public network and the consensus network 10d is in a private network, and the two interact through a routing boundary.
It will be appreciated that the synchronization network 10e described above may include one or more light nodes, and the consensus network 10d may include one or more full nodes; the numbers of both are not limited here.
Each blockchain node (including the light nodes in the synchronization network 10e and the full nodes in the consensus network 10d) can, during normal operation, receive data sent from the outside, perform blockchain processing based on the received data, and also send data to the outside. To ensure data interworking among the blockchain nodes, a data connection may exist between each pair of blockchain nodes, implemented by the communication connections described above.
It will be appreciated that data or blocks may be transferred between blockchain nodes via the data connections described above. The data connections between blockchain nodes may be based on node identifications: each blockchain node in the blockchain network 10 has a corresponding node identification, and each blockchain node may store the node identifications of the other blockchain nodes connected to it, so that acquired data or generated blocks can subsequently be broadcast to those nodes according to their node identifications. For example, user terminal 101c (a light node) may maintain a node identification list, which stores the node names and node identifications of other blockchain nodes, as shown in Table 1:
TABLE 1
| Node name | Node identification |
| --- | --- |
| Cloud server 102a | 117.114.151.174 |
| Cloud server 103a | 117.116.189.145 |
| ... | ... |
| Cloud server 104a | 119.123.789.258 |
| Edge server 101b | 117.114.151.183 |
| Edge server 102b | 117.116.189.125 |
| User terminal 103c | 119.250.485.362 |
| User terminal 101c | 119.123.789.369 |
The node identification may be an Internet Protocol (IP) address, or any other information that can be used to identify a blockchain node in the blockchain network 10; Table 1 uses IP addresses only as an illustration.
For example, user terminal 101c may send a data processing request (which may include service offloading and data transmission) to edge server 101b through the node identification 117.114.151.183, and edge server 101b can tell from the node identification 119.123.789.369 that the request was sent by user terminal 101c. Similarly, user terminal 101c may send transaction data A to cloud server 103a through the node identification 117.116.189.145, and cloud server 103a can tell from the node identification 119.123.789.369 that transaction data A was sent by user terminal 101c. Communication between other blockchain nodes proceeds in the same way and is not repeated here.
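For ease of understanding, the following sketch illustrates how a node might maintain the node identification list of Table 1 and address a peer by its identifier; the registry layout and the `send_to`/`transport` names are hypothetical, not part of the embodiment.

```python
# A light node's view of its peers, keyed by node name (cf. Table 1).
node_registry = {
    "cloud_server_103a": "117.116.189.145",
    "edge_server_101b": "117.114.151.183",
}

def send_to(node_name: str, payload: bytes) -> None:
    """Look up the peer's node identification and hand the payload to the transport layer."""
    node_id = node_registry[node_name]
    print(f"sending {len(payload)} bytes to {node_name} at {node_id}")
    # transport(node_id, payload)  # actual network call omitted

send_to("edge_server_101b", b"data processing request")
```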
The cloud server 101a, cloud server 102a, cloud server 103a, cloud server 104a, edge server 101b, edge server 102b, user terminal 103c, and user terminal 101c in fig. 1b may include devices such as a cell phone, tablet computer, notebook computer, palmtop computer, smart speaker, mobile internet device (MID), point-of-sale (POS) machine, and wearable device (e.g., smart watch or smart bracelet).
It will be appreciated that the data processing method provided by the embodiments of the present application may be performed by a computer device, including but not limited to a full node (cloud server) or a light node (user terminal or edge server). The servers (including cloud servers and edge servers) may be independent physical servers, server clusters or distributed systems formed by multiple physical servers, or cloud servers providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, big data, and artificial intelligence platforms. The user terminal may be, but is not limited to, a smartphone, tablet computer, notebook computer, desktop computer, smart speaker, smart watch, or the like. The user terminals and the servers (including cloud servers and edge servers) may be connected directly or indirectly by wired or wireless communication, which the present application does not limit.
Further, referring to fig. 2a, fig. 2a is a schematic view of a data processing scenario according to an embodiment of the present application. In the embodiment of the present application, an initial task decision model is trained by a distributed learning algorithm to obtain a trained task decision model (used to decide which computing resource executes a task to be offloaded). Assume that 5 user terminals each train the initial task decision model locally on their own training data. As shown in fig. 2a, user terminal 101c trains initial task decision model 201a locally using 10 local training samples, user terminal 102c trains initial task decision model 202a locally using 15 local training samples, user terminal 103c trains initial task decision model 203a locally using 20 local training samples, user terminal 104c trains initial task decision model 204a locally using 15 local training samples, and user terminal 105c trains initial task decision model 205a locally using 20 local training samples. It can be understood that, because the training data of each user terminal differ, the model parameters of the same initial task decision model may differ across terminals. An initial task decision model denotes a model that has not converged, or whose iteration count has not reached the preset iteration count, i.e., a model whose training is not finished.
User terminals 101c, 102c, 103c, 104c, and 105c may be user terminals in the user terminal cluster in fig. 1a, and likewise may be light nodes in the synchronization network 10e in fig. 1b.
Each user terminal may preset a target iteration count in advance, for example 50, 100, or 150 iterations, or preset a trigger mechanism, for example acquiring the local sub-model parameters every 50 iterations.
Referring back to fig. 2a: when its iteration count reaches 50, user terminal 101c obtains the sub-model parameters of initial task decision model 201a and generates information chain 201b from them; likewise, at 50 iterations user terminal 102c obtains the sub-model parameters of initial task decision model 202a and generates information chain 202b, user terminal 103c obtains the sub-model parameters of initial task decision model 203a and generates information chain 203b, and user terminal 104c obtains the sub-model parameters of initial task decision model 204a and generates information chain 204b. User terminal 105c, however, obtains the sub-model parameters of initial task decision model 205a when its iteration count equals 49, and generates information chain 205b from them. It will be appreciated that the sub-model parameters may include all parameters of an initial task decision model.
The embodiment of the present application takes as an example only user terminal 101c generating information chain 201b, which contains the sub-model parameters of initial task decision model 201a; the process by which the other user terminals generate their local information chains is analogous. User terminal 101c may obtain a timestamp, which may mark the moment initial task decision model 201a reaches 50 iterations or the moment its sub-model parameters are obtained; the meaning of the timestamp is not limited here and may be set according to the actual application scenario. User terminal 101c may obtain the public key of its public-private key pair, compute a public key hash of that public key using a hash algorithm, and determine the public key hash, the sub-model parameters of initial task decision model 201a, the iteration count (i.e., 50), and the timestamp as the data to be signed. User terminal 101c obtains the digital digest of the data to be signed and encrypts the digest with the private key of its public-private key pair to obtain a digital signature; it then generates information chain 201b from the digital signature, the public key hash, the sub-model parameters of initial task decision model 201a, the iteration count (i.e., 50), and the timestamp.
The task offloading scenario of the present application can be applied to an edge environment, so an information chain transmitted during training of an initial task decision model can be forwarded to a cloud server through an edge server in the edge environment. Referring again to fig. 2a, user terminal 101c transmits information chain 201b to edge server 101b. After receiving information chain 201b, edge server 101b can verify its signature: it first acquires the data to be verified carried in information chain 201b and computes the digital digest of that data; it then decrypts the digital signature in information chain 201b with the public key of user terminal 101c to recover the digital digest carried by the signature. If the two digests are identical, the verification passes, which indicates that information chain 201b has not been tampered with and that it was indeed signed with the private key of user terminal 101c, i.e., the source is legal. If the two digests differ, the verification fails, which indicates that information chain 201b may have been tampered with, or that it was not signed by the user node that claims to have sent it.
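The generation and verification of an information chain described above may be sketched as follows; the field names and the `sign_fn`/`recover_digest_fn` stand-ins (which encapsulate the private-key encryption and public-key decryption of the digest) are illustrative assumptions.

```python
import hashlib
import json
import time

def build_info_chain(sub_model_params, iterations, public_key_bytes, sign_fn):
    """Assemble an information chain as described: public key hash,
    sub-model parameters, iteration count, timestamp, plus a digital signature."""
    payload = {
        "public_key_hash": hashlib.sha256(public_key_bytes).hexdigest(),
        "sub_model_params": sub_model_params,  # must be JSON-serializable here
        "iterations": iterations,
        "timestamp": time.time(),
    }
    digest = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).digest()
    payload["signature"] = sign_fn(digest)  # encrypt the digest with the private key
    return payload

def verify_info_chain(chain, recover_digest_fn):
    """Signature check: recompute the digest of the carried data and compare it
    with the digest recovered from the signature via the sender's public key."""
    data = {k: v for k, v in chain.items() if k != "signature"}
    local = hashlib.sha256(json.dumps(data, sort_keys=True).encode()).digest()
    return local == recover_digest_fn(chain["signature"])
```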
If the signature verification fails, edge server 101b discards information chain 201b; if it passes, edge server 101b uploads information chain 201b to a cloud server, such as cloud server 101a shown in fig. 2a. It can be understood that, after an information chain is generated, it is uploaded to an edge server or a cloud server. If it is uploaded to an edge server, then to reduce communication cost information chain 201b is generally uploaded once to the edge server with the strongest communication signal; when network fluctuation occurs and the cloud server cannot be reached, the edge server can relay the chain to another edge server, which again runs the signature verification process and, once it passes, forwards information chain 201b to the cloud server.
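The relay behavior under network fluctuation may be sketched as follows, assuming hypothetical `upload`, `verify_signature`, and `forward_to_cloud` operations on the node objects; this is an illustration of the flow, not a prescribed interface.

```python
def forward_info_chain(chain, cloud, fallback_edges):
    """Try to upload to the cloud server; on network failure, relay to another
    edge server, which re-verifies the signature before forwarding."""
    try:
        cloud.upload(chain)
    except ConnectionError:
        for edge in fallback_edges:
            if edge.verify_signature(chain):  # relay nodes re-run the check
                edge.forward_to_cloud(chain)
                break
```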
Edge server 101b, and edge servers 102b and 103b mentioned below, may be edge servers in the edge server cluster 10b in fig. 1a, and likewise light nodes in the synchronization network 10e in fig. 1b. Cloud server 101a may be a cloud server in cloud server cluster 10a in fig. 1a, and likewise a full node in the consensus network 10d in fig. 1b. It will be understood that fig. 2a illustrates only 5 user terminals, 3 edge servers, and 1 cloud server; in practical applications there may be any number of user terminals, edge servers, and cloud servers.
Referring again to fig. 2a: user terminal 102c transmits information chain 202b directly to cloud server 101a, which verifies its signature after receiving it. User terminal 103c transmits information chain 203b to edge server 101b; after the chain passes signature verification there, edge server 101b cannot reach the cloud server because of network fluctuation, so it relays the chain to edge server 102b, which verifies the signature of information chain 203b after receiving it. User terminal 104c transmits information chain 204b to edge server 102b, which verifies its signature after receiving it. User terminal 105c transmits information chain 205b to edge server 103b, which receives it; for the verification steps, refer to the signature verification process described above.
Referring again to fig. 2a: edge server 101b's verification of information chain 201b passes, so it uploads information chain 201b to cloud server 101a; edge server 102b's verification of information chain 203b passes, so it uploads information chain 203b to cloud server 101a; edge server 102b's verification of information chain 204b fails, so information chain 204b is deleted. Note that before running the signature verification process, an edge server or cloud server may also compare the iteration count in an information chain with the legal iteration count. For example, edge server 103b compares the iteration count in information chain 205b (49) with the legal iteration count (50); since they obviously differ, information chain 205b can be determined to be an illegal information chain, and edge server 103b need not verify its signature at all but simply discards, i.e., deletes, information chain 205b.
In summary, cloud server 101a obtains 3 information chains to be verified: information chain 201b provided by user terminal 101c, information chain 202b provided by user terminal 102c, and information chain 203b provided by user terminal 103c. Cloud server 101a then performs verification processing on the 3 information chains to be verified and determines those whose verification results meet the quality evaluation condition as target information chains. Next it performs model decision quality evaluation on the sub-model parameters in the target information chains, determines the sub-model parameters whose quality evaluation results meet the model consensus condition as target sub-model parameters, aggregates the target sub-model parameters to obtain central model parameters, and performs model decision quality evaluation on the central model parameters to obtain a target quality evaluation result. Cloud server 101a generates a target block from the central model parameters and the target quality evaluation result and, once the target block passes blockchain consensus, broadcasts it to the blockchain nodes in the blockchain network (equivalent to blockchain network 10 in fig. 1b), which may include user terminal 101c, user terminal 102c, user terminal 103c, user terminal 104c, user terminal 105c, edge server 101b, edge server 102b, and edge server 103b shown in fig. 2a. Each blockchain node then performs validity verification on the target block according to the target quality evaluation result and, when the validity verification indicates that the target block is a valid block, obtains a task decision model containing the central model parameters. It will be appreciated that, if the task decision model has not converged or the current iteration count has not reached the maximum, this is the model each user terminal trains next; when the task decision model has converged or the current iteration count has reached the maximum, it represents the trained task decision model, and each user terminal can then use it to decide the target computing resource for task offloading, such as an edge server or a cloud server.
The specific process by which cloud server 101a handles the information chains to be verified and the specific process of generating the target block are described in the embodiments corresponding to fig. 3 and fig. 7, respectively, and are not repeated here.
The present application can be used in an edge computing environment having user terminals (e.g., smart devices such as internet-of-things devices, mobile phones, drone swarms, and unmanned vehicles), edge ends (e.g., base stations and edge servers close to the user terminals), and cloud ends (e.g., large servers and cloud centers). In such an environment, computing tasks generated by a user terminal are offloaded to the edge or the cloud, relieving the computing pressure on the user terminal. The method is suitable for all kinds of application environments in which edge computing can be embedded, such as the intelligent internet of things, drone swarms, and unmanned vehicles.
Unlike existing edge offloading schemes, the embodiments of the present application enhance the task decision model's resistance to network fluctuation by placing the task offloading decision algorithm on the user terminal, increase the model's training speed by introducing distributed algorithms such as federated learning, and improve the security and efficiency of information transmission by using blockchain as the information transmission format. Referring to fig. 2b, fig. 2b is a schematic view of a data processing scenario according to an embodiment of the present application. As shown in fig. 2b, when user terminal 101c obtains task 20e locally and the computation required by task 20e is large, executing task 20e locally would incur high latency and high power consumption. User terminal 101c can therefore use task decision model 20d to decide on a target server in edge environment 20f to execute task 20e; as shown in fig. 2b, edge environment 20f may include edge server 101b, edge server 102b, cloud server 101a, and cloud server 102a.
Referring back to fig. 2b: according to task decision model 20d, user terminal 101c determines that the target server is cloud server 101a, so user terminal 101c may offload task 20e to cloud server 101a, i.e., cloud server 101a executes task 20e.
To improve the task decision model's resistance to network fluctuation, the embodiment of the present application places the model's task offloading decision algorithm in the user terminal, which prevents the situation where an offloading decision server (i.e., a server hosting the task offloading decision algorithm) cannot be reached because of a sudden network outage. The task decision model in the present application adopts a deep reinforcement learning algorithm, i.e., a value-function-based reinforcement learning algorithm using a deep neural network. Referring to fig. 2c, fig. 2c is a schematic diagram of a task decision model according to an embodiment of the application. As shown in fig. 2c, the architecture first treats task offloading as a Markov process, taking the network model, delay model, energy consumption model, and economic model of the actual situation as the states of the task decision model; it then constructs the reward function and offloading actions of reinforcement learning, iterates the weight parameters through reinforcement learning against the optimization target, and learns the optimal offloading scheme for the user's computing tasks in different environments. Fig. 2c shows an improved deep reinforcement learning algorithm, a deep parallel task decision model learning algorithm. To increase the training speed of the model and the quality of its decisions, the algorithm searches for the optimal reward function and offloading action in parallel through several neural networks (i.e., task decision models) with the same structure but different parameters, thereby improving the decision capability of the model as a whole.
The update iteration mode of the task offloading decision algorithm (which can be understood as the task decision model) is an off-policy temporal-difference reinforcement learning algorithm. In reinforcement learning, the value function is estimated by formula (1):
Q(s,a) ← Q(s,a) + α( r + γ·max_{a'} Q(s',a') − Q(s,a) )    (1)
This update lets Q(s,a) directly estimate the optimal action-value function Q*(s,a); that is, Q*(s,a) corresponds to the Q(s,a) on the left-hand side of formula (1).
Here s denotes the current state, a the action taken in the current state, s' the next state, a' an action available in the next state, r the reward, α the learning rate, γ the discount factor, and Q(s,a) the value function.
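As an illustration, the update of formula (1) can be written as the following sketch; the tabular state/action encoding and the learning-rate and discount values are illustrative assumptions.

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy temporal-difference (Q-learning) update from formula (1)."""
    td_target = r + gamma * np.max(Q[s_next])  # r + γ·max_{a'} Q(s', a')
    Q[s, a] += alpha * (td_target - Q[s, a])   # move Q(s, a) toward the target
    return Q

# toy usage: 4 states, 2 offloading actions (e.g., execute locally vs. on the edge)
Q = np.zeros((4, 2))
Q = q_update(Q, s=0, a=1, r=1.0, s_next=2)
```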
Turning briefly to the flow in fig. 2c: the task flow in fig. 2c is the workflow. When contact 1 touches the first and second contacts on the left (i.e., the upper two contacts) and contact 2 touches the test contact, the task flow represents test data; when contact 1 touches the lower two contacts on the left (i.e., the second and third contacts) and contact 2 touches the training contact, the task flow represents training data. The process by which the user terminal trains the x task decision models is consistent with the process by which user terminal 101c trains task decision model 201a in fig. 2a; after each round of training a tuple is obtained, and the user terminal stores the tuple in the memory pool. When the user terminal applies the task decision model, x decision results can be obtained through the x parallel task decision models, such as task decision model 1, task decision model 2, ..., task decision model x and decision result 1, decision result 2, ..., decision result x illustrated in fig. 2c, and the user terminal can determine the optimal decision result from the x decision results, as sketched below. The task flow can be regarded as the current state in deep reinforcement learning, and the next (i.e., new) state can be generated according to the decision result.
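A minimal sketch of selecting the optimal decision result from the x parallel task decision models follows; the `decide` and `estimate_value` methods, and the rule of keeping the decision with the highest estimated value, are illustrative assumptions about how the optimal result is chosen.

```python
def best_offloading_decision(models, task_state):
    """Run x task decision models with identical structure but different
    parameters and keep the decision with the highest estimated value."""
    decisions = [m.decide(task_state) for m in models]  # decision result 1..x
    values = [m.estimate_value(task_state, d) for m, d in zip(models, decisions)]
    return decisions[values.index(max(values))]         # optimal decision result
```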
It can be appreciated that the task decision model may be any deep neural network model, and the embodiment of the present application does not limit the task decision model. It will be appreciated that the distributed learning algorithm in the embodiments of the present application may be a federal learning algorithm.
As can be seen from figs. 2a-2c, the present application mainly constructs a new intelligent task offloading framework to handle task offloading decisions in the edge environment. For the task offloading decision algorithm, the application adopts an intelligent algorithm based on deep reinforcement learning; unlike other intelligent algorithms, the biggest innovation of the application is to improve the security and efficiency of the training and transmission of the task offloading decision algorithm (equivalent to the task decision model described above) by introducing blockchain and federated learning. The parameters in federated learning (such as the sub-model parameters and central model parameters) are uploaded and downloaded in blockchain form, which ensures that they cannot be tampered with during transmission; and through a new consensus algorithm (i.e., performing model decision quality evaluation on the sub-model parameters), the model can effectively filter out garbage parameters (i.e., illegal parameters) when aggregating the target sub-model parameters, further ensuring the efficiency and stability of the whole training process.
Further, referring to fig. 3, fig. 3 is a flow chart of a data processing method according to an embodiment of the application. The data processing method may be executed by a computer device; in the embodiment of the present application, the computer device is taken to be a central node as an example, where the central node may be any cloud server in fig. 1a (equivalent to any full node in fig. 1b). As shown in fig. 3, the data processing process may include the following steps.
Step S101: obtain N target information chains, where N is a positive integer; each of the N target information chains contains sub-model parameters, and the N sets of sub-model parameters are provided by N user nodes respectively.
Specifically, in the embodiment of the present application it is assumed that N=3, i.e., the central node acquires 3 target information chains; the user nodes corresponding to the 3 target information chains may be user terminals in the user terminal cluster in fig. 1a, equivalent to light nodes (the blockchain nodes corresponding to user terminals) in the synchronization network 10e in fig. 1b. Referring to fig. 4a, fig. 4a is a schematic view of a data processing scenario according to an embodiment of the present application. As shown in fig. 4a, user node 401c provides target information chain 401b, which contains sub-model parameter D1; user node 402c provides target information chain 402b, which contains sub-model parameter D2; and user node 403c provides target information chain 403b, which contains sub-model parameter D3.
It may be appreciated that, in the edge environment, the federated learning algorithm allows each user node to train the task decision model locally; after training the model a certain number of times, the user node transmits the sub-model parameters to the central node (for example, central node 40a illustrated in fig. 4a), either by first transmitting them to an edge node, which forwards them to the central node, or by transmitting them to the central node directly. The edge nodes may be edge servers in the edge server cluster 10b in fig. 1a, equivalent to light nodes (the blockchain nodes corresponding to edge servers) in the synchronization network 10e in fig. 1b.
Step S102: perform model decision quality evaluation on the N sets of sub-model parameters respectively, and determine the sub-model parameters whose quality evaluation results meet the model consensus condition as target sub-model parameters, the total number of target sub-model parameters being less than or equal to N.
Specifically, the N sets of sub-model parameters include sub-model parameters Di, where i is a positive integer and i is less than or equal to N. The central node acquires a simulation task and a task decision sub-model Mi containing the sub-model parameters Di, inputs the simulation task into task decision sub-model Mi, and obtains the task decision node for the simulation task output by task decision sub-model Mi. It then obtains the task decision loss from the task decision node and the simulation task, acquires the task environment information, and performs model decision quality evaluation on the sub-model parameters Di according to the task decision loss and the task environment information, obtaining the quality evaluation result corresponding to the sub-model parameters Di.
The task decision loss includes a task decision delay loss and a task decision energy loss; the task environment information includes task information, representing basic information of the simulation task, and environment information, representing basic information of the task decision node. The specific process of performing model decision quality evaluation on the sub-model parameters Di according to the task decision loss and the task environment information may include: normalizing the task decision delay loss to obtain the unitized task decision delay loss; normalizing the task decision energy loss to obtain the unitized task decision energy loss; performing weighted summation on the unitized task decision delay loss and the unitized task decision energy loss to obtain the unitized task decision loss; normalizing the task information to obtain the unitized task information; normalizing the environment information to obtain the unitized environment information; summing the unitized task information and the unitized environment information to obtain the unitized task environment information; and performing model decision quality evaluation on the sub-model parameters Di according to the unitized task decision loss and the unitized task environment information.
In other words, the central node loads one set of sub-model parameters into a local model, randomly generates a task, makes a decision on the randomly generated task with the local model, and can then evaluate the offloading level of those sub-model parameters from the decision result. Referring again to fig. 4a, central node 40a acquires a simulation task, acquires task decision sub-model M1 containing sub-model parameter D1, inputs the simulation task into task decision sub-model M1, and obtains the task decision node for the simulation task output by task decision sub-model M1, illustrated as edge node 401d in fig. 4a; it may be understood that edge node 401d may be an edge server in fig. 1a.
Central node 40a obtains the basic information of the simulation task, which may include the amount of computation required to execute the simulation task, the size of the task itself, and the like, and takes this as the task information. Central node 40a also obtains the basic information of the task decision node, which may include communication information between user node 401c and edge node 401d, the computing capability of edge node 401d, and the like, and takes this as the environment information. It can be understood that the task information and environment information can be set according to the actual application scenario; the embodiment of the present application does not limit the content of the task environment information.
It should be understood that central node 40a does not actually offload the simulation task to edge node 401d, nor does it cause user node 401c to do so. Central node 40a merely simulates, according to the environment information and the task information, the process of offloading the simulation task from user node 401c to edge node 401d and having edge node 401d execute it (possibly including edge node 401d returning the execution result), so as to calculate the task decision loss that the offloading would incur. The task decision loss includes a task decision delay loss and a task decision energy loss. The task decision delay loss may include the time task decision sub-model M1 takes to decide on edge node 401d in the edge environment, the time to transmit data (including the simulation task and the task execution result) between edge node 401d and user node 401c, and the time edge node 401d takes to execute the simulation task. The task decision energy loss may include the energy consumed by task decision sub-model M1 in deciding on edge node 401d in the edge environment, the energy consumed in transmitting data (which may include the simulation task and the task execution result) between edge node 401d and user node 401c, and the energy consumed by edge node 401d in executing the simulation task. It can be understood that the task decision delay loss and the task decision energy loss can be set according to the actual application scenario, which the embodiment of the present application does not limit.
Different evaluation indexes often have different dimensions and dimensional units, which affects the result of data analysis; to eliminate the dimensional influence between indexes, data normalization is needed so that the indexes become comparable. After the original data are normalized, all indexes are on the same order of magnitude, suitable for comprehensive comparison and evaluation. Clearly, the task decision delay loss and the task decision energy loss have different dimensions (the former is a time, the latter is not), so before they are weighted and summed, each must be normalized. Similarly, the task information and the environment information have different dimensions and must also be normalized.
Referring to fig. 4b, fig. 4b is a flow chart of a data processing method according to an embodiment of the application. As shown in fig. 4b, the central node normalizes the task decision delay loss, the task decision energy loss, the task information, and the environment information, correspondingly obtaining the unitized task decision delay loss, unitized task decision energy loss, unitized task information, and unitized environment information. It performs weighted summation on the unitized task decision delay loss and unitized task decision energy loss to obtain the unitized task decision loss, and sums the unitized task information and unitized environment information to obtain the unitized task environment information. The central node can then, through formula (2), calculate over the unitized task decision loss and the unitized task environment information, i.e., perform model decision quality evaluation on sub-model parameter D1, obtaining the quality evaluation result corresponding to sub-model parameter D1. In formula (2), A denotes the unitized task decision delay loss, B the unitized task decision energy loss, C the unitized task information, D the unitized environment information, and E the quality evaluation result, which is akin to a score: sub-model parameter D1 is scored according to its task offloading level.
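Since formula (2) itself is not reproduced above, the following sketch only illustrates the unitization and weighted-summation steps; the normalization bounds, the weights, and the final combination rule (a higher score E for a lower loss relative to the task environment scale) are placeholder assumptions, not the actual formula (2).

```python
def min_max_normalize(x, lo, hi):
    """Dimensionless value in [0, 1]; lo/hi are reference bounds for the metric."""
    return (x - lo) / (hi - lo)

def quality_score(delay, energy, task_info, env_info, w_delay=0.5, w_energy=0.5):
    # Unitize each metric so quantities with different dimensions are comparable.
    A = min_max_normalize(delay, 0.0, 10.0)     # unitized task decision delay loss
    B = min_max_normalize(energy, 0.0, 100.0)   # unitized task decision energy loss
    C = min_max_normalize(task_info, 0.0, 1e6)  # unitized task information
    D = min_max_normalize(env_info, 0.0, 1e6)   # unitized environment information
    loss = w_delay * A + w_energy * B           # unitized task decision loss
    context = C + D                             # unitized task environment information
    # Placeholder for formula (2): lower loss relative to the task/environment
    # scale yields a higher score E. The exact formula is not reproduced here.
    return context / (loss + 1e-9)
```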
The central node can be regarded as a full node in the blockchain network and holds consensus authority; in the embodiment of the present application each user node is regarded as a light node, and an information chain provided by a user node can be regarded as a transaction. For the task offloading decision scenario, the embodiment of the present application provides a new consensus algorithm built on formula (2): model decision quality evaluation is performed on the sub-model parameters, and the consensus result is determined from the quality evaluation results. For the process of determining the target sub-model parameters from the quality evaluation results corresponding to the sub-model parameters, please refer to the description of step S202 in the embodiment corresponding to fig. 7 below, which is not repeated here.
Step S103: perform aggregation processing on the target sub-model parameters to obtain central model parameters, and perform model decision quality evaluation on the central model parameters to obtain a target quality evaluation result.
Specifically, the target sub-model parameters comprise A target sub-model parameters, where A is a positive integer and A is less than or equal to N. The central node acquires the training sample sub-number corresponding to each of the A target sub-model parameters and sums the A training sample sub-numbers to obtain the total number of training samples. The A target sub-model parameters include target sub-model parameter Zt, and the A training sample sub-numbers include the training sample sub-number Yt corresponding to target sub-model parameter Zt, where t is a positive integer and t is less than or equal to A. The central node obtains the operator result of target sub-model parameter Zt and training sample sub-number Yt, sums the operator results corresponding to the A target sub-model parameters to obtain the total operation result, and determines the central model parameters from the total operation result and the total number of training samples.
It can be understood that the central node may aggregate the sub-model parameters provided by different user nodes, thereby learning the data features in the training data of each user node. Assume the number of target sub-model parameters is 3; referring to fig. 5, fig. 5 is a schematic view of a data processing scenario provided by the embodiment of the present application. As shown in fig. 5, central node 40a acquires 3 target sub-model parameters: target sub-model parameter Z1, target sub-model parameter Z2, and target sub-model parameter Z3. Central node 40a obtains the training sample sub-number for each target sub-model parameter (equivalent to the training data illustrated in fig. 2a); in fig. 5, the training sample sub-number Y1 for target sub-model parameter Z1 equals 10, the training sample sub-number Y2 for target sub-model parameter Z2 equals 15, and the training sample sub-number Y3 for target sub-model parameter Z3 equals 20.
Central node 40a sums the 3 training sample sub-numbers to obtain the total number of training samples (45 in the embodiment of the present application), obtains the operator result of target sub-model parameter Z1 and training sample sub-number Y1, the operator result of target sub-model parameter Z2 and training sample sub-number Y2, and the operator result of target sub-model parameter Z3 and training sample sub-number Y3, and sums the operator results corresponding to the 3 target sub-model parameters to obtain the total operation result. Central node 40a may determine the central model parameters from the total operation result and the total number of training samples through formula (3):

ω = ( Σ_{i=1}^{K} n_i · ω_i ) / ( Σ_{i=1}^{K} n_i )    (3)

That is, central node 40a calculates over target sub-model parameter Z1, target sub-model parameter Z2, target sub-model parameter Z3, the 3 training sample sub-numbers, and the total number of training samples to obtain the central model parameters ω. In formula (3), K equals A as exemplified in the embodiment of the present application, i.e., 3; n_i denotes the i-th training sample sub-number, such as 10, 15, and 20 exemplified above; and ω_i denotes the i-th target sub-model parameter, such as target sub-model parameter Z1, target sub-model parameter Z2, and target sub-model parameter Z3.
It should be noted that formula (3) is the aggregation algorithm exemplified in the embodiment of the present application; other aggregation algorithms may be selected according to the scenario when the present application is applied in practice.
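A minimal sketch of the aggregation of formula (3) is given below; the example parameter values are taken from fig. 5, while the vector representation of the parameters is an illustrative assumption.

```python
import numpy as np

def aggregate(params_list, sample_counts):
    """Formula (3): sample-count-weighted average of the target sub-model parameters."""
    n_total = sum(sample_counts)  # total number of training samples
    weighted = sum(n_i * np.asarray(w_i) for n_i, w_i in zip(sample_counts, params_list))
    return weighted / n_total

# example from fig. 5: three target sub-model parameters with 10/15/20 training samples
center = aggregate([np.array([1.0]), np.array([2.0]), np.array([4.0])], [10, 15, 20])
# (10*1.0 + 15*2.0 + 20*4.0) / 45 ≈ 2.67
```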
For the process of performing model decision quality evaluation on the central model parameters to obtain the target quality evaluation result, refer to the description in step S102 of performing model decision quality evaluation on sub-model parameter D1 to obtain its quality evaluation result, which is not repeated here.
From the target quality evaluation result, central node 40a can bring the other central nodes and the light nodes in the blockchain network to consensus on the target block. This consensus approach avoids a large amount of computation, increases the transmission speed of the parameters (including the central model parameters), and cannot be cracked, or even broken by brute-force attack, making it suitable for parameter transmission in harsh environments.
Step S104: generate a target block from the central model parameters and the target quality evaluation result; when the target block passes blockchain consensus, broadcast the target block to the N user nodes respectively, so that each user node performs validity verification on the target block according to the target quality evaluation result and, when the validity verification indicates that the target block is a valid block, obtains a task decision model containing the central model parameters, the task decision model being used to decide the node that executes a task.
Specifically, the central node obtains the iteration count corresponding to the central model parameters and generates the target block from the iteration count, the central model parameters, and the target quality evaluation result. It then obtains a first digital digest of the target block, encrypts the first digital digest with its private key to obtain a first digital signature, and adds the first digital signature to the target block; the first digital signature instructs the N user nodes to verify that the source of the target block is legal.
In a possible implementation manner, the central node that, through step S102, first finishes calculating the quality evaluation results corresponding to the sub-model parameters it acquired is taken as the leading central node in the blockchain consensus network, and the leading central node serves as the block-producing node to generate the target block (the target block may be equivalent to the uplink block). In the embodiment of the present application, it is assumed that the central node 60a in fig. 6 (equivalent to the central node 40a described above) is the central node that has first calculated the quality evaluation results corresponding to the target sub-model parameters it acquired; at this time, other central nodes may continue to calculate the quality evaluation results corresponding to the target sub-model parameters acquired by themselves.
The central node 60a may generate the target block according to the central model parameters and the target quality evaluation result; fig. 6 is a schematic view of a data processing scenario according to an embodiment of the present application. As shown in fig. 6, the central node 60a obtains the iteration number corresponding to the central model parameters, where the iteration number is equal to the legal iteration number and also to the iteration number corresponding to each target sub-model parameter, and the central node 60a may generate the target block 60d according to the iteration number, the central model parameters and the target quality evaluation result.
Each blockchain node in the blockchain network has a pair of keys (also referred to as a public-private key pair), each pair comprising a public key (Public Key) and a private key (Private Key). The private key may be a hexadecimal string obtained by running a hash algorithm over a randomly generated number string, and the public key may be generated from the private key through an elliptic curve encryption algorithm. Since the elliptic curve encryption algorithm is a one-way function, the public key can be generated from the private key but the private key cannot be derived from the public key, so the public key can be disclosed while the private key must be kept secret.
A digital signature is an anti-counterfeiting character string produced by combining a digital digest with asymmetric encryption technology: the sending node condenses the transaction information into a fixed-length string through digital digest technology, encrypts the digital digest with its private key to form the digital signature, and then transmits the digital signature to the receiving node. The receiving node can verify, by means of the digital signature and the public key of the sending node, that the information was indeed sent by the sending node.
Referring to fig. 6 again, using the key pair and the digital signature technique, the central node 60a may obtain a first digital digest 60b of the target block 60d, encrypt the first digital digest 60b with its own private key 60c to obtain a first digital signature 60e, and add the first digital signature 60e to the target block 60d, where the first digital signature 60e is used to instruct each user node to perform source-legality signature verification on the target block.
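A minimal sketch of the digest-and-sign flow just described, using the third-party Python ecdsa package; the payload contents are illustrative assumptions, not the application's block format:

```python
import hashlib
from ecdsa import SECP256k1, SigningKey  # pip install ecdsa

# Key pair of central node 60a: the private key is random, the public key is
# derived from it through an elliptic curve one-way function
private_key_60c = SigningKey.generate(curve=SECP256k1)
public_key_60a = private_key_60c.get_verifying_key()

# First digital digest 60b of the target block 60d (payload is illustrative)
block_payload = b"legal_iterations=200|center_model_params=...|quality=85"
first_digest_60b = hashlib.sha256(block_payload).digest()

# First digital signature 60e: the digest signed with the private key 60c
first_signature_60e = private_key_60c.sign(first_digest_60b)

# Any user node can verify the source with the public key of central node 60a
assert public_key_60a.verify(first_signature_60e, first_digest_60b)
```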
Referring to fig. 6 again, the block header of the target block 60d may include a preamble hash (i.e. the parent block hash), the first digital digest 60b and a timestamp, and the block body of the target block 60d may include the public key of the central node 60a, the first digital signature 60e, the legal iteration number, the target quality evaluation result and the central model parameters. The embodiment of the application does not limit the data in the target block 60d, which can be set according to the actual application scenario.
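An illustrative layout of the target block 60d under the description above; the field names and values are assumptions for the sketch, not a schema fixed by the application:

```python
import time

target_block_60d = {
    "header": {
        "parent_hash": "9a3f...e1",        # preamble hash of the preceding block
        "first_digest": "c0ff...ee",       # first digital digest 60b
        "timestamp": int(time.time()),
    },
    "body": {
        "public_key": "04ab...cd",         # public key of central node 60a
        "first_signature": "3045...01",    # first digital signature 60e
        "legal_iterations": 200,
        "target_quality_result": 85,       # target quality evaluation result
        "center_model_params": [0.31, 0.46],
    },
}
```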
After the central node 60a generates the target block 60d, the target block 60d is broadcast to the central nodes in the blockchain consensus network (which may be equivalent to the full-scale nodes in the consensus network 10d in fig. 1b), so that other central nodes perform blockchain consensus on the target block 60d. If the target block does not pass blockchain consensus, the blockchain consensus network performs blockchain consensus on a block generated by another central node (the data of that block differs from the data in the target block 60d because the user nodes are different); when the target block 60d passes blockchain consensus, the other central nodes may suspend calculating the quality evaluation results corresponding to the target sub-model parameters acquired by themselves, or discard the target sub-model parameters acquired by themselves. The central node 60a and the other central nodes in the blockchain consensus network may broadcast the target block 60d into the blockchain network, where it is mainly the light nodes in the blockchain network (which may be equivalent to the light nodes in the synchronization network 10e in fig. 1b) that synchronize the target block 60d. Subsequently, the user nodes may acquire the central model parameters in the target block 60d and thus the task decision model containing the central model parameters, which can also be understood as updating the local sub-model parameters to the central model parameters.
For the specific process of performing validity verification on the target block according to the target quality evaluation result, please refer to the detailed description of performing blockchain consensus on the block to be consensus in step S208 of the embodiment corresponding to fig. 7 below, which is not repeated here.
At this point the federated machine learning process is completed, during which the blockchain also completes a round of "collect information - contend for leadership - produce block - broadcast block". According to the embodiment of the application, the blockchain is innovatively embedded into federated learning, so that the task decision model's ability to withstand harsh environments is greatly improved, and the stability and safety of task offloading are enhanced.
For other possible implementations of determining the uplink block, please refer to the description of step S208 in the embodiment corresponding to fig. 7 below, which is not repeated here.
In the embodiment of the application, the sub-model parameters respectively provided by the N user nodes are regarded as N transactions, and the consensus process is designed, according to the task offloading decision scenario, to perform model decision quality evaluation on the N sub-model parameters respectively, i.e. to evaluate the task offloading decision levels corresponding to the N sub-model parameters. Further, the sub-model parameters whose quality evaluation results meet the model consensus condition are determined as target sub-model parameters, which is equivalent to retaining the sub-model parameters with high task offloading decision levels and deleting the sub-model parameters with low task offloading decision levels. Further, the target sub-model parameters respectively provided by different user nodes are aggregated to obtain the center model parameters; the aggregated center model parameters not only incorporate the characteristics of each target sub-model parameter but also possess a high-level task offloading capability. Further, model decision quality evaluation is performed on the center model parameters to obtain the target quality evaluation result, and the target block is generated according to the center model parameters and the target quality evaluation result. When the target block passes blockchain consensus, the target block is broadcast to the N user nodes respectively, so that each user node can perform validity verification on the target block according to the target quality evaluation result, and, when the validity verification result indicates that the target block is a valid block, acquire the task decision model containing the center model parameters, which has a high task offloading decision level. In summary, by having different user nodes provide the sub-model parameters (i.e. training the sub-model parameters in parallel at different user nodes), the generation speed of the task decision model can be increased and the time cost of data processing saved, and the training samples in each user node need not be obtained directly, so the problem that training samples are difficult to obtain can be avoided; since the multiple sub-model parameters all come from training samples, the decision precision of the task decision model can be ensured, and screening the sub-model parameters to obtain the target sub-model parameters further improves that precision; in addition, the target block can prevent the center model parameters from being stolen or tampered with, further improving the safety of data processing.
Further, referring to fig. 7, fig. 7 is a flow chart of a data processing method according to an embodiment of the application. The data processing method may be executed by a computer device; in the embodiment of the present application, the computer device is described as a central node by way of example, where the central node may be any one cloud server in fig. 1a (equivalent to any one full-scale node in fig. 1b). As shown in fig. 7, the data processing process may include the following steps.
Step S201, obtaining C information chains to be verified, wherein C is a positive integer and is greater than or equal to N, the C information chains to be verified all comprise submodel parameters, the C submodel parameters are respectively provided by C user nodes, the N user nodes belong to the C user nodes, and the N submodel parameters belong to the C submodel parameters.
Because deep reinforcement learning requires a neural network, training a model demands a large amount of computation and places great stress on user nodes; the embodiment of the application therefore introduces a distributed training method, federated learning. Federated learning is a machine learning framework that can effectively help multiple institutions perform data usage and machine learning modeling while meeting the requirements of user privacy protection, data security and government regulations. As a distributed machine learning paradigm, federated learning can effectively address the data silo problem, allowing participants to model jointly without sharing data, thereby technically breaking data silos while protecting user privacy as much as possible. In the embodiment of the application, the flow of federated machine learning may be that multiple user nodes each independently train the task offloading decision algorithm locally, and after a certain number of training rounds, each user node uploads the network parameters (i.e. the sub-model parameters) of its local neural network to an edge node, and each edge node gathers the sub-model parameters to a central node, or the user nodes directly upload the sub-model parameters to the central node; the specific process may refer to the description of fig. 2a above and is not repeated here. The number of edge nodes, the number of central nodes and the correspondence between them are not limited and can be set according to the actual application scenario.
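A minimal sketch of the user-node side of one federated round just described; the stand-in model and its update rule are illustrative assumptions, not the application's offloading decision network:

```python
import numpy as np

class LocalOffloadDecisionModel:
    """Stand-in for a user node's local decision network (illustrative)."""
    def __init__(self, dim=4):
        self.params = np.zeros(dim)                # sub-model parameters

    def train_step(self, batch):
        self.params += 0.01 * batch.mean(axis=0)   # placeholder update rule

model = LocalOffloadDecisionModel()
for _ in range(200):                               # train for the legal iteration number
    model.train_step(np.random.randn(8, 4))        # local training samples stay local

sub_model_params = model.params.copy()             # uploaded to the edge/central node
```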
In the embodiment of the present application, it is assumed that C = 5, that is, the central node obtains 5 information chains to be verified, and the user nodes respectively corresponding to the 5 information chains to be verified may be user terminals in the user terminal cluster in fig. 1a, equivalent to light nodes (the blockchain nodes corresponding to the user terminals) in the synchronization network 10e in fig. 1b. Referring to fig. 8, fig. 8 is a schematic diagram of a data processing scenario according to an embodiment of the present application. As shown in fig. 8, the central node 80a obtains the information chains to be verified provided by 5 user nodes respectively: the user node 401c provides the information chain to be verified 801b, which includes the sub-model parameter D1; the user node 402c provides the information chain to be verified 802b, which includes the sub-model parameter D2; the user node 403c provides the information chain to be verified 803b, which includes the sub-model parameter D3; the user node 404c provides the information chain to be verified 804b, which includes the sub-model parameter D4; and the user node 405c provides the information chain to be verified 805b, which includes the sub-model parameter D5.
Step S202, respectively verifying the C information chains to be verified, and determining the information chain to be verified, the verification result of which meets the quality evaluation condition, as a target information chain.
Specifically, the C information chains to be verified are verified according to their numbers of iterations to be verified, and the information chains to be verified whose number of iterations to be verified is equal to the legal iteration number are added to a set of information chains to be verified, where the total number of information chains to be verified in the set is smaller than or equal to C and greater than or equal to N; each information chain to be verified in the set further includes a second digital signature, the information chains to be verified in the set are verified according to the second digital signatures, and the information chains to be verified whose verification results meet the quality evaluation condition are determined as target information chains.
The set of information chains to be verified includes an information chain to be verified Fx, where x is a positive integer and x is smaller than or equal to the total number of information chains to be verified in the set, and the second digital signatures include a second digital signature Jx in the information chain to be verified Fx. The specific process of verifying the information chains to be verified in the set according to the second digital signatures may include: decrypting the second digital signature Jx according to the public key associated with the information chain to be verified Fx to obtain a second digital digest; obtaining the data to be verified in the information chain to be verified Fx, where the data to be verified includes the legal iteration number and the sub-model parameters of the information chain to be verified Fx; obtaining a third digital digest of the data to be verified, and comparing the second digital digest with the third digital digest; if the second digital digest is identical to the third digital digest, determining that the verification result of the information chain to be verified Fx meets the quality evaluation condition; if the second digital digest is not identical to the third digital digest, determining that the verification result of the information chain to be verified Fx does not meet the quality evaluation condition; and deleting the information chains to be verified whose verification results do not meet the quality evaluation condition.
Referring to fig. 8 again, the process of verifying the 5 information chains to be verified by the central node 80a may include two steps. The first step is to obtain the number of iterations to be verified in each information chain to be verified and determine whether it is equal to the legal iteration number; as illustrated in fig. 8, the information chain to be verified 801b includes 200 iterations to be verified, the information chain to be verified 802b includes 200, the information chain to be verified 803b includes 200, the information chain to be verified 804b includes 200, and the information chain to be verified 805b includes 199. Assuming that the legal iteration number is 200, comparing the numbers of iterations to be verified with the legal iteration number shows that the information chain to be verified 805b is an illegal information chain; at this time, the central node 80a may discard the information chain to be verified 805b and add the information chains to be verified whose number of iterations to be verified equals the legal iteration number to the set of information chains to be verified. As shown in fig. 8, the set of information chains to be verified may include the information chain to be verified 801b (equivalent to the information chain to be verified F1), the information chain to be verified 802b (equivalent to the information chain to be verified F2), the information chain to be verified 803b (equivalent to the information chain to be verified F3), and the information chain to be verified 804b (equivalent to the information chain to be verified F4).
The second step by which the central node 80a verifies the information chains to be verified is to check them according to the second digital signatures they carry, which can also be understood as performing signature verification on the information chains to be verified. As shown in fig. 8, the information chain to be verified 801b includes a second digital signature J1, the information chain to be verified 802b includes a second digital signature J2, the information chain to be verified 803b includes a second digital signature J3, and the information chain to be verified 804b includes a second digital signature J4.
In this step, only the verification of the information chain to be verified 801b is described as an example; the other information chains to be verified can be checked through the same process, which is not repeated. Referring to fig. 8 again, the central node 80a obtains the public key of the user node 401c (equivalent to the public key associated with the information chain to be verified F1), decrypts the second digital signature J1 according to the public key of the user node 401c to obtain the second digital digest 801c, and obtains the data to be verified in the information chain to be verified 801b, where the data to be verified may include the legal iteration number and the sub-model parameter D1 of the information chain to be verified 801b; it can be understood that the data to be verified may be set according to the actual application scenario.
The central node 80a obtains the third digital digest 802c of the data to be verified and compares the second digital digest 801c with the third digital digest 802c. If the second digital digest 801c is identical to the third digital digest 802c, the verification result of the information chain to be verified 801b meets the quality evaluation condition, and as shown in fig. 8, the central node 80a may determine the information chain to be verified 801b as a target information chain; if the second digital digest 801c is not identical to the third digital digest 802c, the verification result of the information chain to be verified 801b does not meet the quality evaluation condition, and at this time the central node 80a may delete the information chain to be verified 801b.
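A minimal sketch of the two-step check, again using the third-party ecdsa package; verifying the second digital signature against the locally recomputed digest is equivalent to comparing the second and third digital digests (field names are assumptions):

```python
import hashlib
from ecdsa import BadSignatureError, SECP256k1, SigningKey

LEGAL_ITERATIONS = 200

def verify_chain(chain, public_key):
    # Step 1: the number of iterations to be verified must equal the legal number
    if chain["iterations"] != LEGAL_ITERATIONS:
        return False                                    # e.g. chain 805b (199)
    # Step 2: recompute the third digital digest from the data to be verified
    payload = f'{chain["iterations"]}|{chain["sub_model_params"]}'.encode()
    third_digest = hashlib.sha256(payload).digest()
    try:                                                # second vs third digest
        return public_key.verify(chain["second_signature"], third_digest)
    except BadSignatureError:
        return False

# User node 401c builds and signs its information chain 801b
sk = SigningKey.generate(curve=SECP256k1)
chain_801b = {"iterations": 200, "sub_model_params": [0.2, 0.4]}
digest = hashlib.sha256(
    f'{chain_801b["iterations"]}|{chain_801b["sub_model_params"]}'.encode()).digest()
chain_801b["second_signature"] = sk.sign(digest)

assert verify_chain(chain_801b, sk.get_verifying_key())  # target information chain
```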
Step S203, N target information chains are obtained, N is a positive integer, each of the N target information chains comprises sub-model parameters, and the N sub-model parameters are provided by N user nodes respectively.
Step S204, performing model decision quality evaluation on the N sub-model parameters respectively to obtain N quality evaluation results, and comparing the N quality evaluation results with a quality evaluation result threshold respectively to obtain N comparison results, where the N comparison results include a comparison result Gj, j is a positive integer and j is smaller than or equal to N, the comparison result Gj includes a first comparison result or a second comparison result, the first comparison result is used for representing that the quality evaluation result corresponding to the comparison result Gj is smaller than the quality evaluation result threshold, and the second comparison result is used for representing that the quality evaluation result corresponding to the comparison result Gj is equal to or greater than the quality evaluation result threshold.
In step S205, if the comparison result Gj is the second comparison result, it is determined that the quality evaluation result corresponding to the comparison result Gj meets the model consensus condition, and if the comparison result Gj is the first comparison result, it is determined that the quality evaluation result corresponding to the comparison result Gj does not meet the model consensus condition, and the sub-model parameters for which the quality evaluation result does not meet the model consensus condition are deleted.
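A minimal sketch of the comparison in steps S204-S205; the threshold value 80 follows the example below and is an assumption:

```python
QUALITY_THRESHOLD = 80  # quality evaluation result threshold (assumed value)

def screen_sub_model_params(scored_params, threshold=QUALITY_THRESHOLD):
    # Second comparison result (score >= threshold): keep as a target
    # sub-model parameter; first comparison result (score < threshold): delete.
    return [name for name, score in scored_params if score >= threshold]

targets = screen_sub_model_params([("D1", 85), ("D2", 40), ("D3", 90)])
# -> ["D1", "D3"]; D2 does not meet the model consensus condition
```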
In particular, although the introduction of federated learning makes the training of the task decision model more efficient, federated learning carries risks during parameter transmission. In addition, compared with traditional distributed training, federated learning may have 100,000 user nodes performing parameter training and aggregation simultaneously, so two problems are inevitably encountered in an edge environment: 1) how to prevent the updated parameters (i.e. the central model parameters) from being tampered with during downloading, and 2) how to identify garbage parameters maliciously uploaded by untrusted nodes.
Therefore, the embodiment of the application introduces a safe and efficient data structure, namely the blockchain. As a new data structure, the data or information stored in the blockchain has characteristics such as being "unforgeable", "leaving traces throughout the whole course", "traceable", "open and transparent" and "collectively maintained". Based on these characteristics, blockchain technology lays a solid foundation of "trust", creates a reliable "cooperation" mechanism, and is highly suitable for federated learning in harsh environments.
For the specific process of step S203 and of performing model decision quality evaluation on the N sub-model parameters, refer to the description of step S101 and step S102 in the embodiment corresponding to fig. 3, which is not repeated here.
Assuming N = 3, please refer to fig. 9 together. The center node 90b obtains 3 quality evaluation results, namely the quality evaluation result 901a corresponding to the sub-model parameter D1, the quality evaluation result 902a corresponding to the sub-model parameter D2, and the quality evaluation result 903a corresponding to the sub-model parameter D3. It can be understood that a quality evaluation result may be a score, i.e. the level at which the sub-model parameters decide task offloading; colloquially, if the quality evaluation result 903a is 90 points, the center node 90b may determine that the decision level of the sub-model parameter D3 is very good, and if the quality evaluation result 902a is 40 points, the center node 90b may determine that the decision level of the sub-model parameter D2 is very poor.
The center node 90b compares the 3 quality evaluation results with the quality evaluation result threshold 90a respectively to obtain 3 comparison results; in this step, only the comparison result G1 corresponding to the sub-model parameter D1 is described as an example, and the other comparison results can be obtained in the same way. The comparison result G1 includes a first comparison result or a second comparison result, where the first comparison result is used for representing that the quality evaluation result 901a corresponding to the comparison result G1 is smaller than the quality evaluation result threshold 90a, and the second comparison result is used for representing that the quality evaluation result 901a corresponding to the comparison result G1 is equal to or greater than the quality evaluation result threshold 90a. Assuming that the quality evaluation result threshold 90a is 80, if the quality evaluation result 901a is smaller than 80, the comparison result G1 is determined to be a first comparison result, and the center node 90b may determine that the quality evaluation result 901a does not satisfy the model consensus condition; if the quality evaluation result 901a is equal to or greater than 80, the comparison result G1 is determined to be a second comparison result, and it may be determined that the quality evaluation result 901a satisfies the model consensus condition.
Step S206, determining the sub-model parameters whose quality evaluation results meet the model consensus condition as target sub-model parameters, where the total number of target sub-model parameters is smaller than or equal to N.
Specifically, in combination with fig. 3 and steps S201 to S205, it can be seen that the central node rejects sub-model parameters that fail signature verification, have a low unit benefit value (i.e. a poor quality evaluation result), or have an inconsistent iteration count, so as to ensure that the collected sub-model parameters are all real and effective, prevent untrusted users from maliciously uploading garbage parameters, and guarantee successful execution of the aggregation algorithm on the target sub-model parameters in step S207.
Step S207, aggregation processing is carried out on the target sub-model parameters to obtain center model parameters, and model decision quality assessment is carried out on the center model parameters to obtain a target quality assessment result.
Specifically, for the specific process of step S207, refer to the description of step S103 in the embodiment corresponding to fig. 3, which is not described herein. The specific process of performing the model decision quality evaluation on the center model parameter is similar to the specific process of performing the model decision quality evaluation on the N sub-model parameters in step S102 in the embodiment corresponding to fig. 3, so that the detailed description is omitted herein, please refer to the description of step S102 in the embodiment corresponding to fig. 3.
Step S208, generating a target block according to the central model parameters and the target quality evaluation result, and, when the target block passes blockchain consensus, broadcasting the target block to the N user nodes respectively, so that each user node performs validity verification on the target block according to the target quality evaluation result, and acquires a task decision model containing the central model parameters when the validity verification result indicates that the target block is a valid block, where the task decision model is used for deciding the node that executes a task.
Specifically, for a specific process of generating the target block, refer to the description of step S104 in the embodiment corresponding to fig. 3, which is not described herein.
In the embodiment corresponding to fig. 3, the leading central node is the central node that has first calculated the quality evaluation results corresponding to the sub-model parameters it acquired; this step describes other possible embodiments for determining the leading central node.
In a possible implementation manner, after generating the target block, the central node broadcasts the target block in the blockchain consensus network, so that the central nodes in the blockchain consensus network perform consensus on the target block according to the central model parameters and the target quality evaluation result. Please refer to fig. 10, which is a schematic view of a data processing scenario provided in an embodiment of the present application. As shown in fig. 10, the blockchain consensus network 90d may include a central node 903h and a central node 902h; for the process by which the central node 903h and the central node 902h perform blockchain consensus on the target block 90c, refer to the description below of how the central node 90b performs blockchain consensus on the block to be consensus, which is not repeated here.
If the target block 90c passes blockchain consensus and the central node 90b does not acquire any block to be consensus sent by other central nodes, the central node 90b may serve as the leading central node and the target block 90c may serve as the uplink block for uplink processing, and the central node 902h and the central node 903h may respectively delete their local information chains and perform accounting processing on the uplink block.
In a possible implementation manner, when the target block does not pass blockchain consensus and a block to be consensus broadcast by another central node is obtained, blockchain consensus is performed on the block to be consensus, where the block to be consensus is generated by that central node according to H sub-model parameters different from the N sub-model parameters, the H sub-model parameters are respectively provided by H user nodes, the H user nodes communicate with that central node and are different from the N user nodes, and H is a positive integer. If the target block does not pass blockchain consensus while the block to be consensus passes blockchain consensus, the block to be consensus is determined to be the uplink block, accounting processing is performed on the uplink block, the target block is deleted, and the uplink block is broadcast to the nodes in the blockchain network respectively.
The specific process of performing blockchain consensus on the block to be consensus may include: obtaining the center model parameters to be consensus and the first quality evaluation result in the block to be consensus; performing model decision quality evaluation on the center model parameters to be consensus to obtain a second quality evaluation result; determining the difference value between the first quality evaluation result and the second quality evaluation result, and comparing the difference value with a difference value threshold; if the difference value is smaller than or equal to the difference value threshold, determining that the consensus result of the block to be consensus is a consensus pass; and if the difference value is greater than the difference value threshold, determining that the consensus result of the block to be consensus is a consensus failure.
Referring to fig. 10 again, when the target block 90c fails to pass blockchain consensus and the block to be consensus 90e broadcast by the central node 903h is obtained, the central node 90b needs to perform blockchain consensus on the block to be consensus 90e. Specifically, the central node 90b obtains the center model parameters to be consensus 901f and the first quality evaluation result 902f in the block to be consensus 90e, where the target sub-model parameters corresponding to the center model parameters to be consensus 901f are not equal to the target sub-model parameters corresponding to the central model parameters. Assuming that 10,000 user nodes each independently train a model, 10,000 information chains to be verified may be provided, among which there are 9,500 target information chains, with 5,000 target information chains transmitted to the central node 90b and 4,500 target information chains transmitted to the central node 903h. Since federated learning, unlike conventional distributed learning, is strongly robust to independent and identically distributed samples, the aggregation of the subsequent target sub-model parameters is not affected even if some parameters are not uploaded to the same central node. For example, the 9,500 target information chains exemplified above are transferred to two central nodes, and in fact the central model parameters in the uplink block are generated either according to the 5,000 target information chains acquired by the central node 90b (in which case the central node 903h may delete the 4,500 target information chains acquired by itself) or according to the 4,500 target information chains acquired by the central node 903h (in which case the central node 90b may delete the 5,000 target information chains acquired by itself).
Referring to fig. 10 again, the central node 90b performs model decision quality evaluation on the center model parameters to be consensus 901f to obtain a second quality evaluation result 903f, determines the difference value between the first quality evaluation result 902f and the second quality evaluation result 903f, and compares the difference value with the difference value threshold; if the difference value is smaller than or equal to the difference value threshold, the central node 90b determines that the consensus result of the block to be consensus 90e is a consensus pass, and if the difference value is greater than the difference value threshold, the consensus result of the block to be consensus 90e is a consensus failure. Assuming that the first quality evaluation result 902f is 80 and the second quality evaluation result 903f is 90, the consensus result of the central node 90b for the block to be consensus 90e is obviously a consensus pass, because the first quality evaluation result 902f is worse than the second quality evaluation result 903f. Assuming that the first quality evaluation result 902f is 90 and the second quality evaluation result 903f is 80, the first quality evaluation result 902f is better than the second quality evaluation result 903f and the difference value between the two is 10; if the difference value threshold is greater than or equal to 10, the consensus result of the central node 90b for the block to be consensus 90e is a consensus pass, and if the difference value threshold is less than 10, the consensus result is a consensus failure.
In addition to comparing the first quality assessment result 902f with the second quality assessment result 903f to determine a consensus result for the block to be consensus 90e, the blockchain consensus network 90d may set a quality assessment result threshold (may be equivalent to the quality assessment result threshold described above), e.g., the quality assessment result threshold is 60, and if the second quality assessment result 903f is greater than or equal to 60, the central node 90b may determine that the consensus result for the block to be consensus 90e is a consensus pass, and if the second quality assessment result 903f is less than 60, may determine that the consensus result for the block to be consensus 90e is a consensus fail.
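A minimal sketch covering both consensus rules just described; the threshold values mirror the examples above and are assumptions:

```python
DIFF_THRESHOLD = 10    # difference value threshold (assumed)
SCORE_THRESHOLD = 60   # quality evaluation result threshold (assumed)

def consensus_vote(first_result, second_result, threshold_rule=False):
    """first_result: score carried in the block to be consensus;
    second_result: score recomputed locally from its center model parameters."""
    if threshold_rule:
        # Alternative rule: pass whenever the recomputed score is good enough
        return second_result >= SCORE_THRESHOLD
    # Default rule: fail only when the claimed score exceeds the recomputed
    # score by more than the allowed margin; a worse claim always passes.
    return first_result - second_result <= DIFF_THRESHOLD

assert consensus_vote(80, 90)        # claimed worse than recomputed -> pass
assert consensus_vote(90, 80)        # difference 10 <= threshold    -> pass
assert not consensus_vote(95, 80)    # difference 15 > threshold     -> fail
```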
By performing model decision quality evaluation on the parameters (including the center model parameters to be consensus and the center model parameters) in this way, the application can prevent parameters from being maliciously tampered with by illegal nodes. When someone tries to construct a false parameter and broadcast it through the blockchain, the lower ends (including the edge end and the user end) can verify the parameter after receiving it; if the verification result (such as the second quality evaluation result 903f) falls outside the range specified by the quality evaluation result threshold, the false parameter can be recognized, and if the verification result falls within the range, the constructed false parameter is proved to be still effective and the falsification has no negative influence on the task decision model. Parameter tampering is thereby prevented from the perspective of game theory.
If the target block does not pass blockchain consensus while the block to be consensus passes blockchain consensus, the blockchain consensus network takes the central node that generated the block to be consensus as the leading central node, such as the central node 903h illustrated in fig. 10, and determines the block to be consensus as the uplink block; the central node 90b in fig. 10 performs accounting processing on the uplink block (i.e. the block to be consensus 90e) and deletes the target block 90c, and then the uplink block is broadcast to the nodes in the blockchain network respectively.
In a possible implementation manner, when the target block passes blockchain consensus and no block to be consensus broadcast by another central node has yet passed blockchain consensus, the target block is broadcast to the nodes in the blockchain network respectively, where the blockchain network includes the blockchain consensus network, and the nodes in the blockchain network include the N user nodes and the other central nodes in the blockchain consensus network, such as the central node 902h and the central node 903h in the blockchain consensus network 90d illustrated in fig. 10, excluding the central node that executes the method, i.e. the central node 90b; the other central nodes are used for deleting their blocks to be consensus when the target block passes blockchain consensus and performing accounting processing on the target block.
In a possible implementation manner, when both the target block and the block to be consensus pass blockchain consensus, the first quality evaluation result in the block to be consensus is obtained and compared with the target quality evaluation result; if the first quality evaluation result is greater than the target quality evaluation result, the block to be consensus is determined to be the uplink block, accounting processing is performed on the uplink block, the target block is deleted, and the uplink block is broadcast to the nodes in the blockchain network respectively; if the first quality evaluation result is equal to or smaller than the target quality evaluation result, the target block is broadcast to the nodes in the blockchain network respectively.
Referring to fig. 11, fig. 11 is a schematic diagram of a data processing scenario provided by an embodiment of the present application, assuming that two blocks pass blockchain consensus in the blockchain consensus network. As shown in fig. 11, to avoid the blockchain forking when both the target block 90c and the block to be consensus 90e pass blockchain consensus, the uplink block must be determined between the target block 90c and the block to be consensus 90e. The determining process may be as follows: the central node 90b obtains the quality evaluation results respectively included in the two blocks; in combination with the description in fig. 10, the target block 90c includes the target quality evaluation result 904f and the block to be consensus 90e includes the first quality evaluation result 902f. The central node 90b compares the target quality evaluation result 904f with the first quality evaluation result 902f to obtain a comparison result 90i, as shown in fig. 11. If the comparison result 90i indicates that the target quality evaluation result 904f is equal to or greater than the first quality evaluation result 902f, the blockchain consensus network 90d determines the central node 90b as the leading central node and the target block 90c as the uplink block; if the comparison result 90i indicates that the target quality evaluation result 904f is smaller than the first quality evaluation result 902f, the blockchain consensus network 90d determines the central node 903h as the leading central node and the block to be consensus 90e as the uplink block.
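A minimal sketch of this fork-resolution rule; the dictionary layout is an assumption:

```python
def choose_uplink_block(target_block, block_to_consensus):
    # Both blocks passed blockchain consensus; keep the block whose quality
    # evaluation result is higher, with ties going to the target block.
    if block_to_consensus["quality"] > target_block["quality"]:
        return block_to_consensus  # its producer becomes the leading central node
    return target_block

uplink = choose_uplink_block({"id": "90c", "quality": 85},
                             {"id": "90e", "quality": 82})  # -> target block 90c
```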
How to dynamically determine the offloading mode of each task, comprehensively considering the total delay and the total energy consumption of all user nodes so that both are minimized, is the key to coordinating edge computing and cloud computing. As edge computing technology matures, methods for offloading tasks to the edge are being applied in more and more fields. However, unlike a stable urban environment, in harsh environments such as navigation, edge resources are scarcer and reliability requirements are higher, which places higher demands on edge computing technology. Specifically: 1) edge nodes are plentiful and communication resources abundant in a stable environment, whereas edge nodes and communication resources are scarce in a harsh environment; 2) a stable environment suffers little interference, whereas a harsh environment contains unreliable nodes and may even come under malicious attack; 3) the network structure of a harsh environment is unstable, and node connections may be interrupted suddenly; 4) a stable environment focuses on improving resource utilization efficiency and the effect of task offloading decisions, whereas a harsh environment focuses more on reliability and safety. How to give an edge offloading system a certain resistance to harsh environments is becoming key to the development of edge computing.
Referring to fig. 12 in conjunction with the content of the embodiments corresponding to fig. 3 and fig. 7, fig. 12 is a schematic structural diagram of a task offloading decision frame according to an embodiment of the present application. As shown in fig. 12, the task offloading decision framework may include a client-edge-cloud, and the three may be connected to each other, and the task of the client (i.e., the user node) may be offloaded to a plurality of edge servers or cloud servers.
As shown in fig. 12, after generating a task, the user side transmits the task information to the task offloading decision algorithm (equivalent to the task decision model) at the user side; after the task offloading decision algorithm determines a decision, the task is offloaded according to the decision to obtain an offloading result. Meanwhile, since the task offloading decision algorithm relies on a neural network, federated learning is introduced to improve the training speed of the neural network at the user side: the user side uploads the neural network parameters (i.e. the sub-model parameters) to the edge server at fixed training intervals, the edge server gathers the acquired sub-model parameters to the cloud server, the cloud server performs aggregation processing on the sub-model parameters, the aggregated parameters (i.e. the central model parameters) are then sent down to the user side, and the user side updates the locally generated sub-model parameters according to the central model parameters. To ensure that the central model parameters cannot be tampered with during parameter updating, the application introduces a blockchain-based data structure and, by creating a new consensus algorithm, ensures safety and efficiency during parameter transmission.
By introducing the blockchain into federated learning, security and stability in the central model parameter transfer process are ensured. In addition, the task offloading decision algorithm in the embodiment of the application can ensure that two mutually untrusted nodes can transmit information safely even when disconnected from trusted nodes.
The beneficial effects of the embodiment of the present application are the same as those summarized in the embodiment corresponding to fig. 3 above and are not repeated here.
Further, referring to fig. 13, fig. 13 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application. The data processing apparatus may be a computer program (comprising program code) running in a computer device, for example, the data processing apparatus being an application software, the apparatus being adapted to perform the corresponding steps of the method provided by the embodiments of the application. As shown in fig. 13, the data processing apparatus 1 may include a first acquisition module 11, a first evaluation module 12, a second evaluation module 13, and a generation block module 14.
The first acquisition module 11 is used for acquiring N target information chains, wherein N is a positive integer, and the N target information chains comprise sub-model parameters which are respectively provided by N user nodes;
The first evaluation module 12 is configured to perform model decision quality evaluation on the N sub-model parameters, and determine sub-model parameters with quality evaluation results satisfying model consensus conditions as target sub-model parameters;
the second evaluation module 13 is configured to aggregate the target sub-model parameters to obtain a central model parameter, and perform model decision quality evaluation on the central model parameter to obtain a target quality evaluation result;
The generating block module 14 is configured to generate a target block according to the central model parameters and the target quality evaluation result, broadcast the target block to the N user nodes respectively when the target block passes blockchain consensus, so that each user node performs validity verification on the target block according to the target quality evaluation result, and obtain a task decision model containing the central model parameters when the validity verification result indicates that the target block is a valid block, where the task decision model is used for deciding the node that executes a task.
The specific functional implementation manners of the first obtaining module 11, the first evaluating module 12, the second evaluating module 13, and the generating block module 14 may refer to step S101-step S104 in the corresponding embodiment of fig. 3, and are not described herein.
Referring to fig. 13, the N submodel parameters include submodel parameters Di, i is a positive integer, and i is less than or equal to N;
the first evaluation module 12 may include a first acquisition unit 121, a second acquisition unit 122, a third acquisition unit 123, and an evaluation model unit 124.
A first obtaining unit 121, configured to obtain a simulation task, and obtain a task decision sub-model Mi including a sub-model parameter Di;
The second obtaining unit 122 is configured to input the simulated task to the task decision sub-model Mi, and obtain a task decision node for the simulated task output by the task decision sub-model Mi;
A third obtaining unit 123, configured to obtain task decision loss and obtain task environment information according to the task decision node and the simulation task;
And the evaluation model unit 124 is configured to perform model decision quality evaluation on the sub-model parameter Di according to the task decision loss and the task environment information, so as to obtain a quality evaluation result corresponding to the sub-model parameter Di.
The specific functional implementation manner of the first obtaining unit 121, the second obtaining unit 122, the third obtaining unit 123, and the evaluation model unit 124 may refer to step S102 in the corresponding embodiment of fig. 3, and will not be described herein.
Referring to fig. 13, the task decision loss includes a task decision delay loss and a task decision energy loss, the task environment information includes task information and environment information, the task information is used for representing basic information of a simulation task, and the environment information is used for representing basic information of a task decision node;
The evaluation model unit 124 may include a first processing subunit 1241, a second processing subunit 1242, a third processing subunit 1243, a fourth processing subunit 1244, and a first evaluation subunit 1245.
The first processing subunit 1241 is configured to normalize the task decision delay loss to obtain a unitized task decision delay loss;
The first processing subunit 1241 is further configured to normalize the task decision energy consumption loss to obtain a unitized task decision energy consumption loss;
the second processing subunit 1242 is configured to perform weighted summation processing on the unitized task decision delay loss and the unitized task decision energy loss, so as to obtain the unitized task decision loss;
the third processing subunit 1243 is configured to perform normalization processing on the task information to obtain unitized task information;
The third processing subunit 1243 is further configured to normalize the environmental information to obtain unitized environmental information;
The fourth processing subunit 1244 is configured to sum the unitized task information and the unitized environment information to obtain unitized task environment information;
the first evaluation subunit 1245 is configured to perform model decision quality evaluation on the sub-model parameter Di according to the unitized task decision loss and unitized task environment information.
The specific functional implementation manners of the first processing subunit 1241, the second processing subunit 1242, the third processing subunit 1243, the fourth processing subunit 1244, and the first evaluation subunit 1245 may be referred to the step S102 in the corresponding embodiment of fig. 3, and will not be described herein.
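A minimal sketch of the unitization pipeline these subunits describe; the normalization bounds, the weight alpha, and the final combination of the loss and environment terms are assumptions, since the application does not fix them here:

```python
def unitize(value, lo, hi):
    # Min-max normalization so heterogeneous quantities become comparable
    return (value - lo) / (hi - lo)

def evaluate_quality(delay_loss, energy_loss, task_info, env_info,
                     bounds, alpha=0.5):
    # Unitized task decision loss: weighted sum of the two unitized losses
    u_loss = (alpha * unitize(delay_loss, *bounds["delay"])
              + (1 - alpha) * unitize(energy_loss, *bounds["energy"]))
    # Unitized task environment information: sum of the unitized parts
    u_env = (unitize(task_info, *bounds["task"])
             + unitize(env_info, *bounds["env"]))
    # Placeholder combination: a lower loss in a harder environment scores higher
    return 100.0 * (u_env / 2.0) / (1.0 + u_loss)

bounds = {"delay": (0, 10), "energy": (0, 5), "task": (0, 1), "env": (0, 1)}
score = evaluate_quality(2.0, 1.0, 0.6, 0.7, bounds)  # quality evaluation result
```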
Referring again to fig. 13, the data processing apparatus 1 may further comprise a first determining module 15.
The first evaluation module 12 is further configured to obtain N quality evaluation results, compare the N quality evaluation results with a quality evaluation result threshold to obtain N comparison results, where the N comparison results include comparison results Gj, j is a positive integer and j is less than or equal to N, the comparison result Gj includes a first comparison result or a second comparison result, the first comparison result is used for representing that a quality evaluation result corresponding to the comparison result Gj is less than the quality evaluation result threshold, and the second comparison result is used for representing that a quality evaluation result corresponding to the comparison result Gj is equal to or greater than the quality evaluation result threshold;
The first determining module 15 is configured to determine that the quality evaluation result corresponding to the comparison result Gj meets the model consensus condition if the comparison result Gj is the second comparison result;
The first determining module 15 is further configured to determine that the quality evaluation result corresponding to the comparison result Gj does not satisfy the model consensus condition if the comparison result Gj is the first comparison result, and delete the sub-model parameters that the quality evaluation result does not satisfy the model consensus condition.
The specific functional implementation manner of the first evaluation module 12 and the first determination module 15 may refer to step S204-step S205 in the corresponding embodiment of fig. 7, which is not described herein.
Referring to fig. 13 again, the target sub-model parameters include A target sub-model parameters, where A is a positive integer and A is smaller than or equal to N;
The second evaluation module 13 may include a first summing unit 131, a fourth acquisition unit 132, and a second summing unit 133.
The first summing unit 131 is configured to obtain the training sample sub-numbers respectively corresponding to the A target sub-model parameters, and sum the A training sample sub-numbers to obtain the total number of training samples, where the A target sub-model parameters include a target sub-model parameter Zt, and the A training sample sub-numbers include a training sample sub-number Yt corresponding to the target sub-model parameter Zt;
A fourth obtaining unit 132, configured to obtain an operator result of the target submodel parameter Zt and the training sample sub-number Yt;
And the second summing unit 133 is configured to sum the operator results respectively corresponding to the A target sub-model parameters to obtain a total operation result, and determine the central model parameters according to the total operation result and the total number of training samples.
The specific functional implementation manner of the first summing unit 131, the fourth obtaining unit 132, and the second summing unit 133 may refer to step S103 in the corresponding embodiment of fig. 3, which is not described herein.
Referring again to fig. 13, the generation block module 14 may include a first generating unit 141, a second generating unit 142, and a third generating unit 143.
The first generating unit 141 is configured to obtain the iteration number corresponding to the central model parameter, and generate a target block according to the iteration number, the central model parameter, and the target quality evaluation result;
the second generating unit 142 is configured to obtain a first digital digest of the target block, encrypt the first digital digest according to the private key, and obtain a first digital signature;
The third generating unit 143 is configured to add the first digital signature to the target block, where the first digital signature is used to instruct the N user nodes to perform source-legality signature verification on the target block.
The specific functional implementation manner of the first generating unit 141, the second generating unit 142, and the second generating unit 143 may refer to step S104 in the corresponding embodiment of fig. 3, and will not be described herein.
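A possible shape of the block generation and signing performed by units 141 and 142 is sketched below, assuming SHA-256 for the digital digest and an Ed25519 key pair from the Python cryptography package for the signature; the dict-shaped block layout is hypothetical, and "encrypting the digest with the private key" is realized as signing the digest.

    import hashlib, json
    from cryptography.hazmat.primitives.asymmetric import ed25519

    def generate_target_block(iteration, central_params, quality, private_key):
        block = {
            "iteration": iteration,           # iteration count of the central model
            "central_params": central_params, # aggregated central model parameters
            "quality": quality,               # target quality evaluation result
        }
        payload = json.dumps(block, sort_keys=True).encode()
        first_digest = hashlib.sha256(payload).digest()  # first digital digest
        # signing the digest with the private key yields the first digital signature
        block["signature"] = private_key.sign(first_digest).hex()
        return block

    private_key = ed25519.Ed25519PrivateKey.generate()
    target_block = generate_target_block(3, [0.35, 0.2], 0.91, private_key)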
Referring again to fig. 13, the data processing apparatus 1 may further comprise a second determining module 16.
The first obtaining module 11 is further configured to obtain C information chains to be verified, where C is a positive integer and C is greater than or equal to N, each of the C information chains to be verified includes a submodel parameter, the C submodel parameters are respectively provided by C user nodes, the N user nodes belong to the C user nodes, and the N submodel parameters belong to the C submodel parameters;
and the second determining module 16 is configured to verify the C information chains to be verified respectively, and determine the information chain to be verified whose verification result meets the quality evaluation condition as the target information chain.
The specific functional implementation manner of the first obtaining module 11 and the second determining module 16 may refer to step S201 to step S202 in the corresponding embodiment of fig. 7, which is not described herein.
Referring to fig. 13 again, each of the C information chains to be verified includes an iteration number to be verified;
the second determining module 16 may include a first verification unit 161 and a second verification unit 162.
The first verification unit 161 is configured to verify the C information chains to be verified according to the C iteration numbers to be verified, and add the information chains to be verified whose iteration numbers to be verified are equal to the legal iteration number to the information chain set to be verified;
And a second verification unit 162, configured to verify the information chain to be verified in the information chain set to be verified according to the second digital signature, and determine the information chain to be verified whose verification result meets the quality evaluation condition as the target information chain.
The specific functional implementation manner of the first verification unit 161 and the second verification unit 162 may refer to step S202 in the corresponding embodiment of fig. 7, which is not described herein.
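A minimal sketch of the iteration-number screening in the first verification unit 161 follows, assuming each information chain to be verified is a dict carrying its claimed iteration number; chains whose iteration number differs from the legal one (e.g. stale or replayed submissions) are filtered out.

    def filter_by_iteration(chains, legal_iteration):
        """Keep only the chains whose iteration number equals the legal one."""
        return [chain for chain in chains if chain["iteration"] == legal_iteration]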
Referring to fig. 13 again, the information chain set to be verified includes an information chain Fx to be verified, x is a positive integer, and x is less than or equal to the total number of information chains to be verified in the information chain set to be verified;
the second verification unit 162 may include a first obtaining subunit 1621, a second obtaining subunit 1622, a first determining subunit 1623, and a second determining subunit 1624.
A first obtaining subunit 1621, configured to decrypt the second digital signature Jx according to the public key associated with the information chain Fx to be verified, and obtain a second digital digest;
The second obtaining subunit 1622 is configured to obtain data to be verified in the information chain Fx to be verified, where the data to be verified includes the legal iteration number and the submodel parameter of the information chain Fx to be verified;
The second obtaining subunit 1622 is further configured to obtain a third digital digest of the data to be verified, and compare the second digital digest with the third digital digest;
A first determining subunit 1623, configured to determine that the verification result of the information chain Fx to be verified meets the quality evaluation condition if the second digital digest is identical to the third digital digest;
And the second determining subunit 1624 is configured to determine that the verification result of the information chain to be verified Fx does not meet the quality evaluation condition if the second digital digest is different from the third digital digest, and delete the information chain to be verified whose verification result does not meet the quality evaluation condition.
The specific functional implementation manner of the first obtaining subunit 1621, the second obtaining subunit 1622, the first determining subunit 1623, and the second determining subunit 1624 may refer to step S202 in the corresponding embodiment of fig. 7, and will not be described herein.
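The digest comparison in subunits 1621 to 1624 might look like the sketch below, continuing the assumptions of the earlier signing sketch (SHA-256 digests, Ed25519 keys, dict-shaped chains). Note that Ed25519 verification does not expose the decrypted second digest directly; verify raising InvalidSignature plays the role of the second and third digests differing.

    import hashlib, json
    from cryptography.exceptions import InvalidSignature

    def verify_chain(chain, public_key):
        """Return True if chain Fx meets the quality evaluation condition."""
        data = {"iteration": chain["iteration"], "params": chain["params"]}
        # third digital digest, recomputed from the data to be verified
        payload = json.dumps(data, sort_keys=True).encode()
        third_digest = hashlib.sha256(payload).digest()
        try:
            public_key.verify(bytes.fromhex(chain["signature"]), third_digest)
            return True   # digests match: chain becomes a target information chain
        except InvalidSignature:
            return False  # digests differ: the chain is deleted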
Referring again to fig. 13, the generation block module 14 may include a first broadcasting unit 143 and a second broadcasting unit 144.
The first broadcasting unit 143 is configured to broadcast the target block in the blockchain consensus network, so that a central node in the blockchain consensus network performs consensus on the target block according to the central model parameter and the target quality evaluation result;
The second broadcasting unit 144 is configured to broadcast the target block to the nodes in the blockchain network when the target block passes the blockchain consensus and the block to be consensus broadcast by the central node has not yet passed the blockchain consensus, where the blockchain network includes the blockchain consensus network, the nodes in the blockchain network include the N user nodes and the central node, and the central node is configured to delete the block to be consensus and perform accounting processing on the target block when the target block passes the blockchain consensus.
The specific functional implementation manner of the first broadcasting unit 143 and the second broadcasting unit 144 may refer to step S208 in the corresponding embodiment of fig. 7, which is not described herein.
Referring again to fig. 13, the generation block module 14 may further include a fifth obtaining unit 145.
The fifth obtaining unit 145 is configured to obtain the first quality evaluation result in the block to be consensus when the target block passes the blockchain consensus and the block to be consensus broadcast by the central node also passes the blockchain consensus;
the fifth obtaining unit 145 is further configured to compare the first quality evaluation result with the target quality evaluation result;
the second broadcasting unit 144 is further configured to determine the block to be consensus as the uplink block if the first quality evaluation result is greater than the target quality evaluation result, perform accounting processing on the uplink block, delete the target block, and broadcast the uplink block to the nodes in the blockchain network respectively;
the second broadcasting unit 144 is further configured to broadcast the target block to the nodes in the blockchain network if the first quality evaluation result is equal to or less than the target quality evaluation result.
The specific functional implementation manner of the second broadcasting unit 144 and the fifth obtaining unit 145 may refer to step S208 in the corresponding embodiment of fig. 7, which is not described herein.
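When both competing blocks pass the blockchain consensus, the fork is resolved purely by quality, as the sketch below illustrates using the hypothetical block dicts from the earlier sketches: whichever block reports the higher quality evaluation result becomes the uplink block, with ties favouring the already generated target block.

    def resolve_competing_blocks(target_block, pending_block):
        """Pick the uplink block when both pass the blockchain consensus."""
        if pending_block["quality"] > target_block["quality"]:
            return pending_block  # pending block wins; the target block is deleted
        return target_block       # equal or lower quality: the target block is kept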
Referring again to fig. 13, the generation block module 14 may further include a sixth obtaining unit 146.
The sixth obtaining unit 146 is configured to perform blockchain consensus on the block to be consensus when the target block does not pass the blockchain consensus and the block to be consensus broadcast by the central node is obtained, where the block to be consensus is generated by the central node according to H submodel parameters, and the H submodel parameters are different from the N submodel parameters;
The second broadcasting unit 144 is further configured to, when the block to be consensus passes the blockchain consensus and the target block does not pass the blockchain consensus, determine the block to be consensus as the uplink block, perform accounting processing on the uplink block, delete the target block, and broadcast the uplink block to the nodes in the blockchain network respectively.
The specific functional implementation manner of the second broadcasting unit 144 and the sixth obtaining unit 146 may refer to step S208 in the corresponding embodiment of fig. 7, which is not described herein.
Referring again to fig. 13, the sixth obtaining unit 146 may include a third obtaining subunit 1461, a second evaluation subunit 1462, and a third determining subunit 1463.
The third obtaining subunit 1461 is configured to obtain the central model parameter to be consensus and the first quality evaluation result in the block to be consensus;
The second evaluation subunit 1462 is configured to perform model decision quality evaluation on the central model parameter to be consensus, so as to obtain a second quality evaluation result;
a third determining subunit 1463, configured to determine a difference between the first quality evaluation result and the second quality evaluation result, and compare the difference with a difference threshold;
The third determining subunit 1463 is further configured to, if the difference value is less than or equal to the difference threshold, determine that the consensus result for the block to be consensus passes;
The third determining subunit 1463 is further configured to, if the difference is greater than the difference threshold, determine that the consensus result for the block to be consensus fails.
The specific functional implementation manner of the third obtaining subunit 1461, the second evaluation subunit 1462, and the third determining subunit 1463 may refer to step S208 in the corresponding embodiment of fig. 7, which is not described herein.
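The consensus check in subunits 1461 to 1463 amounts to re-running the model decision quality evaluation and bounding the disagreement with the claimed result, as in the sketch below; the threshold value and the use of an absolute difference are assumptions.

    DIFFERENCE_THRESHOLD = 0.05  # hypothetical difference threshold

    def consensus_on_pending_block(first_quality, second_quality):
        """Pass only if the claimed quality is close to the re-evaluated one."""
        return abs(first_quality - second_quality) <= DIFFERENCE_THRESHOLD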
In the embodiment of the application, the sub-model parameters respectively provided by the N user nodes are regarded as N transactions, and the consensus process is designed to perform model decision quality evaluation on the N sub-model parameters respectively according to the task offloading decision scene, that is, to evaluate the task offloading decision levels respectively corresponding to the N sub-model parameters. Further, the sub-model parameters whose quality evaluation results meet the model consensus condition are determined as the target sub-model parameters, which is equivalent to retaining the sub-model parameters with high task offloading decision levels and discarding the sub-model parameters with low task offloading decision levels. Further, the target sub-model parameters respectively provided by different user nodes are aggregated to obtain the central model parameter; the aggregated central model parameter not only includes the characteristics of each target sub-model parameter but also has a high task offloading decision level. Further, model decision quality evaluation is performed on the central model parameter to obtain the target quality evaluation result, and the target block is generated according to the central model parameter and the target quality evaluation result. When the target block passes the blockchain consensus, the target block is broadcast to the N user nodes respectively, so that each user node can perform validity verification on the target block, and when the verification result indicates that the target block is a legal block, the user node acquires the task decision model including the central model parameter, where the task decision model has a high task offloading decision level. In summary, since the sub-model parameters are provided by different user nodes (that is, the sub-model parameters are trained by different user nodes in parallel), the generation speed of the task decision model can be increased and the time cost of data processing can be saved; since the training samples in each user node do not need to be obtained directly, the problem that training samples are difficult to obtain can be avoided; since the plurality of sub-model parameters are derived from the training samples of the respective user nodes, the decision precision of the task decision model can be ensured, and screening the sub-model parameters to obtain the target sub-model parameters further improves this decision precision; in addition, the target block can prevent the central model parameter from being stolen or tampered with, thereby improving the security of data processing.
Further, referring to fig. 14, fig. 14 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 14, the computer device 1000 may be the full-scale node in the corresponding embodiment of fig. 3, and the computer device 1000 may include at least one processor 1001, such as a CPU, at least one network interface 1004, a user interface 1003, a memory 1005, and at least one communication bus 1002. The communication bus 1002 is used to enable communication between these components. The user interface 1003 may include a display (Display) and a keyboard (Keyboard), and the network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (non-volatile memory), such as at least one disk memory. The memory 1005 may optionally also be at least one storage device located remotely from the aforementioned processor 1001. As shown in fig. 14, the memory 1005, which is a computer storage medium, may include an operating system, a network communication module, a user interface module, and a device control application.
In the computer device 1000 shown in fig. 14, the network interface 1004 may provide network communication functions, the user interface 1003 is mainly used to provide an input interface for the user, and the processor 1001 may be used to invoke the device control application stored in the memory 1005 to realize:
obtaining N target information chains, where N is a positive integer, each of the N target information chains includes a sub-model parameter, and the N sub-model parameters are respectively provided by N user nodes;
respectively performing model decision quality evaluation on the N sub-model parameters, and determining the sub-model parameters whose quality evaluation results meet the model consensus condition as target sub-model parameters, where the total number of target sub-model parameters is less than or equal to N;
performing aggregation processing on the target sub-model parameters to obtain a central model parameter, and performing model decision quality evaluation on the central model parameter to obtain a target quality evaluation result;
generating a target block according to the central model parameter and the target quality evaluation result, and broadcasting the target block to the N user nodes respectively when the target block passes the blockchain consensus, so that the N user nodes perform validity verification on the target block; and when the result of the validity verification indicates that the target block is a legal block, acquiring a task decision model including the central model parameter, where the task decision model is used for deciding a node for executing a task.
It should be understood that the computer device 1000 described in the embodiment of the present application may perform the data processing method described in the embodiments corresponding to fig. 3, fig. 4b and fig. 7, and may also perform the functions of the data processing apparatus 1 described in the embodiment corresponding to fig. 13, which are not repeated herein. In addition, the description of the beneficial effects of the same method is omitted.
The embodiment of the present application further provides a computer readable storage medium storing a computer program, where the computer program includes program instructions, and when the program instructions are executed by a processor, the data processing method provided by each step in fig. 3, fig. 4b, and fig. 7 is implemented; reference may be made to the implementation manner provided by each step in fig. 3, fig. 4b, and fig. 7, which is not repeated herein. In addition, the description of the beneficial effects of the same method is omitted.
The computer readable storage medium may be the data processing apparatus provided in any one of the foregoing embodiments or an internal storage unit of the computer device, for example, a hard disk or a memory of the computer device. The computer readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the computer device. Further, the computer readable storage medium may also include both an internal storage unit and an external storage device of the computer device. The computer readable storage medium is used to store the computer program and other programs and data required by the computer device, and may also be used to temporarily store data that has been output or is to be output.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions, so that the computer device can execute the data processing method in the embodiments corresponding to fig. 3, fig. 4b, and fig. 7, which are not described herein. In addition, the description of the beneficial effects of the same method is omitted.
The terms "first", "second" and the like in the description, claims, and drawings of the embodiments of the application are used for distinguishing between different objects, not for describing a particular sequential order. Furthermore, the term "include" and any variations thereof are intended to cover a non-exclusive inclusion. For example, a process, method, apparatus, product, or device that comprises a list of steps or elements is not limited to the listed steps or elements, but may optionally further include other steps or elements not listed or inherent to such process, method, apparatus, product, or device.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented in electronic hardware, in computer software, or in a combination of the two. To clearly illustrate the interchangeability of hardware and software, the elements and steps of the examples have been described above generally in terms of their functions. Whether such functions are implemented as hardware or software depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The method and related apparatus provided in the embodiments of the present application are described with reference to the flowcharts and/or structural schematic diagrams of the method provided in the embodiments of the present application. Each flow and/or block of the flowcharts and/or structural schematic diagrams, and combinations of flows and/or blocks therein, may be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the structural schematic diagrams. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the structural schematic diagrams. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the structural schematic diagrams.
The foregoing disclosure is illustrative of the present application and is not to be construed as limiting the scope of the application, which is defined by the appended claims.