CN114037088A - A secure cross-domain model training method and system based on multi-party participation - Google Patents

A secure cross-domain model training method and system based on multi-party participation

Info

Publication number
CN114037088A
Authority
CN
China
Prior art keywords
data
model
participating nodes
participating
control node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111300581.7A
Other languages
Chinese (zh)
Inventor
顾见军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Digital Technology Co ltd
Original Assignee
Chengdu Digital Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Digital Technology Co ltd
Priority to CN202111300581.7A
Publication of CN114037088A
Legal status: Pending (current)

Abstract

The invention relates to a secure cross-domain model training method and system based on multi-party participation, comprising the following steps: first, performing data preprocessing on the original data of the participating nodes; then performing homomorphic encryption on the preprocessed data; transmitting the ciphertext data of the participating nodes to the master control node; the master control node performing joint model computation using the ciphertext data and its own plaintext data; performing model optimization on the computation result; and finally sending the model optimization parameters to all participating nodes. The method makes full use of the properties of multi-party computation and the homomorphism of homomorphic encryption: the original data of the participating nodes are encrypted into ciphertext through multi-party computation, joint model computation is performed together with the plaintext data of the master control node, and the model optimization parameters are broadcast to all participating nodes, so that the joint model of the master control node is continuously optimized and iterated. Throughout the joint model computation, the participating nodes contribute only ciphertext data, which ensures data security and improves model accuracy.

Description

Secure cross-domain model training method and system based on multi-party participation
Technical Field
The invention relates to the technical field of machine learning, and in particular to a secure cross-domain model training method and system based on multi-party participation.
Background
Multi-party computation is a method for multiple parties to carry out a collaborative computation securely in the absence of a trusted third party. It allows multiple data owners to compute jointly on the basis of their private data and extract the value of the data without revealing any owner's original data. With the rapid development of emerging technologies such as cloud computing and artificial intelligence, and with the strengthening of data privacy and security protection, multi-party computation plays an increasingly important role in many fields.
Homomorphic encryption provides the ability to process encrypted data. Besides the basic encryption operation, it supports various computations over ciphertexts, so that computing before decryption is equivalent to computing after decryption. In other words, a third party can process the encrypted data without learning anything about the original content, while the holder of the key can decrypt the processed data to obtain the processed result.
At present, common multi-node joint model training methods fall into two categories: collaborative machine learning and federated learning.
In collaborative machine learning, data are trained on different nodes to construct a joint model. The main process is as follows: a participating user first downloads the current prediction model, then trains and improves the model with local training data, uploads the improved model parameters to the master control node via secure encrypted transmission, and the master control node automatically merges them into the latest model. This collaborative mode of machine learning avoids training on a large centralized data set, for which intensive iteration requires a low-latency, high-throughput environment. The collaborative environment, however, is very different: data are distributed over thousands of mobile terminals of different specifications, and these terminals have high network latency, low network throughput, and only intermittent online time, so continuous availability cannot be guaranteed.
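As a concrete illustration of the merge step described above, the sketch below shows a minimal weighted parameter average of the kind a master control node might apply to parameters uploaded by participating users. The function name and the weighting-by-sample-count rule are assumptions for illustration (a FedAvg-style choice); the patent background does not fix a particular merge rule.

```python
import numpy as np

def merge_uploaded_parameters(uploads):
    """Merge locally improved model parameters into the latest joint model.

    `uploads` is a list of (parameter_vector, num_local_samples) pairs,
    one per participating user. The merge is a sample-count-weighted
    average (an assumed FedAvg-style rule, for illustration only).
    """
    total = sum(n for _, n in uploads)
    merged = np.zeros_like(uploads[0][0], dtype=float)
    for params, n in uploads:
        merged += (n / total) * np.asarray(params, dtype=float)
    return merged

# Example: three participants upload improved parameters of a 4-weight model.
uploads = [
    (np.array([0.10, 0.20, 0.30, 0.40]), 100),
    (np.array([0.12, 0.18, 0.33, 0.41]), 300),
    (np.array([0.09, 0.22, 0.28, 0.39]), 600),
]
print(merge_uploaded_parameters(uploads))
```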
The other category is federated learning. In existing federated learning techniques, feature processing is performed separately on each client, and no raw data is exchanged between clients, so the feature processing of the federated model cannot see the overall picture of the data or exploit the complete characteristics of the data. For model evaluation, each participant trains a model on its local training data and evaluates the model's generalization ability on test data; under this approach, different ways of splitting the data set yield different models, i.e. the model performance is sensitive to how the data set is split. For parameter tuning, the prior art fixes one hyper-parameter combination, trains a federated model, then manually switches to another hyper-parameter combination and trains again, and finally compares the model effects obtained with the different parameters to pick the optimal combination. In other words, federated learning requires many rounds of manual operation, so model optimization is difficult and inefficient.
Whether collaborative machine learning or federated learning is used, the data are computed on each node separately; neither technique centralizes the data, so the overall picture of the data cannot be obtained. Second, both techniques train models on plaintext data without an effective security mechanism, and thus cannot avoid the serious data-leakage problem that plaintext training causes. Third, because neither technique trains on the full data set but only on local data, both also suffer from poor model accuracy.
Disclosure of Invention
In order to solve the above technical problems, the present application provides a secure cross-domain model training method and system based on multi-party participation.
The application is realized by the following technical scheme:
a secure cross-domain model training method based on multi-party participation comprises the following steps: first, performing data preprocessing on the original data of the participating nodes; then performing homomorphic encryption on the preprocessed data; transmitting the ciphertext data over the network through the cooperative communication module; the cooperative communication module on the master control node side receiving the communication data of the participating nodes and processing the data; performing joint model computation on the ciphertext data together with the master control node's data, and performing model optimization on the computation result; and sending the model optimization parameter information to a message server, which sends the model optimization parameters to all participating nodes.
The method and system make full use of the properties of multi-party computation and the homomorphism of homomorphic encryption: the original data of the participating nodes are encrypted into ciphertext data through multi-party computation, joint model computation is performed together with the plaintext data of the master control node, and the model optimization parameters are broadcast to all participating nodes through the message server, so that the joint model of the master control node is continuously optimized and iterated. Throughout the joint model computation, the participating nodes contribute only ciphertext data, which ensures data security.
Compared with the prior art, the method has the following beneficial effects:
in the whole model joint training process, the homomorphic encryption characteristic is fully utilized, homomorphic encryption of the original data of the participating nodes is firstly realized, then the encrypted data of the participating nodes and the plaintext data of the main control node are subjected to joint model training, and the data of the participating nodes are ciphertext data subjected to homomorphic encryption in the whole joint model training process, so that the data safety of the participating nodes is ensured;
according to the method, the original data of the participating nodes are encrypted into ciphertext data through multi-party calculation by utilizing the characteristics of multi-party calculation and the homomorphism of homomorphic encryption, the joint model calculation is carried out on the ciphertext data and the plaintext data of the main control node, and model optimization parameters are broadcasted to all the participating nodes through the message server, so that continuous optimization and iteration of the joint model of the main control node are realized, and the precision of the model is improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention.
FIG. 1 is a flow chart of the secure cross-domain model training method based on multi-party participation in an embodiment of the present invention;
FIG. 2 is a flow chart of data preprocessing according to an embodiment of the present invention;
FIG. 3 is a flow chart of homomorphic encryption according to an embodiment of the present invention;
FIG. 4 is a flow chart of cooperative communication in an embodiment of the present invention;
FIG. 5 is a flow chart of data processing of a master node in an embodiment of the present invention;
FIG. 6 is a flowchart of joint modeling computation of a master node in an embodiment of the present invention;
FIG. 7 is a flow chart of model optimization of a master node in an embodiment of the present invention;
fig. 8 is a message server communication flow diagram of a master node in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments. It is to be understood that the described embodiments are only a few embodiments of the present invention, and not all embodiments.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
In addition, the embodiments of the present invention and the features of the embodiments may be combined with each other without conflict. For the device-like embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
As shown in fig. 1 to fig. 8, the method for training a secure cross-domain model based on multi-party participation disclosed in this embodiment includes the following steps:
step S101, each participating node preprocesses original data according to the requirement of master control node combined model training; the method specifically comprises the following steps:
step S10101, inputting the original data into a data analyzer to analyze the original data of the participating nodes, and analyzing and classifying the data according to three categories of structured data, semi-structured data and unstructured data;
and step S10102, inputting the data after data analysis into a data converter, and performing data conversion on the analyzed data according to the requirements of the joint model calculation.
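A minimal sketch of steps S10101 and S10102 is given below. The classification rules and the conversion to a flat numeric record are illustrative assumptions; the embodiment only specifies that data are classified as structured, semi-structured, or unstructured and then converted to the format required by the joint model computation.

```python
import json

def parse_and_classify(raw):
    """Step S10101 (sketch): classify a raw record as structured,
    semi-structured, or unstructured data."""
    if isinstance(raw, (dict, list, tuple)):
        return "structured", raw
    if isinstance(raw, str):
        try:
            return "semi-structured", json.loads(raw)   # e.g. JSON text
        except (ValueError, TypeError):
            return "unstructured", raw                  # free text, logs, ...
    return "structured", raw

def convert_for_joint_model(kind, data):
    """Step S10102 (sketch): convert parsed data into the numeric vector
    assumed here to be required by the joint model computation."""
    if kind in ("structured", "semi-structured") and isinstance(data, dict):
        return [float(v) for v in data.values() if isinstance(v, (int, float))]
    if kind == "unstructured":
        return [float(len(str(data)))]                  # trivial placeholder feature
    return [float(x) for x in data]

for raw in [{"age": 42, "income": 5300.0}, '{"clicks": 7}', "free-form note"]:
    kind, parsed = parse_and_classify(raw)
    print(kind, convert_for_joint_model(kind, parsed))
```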
Step S102, the preprocessed data are homomorphically encrypted by the homomorphic encryption module to form ciphertext data. This step uses the homomorphism of homomorphic encryption. Assume that the encryption function and the decryption function of an encryption system are $E: \mathcal{M} \rightarrow \mathcal{C}$ and $D: \mathcal{C} \rightarrow \mathcal{M}$, where $\mathcal{M}$ and $\mathcal{C}$ are the plaintext space and the ciphertext space, respectively. Let $\odot_{\mathcal{M}}$ and $\odot_{\mathcal{C}}$ be algebraic or arithmetic operations defined on the plaintext space and the ciphertext space, respectively. The homomorphism of the encryption scheme is defined as follows: given any two plaintexts $m_1, m_2 \in \mathcal{M}$, if the encryption function and the decryption function satisfy the algebraic relationship
$$E(m_1 \odot_{\mathcal{M}} m_2) = E(m_1) \odot_{\mathcal{C}} E(m_2)$$
or
$$m_1 \odot_{\mathcal{M}} m_2 = D\bigl(E(m_1) \odot_{\mathcal{C}} E(m_2)\bigr),$$
the encryption system is said to be homomorphic.
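To make the definition concrete, the sketch below implements a toy Paillier cryptosystem with deliberately tiny, insecure keys and checks the second form of the relation with plaintext addition and ciphertext multiplication, i.e. m1 + m2 = D(E(m1) * E(m2) mod n^2). Paillier is chosen here as an assumed example of an additively homomorphic scheme; the embodiment does not name a specific scheme.

```python
import math
import random

# Toy Paillier parameters (illustration only; real keys use >= 2048-bit n).
# Requires Python 3.9+ for math.lcm and pow(x, -1, n).
p, q = 293, 433
n = p * q
n_sq = n * n
g = n + 1                            # standard simplified generator choice
lam = math.lcm(p - 1, q - 1)         # lambda = lcm(p-1, q-1)
mu = pow(lam, -1, n)                 # mu = lambda^{-1} mod n (valid for g = n+1)

def encrypt(m):
    """E(m) = g^m * r^n mod n^2, with random r coprime to n."""
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c):
    """D(c) = L(c^lambda mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    x = pow(c, lam, n_sq)
    return ((x - 1) // n * mu) % n

m1, m2 = 1234, 5678
c1, c2 = encrypt(m1), encrypt(m2)

# Additive homomorphism: the ciphertext operation is modular multiplication.
c_sum = (c1 * c2) % n_sq
assert decrypt(c_sum) == (m1 + m2) % n
print("D(E(m1)*E(m2)) =", decrypt(c_sum), "= m1 + m2 =", m1 + m2)
```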
Step S103, the cooperative communication module of each participating node prepares for the multi-party computation: the ciphertext data is first loaded into the cooperative communication module, and then the computation information and addresses are loaded.
And step S104, carrying out network transmission on the ciphertext data.
And step S105, the cooperative communication module of the main control node receives the ciphertext data of the participating node.
Step S106, the master control node performs data processing on the received ciphertext data of the participating nodes before the joint modeling computation. The specific steps are as follows: after receiving the ciphertext data of the participating nodes, the data processing module first parses the ciphertext data according to the protocol format, and then performs the corresponding data conversion according to the requirements of the joint model computation.
Step S107, performing joint model training using the ciphertext data of the participating nodes and the plaintext data of the master control node, and obtaining a training result.
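A minimal sketch of what the master control node can compute in step S107 is shown below, using the open-source python-paillier library (`phe`) as an assumed additively homomorphic backend; the embodiment does not prescribe a particular scheme or model. Because Paillier supports ciphertext-plus-ciphertext and ciphertext-times-plaintext, the master control node can evaluate a linear combination over a participant's encrypted features together with its own plaintext data without ever decrypting the ciphertexts.

```python
# Assumes the third-party `phe` package (python-paillier): pip install phe
from phe import paillier

# Key pair held on the participating-node side; in this sketch the master
# control node only ever sees the public key and ciphertexts.
public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

# A participating node encrypts its local feature vector before transmission.
participant_features = [3.5, 1.2, 0.7]
encrypted_features = [public_key.encrypt(x) for x in participant_features]

# Master control node: plaintext model weights and its own plaintext term.
weights = [0.4, -0.2, 0.9]           # plaintext parameters of the joint model
master_plain_term = 0.05             # contribution from the master's own data

# Encrypted linear combination: E(master term) + sum_i E(x_i) * w_i.
encrypted_score = public_key.encrypt(master_plain_term)
for w, enc_x in zip(weights, encrypted_features):
    encrypted_score += enc_x * w     # ciphertext-times-plaintext, then add

# Only the key-holding side can recover the result of the joint computation.
score = private_key.decrypt(encrypted_score)
expected = master_plain_term + sum(w * x for w, x in zip(weights, participant_features))
print(score, expected)               # equal up to floating-point encoding error
```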
Step S108, performing model optimization according to the computation result of the joint model;
In the model optimization process, the model training result is first evaluated: the quality of the trained model is assessed according to the preset evaluation metrics of the various models, and the parameters of the best-performing model are then obtained.
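The evaluation metric and the search strategy are left open in this embodiment; the sketch below assumes a simple grid of candidate hyper-parameter combinations scored by a caller-supplied evaluation function, with the best-scoring combination kept as the model optimization parameters.

```python
from itertools import product

def optimize_model(train_and_evaluate, search_space):
    """Step S108 (sketch): train with each hyper-parameter combination,
    score it with the preset evaluation function, and keep the best one.

    `train_and_evaluate(params) -> float` returns a quality score
    (higher is better); both it and `search_space` are assumptions here.
    """
    best_params, best_score = None, float("-inf")
    for values in product(*search_space.values()):
        params = dict(zip(search_space.keys(), values))
        score = train_and_evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy example: a synthetic score stands in for real joint model training.
def fake_train_and_evaluate(params):
    return -abs(params["learning_rate"] - 0.01) - 0.001 * params["epochs"]

space = {"learning_rate": [0.1, 0.01, 0.001], "epochs": [5, 10]}
print(optimize_model(fake_train_and_evaluate, space))
```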
Step S109, sending the model optimization parameter information to the message server, and the message server sending the model optimization parameters to all participating nodes.
based on the above cross-domain model training method, the present application also discloses a secure cross-domain joint model training system, which can be used to implement the above method. The cross-domain joint model training system comprises:
the data preprocessing module is used for preprocessing the original data of the participating nodes;
the homomorphic encryption module is used for homomorphic encryption of the preprocessed data;
the cooperative communication module of the participating node is used for transmitting the ciphertext data of the participating node to the main control node;
the master control node cooperative communication module is used for receiving ciphertext data of the participating nodes;
the data processing module is used for processing the received ciphertext data;
the joint model training module is used for performing joint model training by using the ciphertext data of the participating nodes and the plaintext data of the main control node;
the model optimization module is used for optimizing the model;
and the message server is used for sending the model optimization parameters to all the participating nodes.
The message server firstly processes the messages sent by the model optimization module, classifies the messages according to message types, and then sends the messages to the participating nodes according to message requirements.
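A minimal sketch of this classify-then-send behaviour is given below; the message types and the in-memory subscription table are assumptions, since the embodiment does not fix a concrete message-server implementation.

```python
from collections import defaultdict

class MessageServer:
    """Sketch of the message server: classify messages from the model
    optimization module by type, then deliver them to participating nodes
    according to each node's message requirements."""

    def __init__(self):
        self.subscriptions = defaultdict(list)   # message type -> node ids

    def subscribe(self, node_id, message_type):
        self.subscriptions[message_type].append(node_id)

    def publish(self, message):
        # Classification step: route purely on the declared message type.
        message_type = message.get("type", "model_update")
        for node_id in self.subscriptions.get(message_type, []):
            self.deliver(node_id, message)

    def deliver(self, node_id, message):
        # Stand-in for real network delivery to a participating node.
        print(f"send to {node_id}: {message}")

server = MessageServer()
for node in ("node-A", "node-B", "node-C"):
    server.subscribe(node, "model_update")
server.publish({"type": "model_update", "params": [0.4, -0.2, 0.9]})
```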
In the present application, homomorphic encryption of the multi-party data participating in the joint modeling ensures the security of the data taking part in the computation; based on the homomorphic property, joint modeling of the multi-party ciphertext data with the plaintext data of the master control node is realized, and secure joint model training is achieved on the premise of guaranteeing the security and privacy of each party's data.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The above embodiments are provided to explain the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above embodiments are merely exemplary embodiments of the present invention and are not intended to limit the scope of the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (8)

1. A secure cross-domain model training method based on multi-party participation, characterized in that the method comprises the following steps:
each participating node performs data preprocessing on the original data;
carrying out homomorphic encryption on the preprocessed data;
transmitting the ciphertext data of the participating node to the main control node;
the master control node performs combined model training by using plaintext data of the master control node and ciphertext data of the participating nodes;
optimizing the model according to the training result;
and sending the model optimization parameters to all the participating nodes.
2. The method of claim 1, wherein the method comprises: the data preprocessing comprises the following steps:
analyzing and classifying the original data of the participating nodes according to three categories of structured data, semi-structured data and unstructured data;
and carrying out data conversion on the analyzed data according to the requirements of the joint model calculation.
3. The method of claim 1, wherein the method comprises: and after receiving the ciphertext data of the participating nodes, the data processing module firstly analyzes the ciphertext data according to a protocol format and then performs corresponding data conversion according to the calculation requirement of the combined model.
4. The method of claim 1, wherein the method comprises: and sending the model optimization parameters to all the participating nodes by using a message server.
5. A secure cross-domain joint model training system, characterized in that the system comprises:
the data preprocessing module is used for preprocessing the original data of the participating nodes;
the homomorphic encryption module is used for homomorphic encryption of the preprocessed data;
the cooperative communication module of the participating node is used for transmitting the ciphertext data of the participating node to the main control node;
the master control node cooperative communication module is used for receiving ciphertext data of the participating nodes;
the data processing module is used for processing the received ciphertext data;
the joint model training module is used for performing joint model training by using the ciphertext data of the participating nodes and the plaintext data of the main control node;
the model optimization module is used for optimizing the model;
and the message server is used for sending the model optimization parameters to all the participating nodes.
6. The system of claim 5, wherein the system further comprises: the data preprocessing module comprises a data parser and a data converter.
7. The system of claim 5, wherein the system further comprises: the data processing modules respectively comprise a data parser and a data converter.
8. The system of claim 5, wherein the system further comprises: and the message server is responsible for processing the messages pushed by the model optimization module, classifying the messages according to message types, and then sending the messages to the participating nodes according to message requirements.
CN202111300581.7A | 2021-11-04 | 2021-11-04 | A secure cross-domain model training method and system based on multi-party participation | Pending | CN114037088A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202111300581.7A (CN114037088A, en) | 2021-11-04 | 2021-11-04 | A secure cross-domain model training method and system based on multi-party participation

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202111300581.7A (CN114037088A, en) | 2021-11-04 | 2021-11-04 | A secure cross-domain model training method and system based on multi-party participation

Publications (1)

Publication Number | Publication Date
CN114037088A (en) | 2022-02-11

Family

ID=80142762

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202111300581.7A | A secure cross-domain model training method and system based on multi-party participation (CN114037088A, Pending) | 2021-11-04 | 2021-11-04

Country Status (1)

Country | Link
CN (1) | CN114037088A (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN109684855A (en)* | 2018-12-17 | 2019-04-26 | 电子科技大学 | A kind of combined depth learning training method based on secret protection technology
CN112187443A (en)* | 2020-10-13 | 2021-01-05 | 成都数融科技有限公司 | Citizen data cross-domain security joint calculation method and system based on homomorphic encryption
CN112183730A (en)* | 2020-10-14 | 2021-01-05 | 浙江大学 | A training method of neural network model based on shared learning
CN112232518A (en)* | 2020-10-15 | 2021-01-15 | 成都数融科技有限公司 | Lightweight distributed federated learning system and method
CN113515760A (en)* | 2021-05-28 | 2021-10-19 | 平安国际智慧城市科技股份有限公司 | Horizontal federal learning method, device, computer equipment and storage medium
CN113553610A (en)* | 2021-09-22 | 2021-10-26 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | Multi-party privacy protection machine learning method based on homomorphic encryption and trusted hardware

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN119378712A (en)* | 2024-12-27 | 2025-01-28 | 中国电信股份有限公司 | Cross-domain model training method, device, network device, readable storage medium and program product


Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
