Background
In recent years, deep learning models have been widely applied to realistic tasks and have achieved good results. Meanwhile, data islanding and privacy disclosure during model training and deployment have become the main obstacles to the development of artificial intelligence technology. To address this problem, federated learning has emerged as an efficient means of privacy protection. Federated learning is a distributed machine learning method in which each participant trains on its local data and uploads the updated parameters to a server, and the server aggregates these updates to obtain the overall parameters, so that a lossless learning model is trained purely through local training and parameter transmission by the participants.
Federated learning can be roughly divided into three categories according to how the data is distributed: horizontal federated learning, vertical federated learning, and federated transfer learning. Horizontal federated learning applies when different data sets share many data features but few users: the data sets are segmented along the user dimension, and data with the same data features but not identical users is extracted for training. Vertical federated learning applies when different data sets share many users but few data features: the data sets are segmented along the feature dimension, and data with the same users but not identical data features is extracted for training. Federated transfer learning applies when both the users and the data features of the data sets barely overlap: the data is not segmented; instead, transfer learning is used to overcome the lack of data or labels.
Compared with traditional machine learning, federated learning improves learning efficiency, solves the data-island problem, and protects the privacy of local data. However, federated learning still carries a number of potential security risks; the three main attack threats are poisoning attacks, adversarial attacks, and privacy disclosure. Privacy disclosure is the most important of these, because federated learning involves model information exchange among multiple participants; this exchange is easily attacked maliciously, which greatly threatens the privacy security of the federated model.
In the vertical federated scenario, the main privacy protection techniques proposed to protect the privacy of a deep model are secure multi-party computation, homomorphic encryption, and differential privacy. Secure multi-party computation and homomorphic encryption greatly increase computational complexity, raise the time and computation costs, and place high computing-power demands on the devices; differential privacy achieves protection by adding noise, which degrades the accuracy of the model on its original task.
Disclosure of Invention
In order to improve the information security of the edge models in the vertical federated scenario and prevent an edge model from being stolen by a malicious attacker during information transmission, the invention provides a vertical federated model-stealing defense method based on neural pathway feature extraction.
The technical scheme of the invention is as follows:
A vertical federated model-stealing defense method based on neural pathway feature extraction comprises the following steps:
(1) dividing each sample in the data set into two parts to form a sample set $D_A$ and a sample set $D_B$, where only the sample set $D_B$ contains the sample labels, and distributing the sample sets $D_A$ and $D_B$ to an edge terminal $P_A$ and an edge terminal $P_B$, respectively;
(2) training the edge model $M_A$ of the edge terminal $P_A$ on the sample set $D_A$, and training the edge model $M_B$ of the edge terminal $P_B$ on the sample set $D_B$; the edge terminal $P_A$ sends the feature data generated during training to $P_B$, and $P_B$ computes a loss function from the received feature data and the activated neural pathway data; the edge terminals $P_A$ and $P_B$ mask-encrypt their respective loss functions and upload them to the server;
(3) the server decrypts the loss-function masks uploaded by the edge terminals $P_A$ and $P_B$, aggregates the loss functions, solves the aggregated loss function to obtain the gradient information of $M_A$ and $M_B$, and returns the gradient information to the edge terminals $P_A$ and $P_B$ to update the network parameters of the edge models.
Compared with the prior art, the beneficial effects of the invention include at least the following:
In the vertical federated model-stealing defense method based on neural pathway feature extraction, the neural pathway features are fixed during training and the loss functions are encrypted before upload, so that a malicious attacker is prevented from stealing the model in the vertical federated scenario and the information security of the edge models in the vertical federated scenario is protected.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the detailed description and specific examples, while indicating preferred embodiments of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.
The method addresses the problem that the edge models in the vertical federated scenario are vulnerable to a malicious attacker during model information exchange: once an attacker intercepts the model information of an edge terminal, it can steal the edge model by computing gradients and loss values. To prevent such theft, the embodiment of the invention provides a vertical federated model-stealing defense scheme based on neural pathway feature extraction. A neural pathway feature extraction step is added to the training stage of the edge models, and the model parameter transmission during training is encrypted by fixing the activated neurons. This effectively prevents a malicious party from stealing the private information of the deep models while different edge terminals exchange model parameters in the vertical federated scenario: without unlocking the fixed neural pathway, an attacker cannot reconstruct the training process of the model even if it intercepts the transmitted information of an edge model, thereby protecting the model information and defending against model-stealing attacks.
Fig. 1 is a flowchart of the vertical federated model-stealing defense method based on neural pathway feature extraction according to an embodiment of the present invention. As shown in Fig. 1, the method provided by the embodiment includes the following steps:
Step 1, data set division and alignment.
In the embodiment, the MNIST, CIFAR-10, and ImageNet data sets are employed. The MNIST training set comprises ten classes with 6,000 samples per class, and its test set comprises ten classes with 1,000 samples per class; the CIFAR-10 training set comprises ten classes with 5,000 samples per class, and its test set comprises ten classes with 1,000 samples per class; the ImageNet data set comprises 1,000 classes with 1,000 samples per class, of which 30% of the images in each class are randomly extracted as the test set and the remaining images serve as the training set.
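The per-class 70/30 split of ImageNet described here might be implemented as in the following sketch; the flat list of image paths and the use of `random.sample` are illustrative assumptions, not details fixed by the embodiment.

```python
import random

def split_class(sample_paths, test_fraction=0.3, seed=0):
    """Randomly reserve a fraction of one class's images as the test set."""
    rng = random.Random(seed)
    paths = list(sample_paths)
    test = set(rng.sample(paths, int(len(paths) * test_fraction)))
    train = [p for p in paths if p not in test]
    return train, sorted(test)

# Toy usage: one ImageNet class with 1,000 images
train, test = split_class([f"img_{i}.jpg" for i in range(1000)])
assert len(train) == 700 and len(test) == 300
```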
In the present invention, two edge terminals $P_A$ and $P_B$ are used under the vertical federation. In the vertical federated scenario, the two edge terminals $P_A$ and $P_B$ hold data with different data features, so the preprocessed data set must be feature-segmented. Each sample image in the MNIST, CIFAR-10, and ImageNet data sets is evenly divided into two parts, which serve as the sample set $D_A$ and the sample set $D_B$, respectively, where the sample set $D_B$ contains the class label of the sample image.
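For concreteness, the feature segmentation can be sketched as follows. This is a minimal example assuming NumPy arrays of shape (N, H, W) and a vertical split down the middle of each image; the function name `split_features` is illustrative only.

```python
import numpy as np

def split_features(images, labels):
    """Split each sample image vertically into two halves.

    Terminal P_A receives the left halves without labels; terminal P_B
    receives the right halves together with the class labels.
    """
    w = images.shape[2]
    d_a = images[:, :, : w // 2]      # sample set D_A for terminal P_A
    d_b = images[:, :, w // 2 :]      # sample set D_B for terminal P_B
    return d_a, (d_b, labels)         # only D_B carries the labels

# Example: an MNIST-sized batch of 28x28 grayscale images
images = np.random.rand(64, 28, 28).astype(np.float32)
labels = np.random.randint(0, 10, size=64)
d_a, (d_b, y) = split_features(images, labels)
assert d_a.shape == (64, 28, 14) and d_b.shape == (64, 28, 14)
```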
In the embodiment, after the samples are divided into the sample set $D_A$ and the sample set $D_B$, the partial samples in $D_A$ and $D_B$ that are derived from the same sample must also be aligned, i.e., it must be guaranteed that the partial samples input into the edge model $M_A$ and the edge model $M_B$ at the same time are derived from the same sample.
Because the data features held by the edge terminals $P_A$ and $P_B$ differ in the vertical federated scenario, while the sample sets $D_A$ and $D_B$ of the different edge terminals must stay consistent sample by sample, an encryption-based user ID alignment technique is adopted to align the raw data of the two partial images belonging to the same sample image. This ensures that, throughout training, the partial image data used by the two terminals in each step come from the same sample image, and that the users of neither edge terminal are exposed during data entity alignment.
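The embodiment does not spell out the alignment protocol in code, so the following is only a stand-in sketch: keyed hashes (HMAC) of sample IDs let the two terminals locate their common samples without exchanging raw IDs. A deployed system would typically use a full private set intersection protocol instead; the shared key here is an assumption of the sketch.

```python
import hmac
import hashlib

def blinded_ids(ids, shared_key):
    """Map raw sample IDs to keyed hashes so raw IDs are never exchanged."""
    return {
        hmac.new(shared_key, str(i).encode(), hashlib.sha256).hexdigest(): i
        for i in ids
    }

def align(ids_a, ids_b, shared_key=b"pre-shared-key"):
    """Return the samples both terminals hold, in a common order."""
    blind_a = blinded_ids(ids_a, shared_key)
    blind_b = blinded_ids(ids_b, shared_key)
    common = sorted(set(blind_a) & set(blind_b))   # compare only hashes
    return [blind_a[h] for h in common], [blind_b[h] for h in common]

order_a, order_b = align([3, 1, 2, 7], [2, 3, 5, 1])
assert order_a == order_b   # both sides now index the same samples identically
```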
Step 2, the edge terminals train their respective edge models on their respective sample sets, mask-encrypt their respective loss functions, and upload them to the server.
In the embodiment, the edge model $M_A$ of the edge terminal $P_A$ is trained on the sample set $D_A$, and the edge model $M_B$ of the edge terminal $P_B$ is trained on the sample set $D_B$. The edge terminal $P_A$ sends the feature data generated during training to $P_B$, and $P_B$ computes a loss function from the received feature data and the activated neural pathway data. The edge terminals $P_A$ and $P_B$ then mask-encrypt their respective loss functions and upload them to the server.
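The mask encryption of the loss values can be illustrated with additive random masks, as in the sketch below. The fixed-point scaling, the field size, and the assumption that each terminal shares its mask with the server are illustrative choices, not details fixed by the embodiment.

```python
import secrets

PRECISION = 10**6          # fixed-point scaling for float losses
FIELD = 2**62              # masked values live modulo this field

def mask_loss(loss_value, mask):
    """Terminal side: encrypt a scalar loss with an additive mask."""
    fixed = int(loss_value * PRECISION)
    return (fixed + mask) % FIELD

def unmask_loss(masked, mask):
    """Server side: remove the mask it shares with the terminal."""
    return ((masked - mask) % FIELD) / PRECISION

mask_a = secrets.randbelow(FIELD)       # shared between P_A and the server
masked_a = mask_loss(0.4271, mask_a)    # P_A uploads only this value
assert abs(unmask_loss(masked_a, mask_a) - 0.4271) < 1e-6
```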
For the different data sets, both edge terminals train the same model structure; for the ImageNet data set, an ImageNet pre-trained model is used for training. Unified hyper-parameters are set: stochastic gradient descent (SGD) and the Adam optimizer are used, the learning rate is $\eta$, and the regularization parameter is $\lambda$. The data set is denoted $\{(x_i^A, x_i^B, y_i)\}_{i=1}^{N}$, where $i$ denotes a sample, $y_i$ represents the original label of the corresponding sample, $x_i^A$ and $x_i^B$ respectively represent the feature spaces of the data, and the model parameters associated with the two feature spaces are denoted $\Theta_A$ and $\Theta_B$. The model training target is expressed as:

$$\min_{\Theta_A,\Theta_B} \; \sum_{i=1}^{N} \left\| M_A(\Theta_A; x_i^A) + M_B(\Theta_B; x_i^B) - y_i \right\|^2 + \frac{\lambda}{2}\left(\|\Theta_A\|^2 + \|\Theta_B\|^2\right)$$
in particular, according to the sample set DAFor edge model MAIn training, the edge model MALoss function Loss ofAComprises the following steps:
wherein, theta
ARepresenting an edge model M
AThe model parameters of (a) are determined,
represents the ith sample belonging to the sample set A, | · | | non-calculation
2Representing the square of the norm of L1.
When the edge model $M_B$ is trained according to the sample set $D_B$, the total loss function $loss_{sum}$ of the edge model $M_B$ is:

$$loss_{sum} = loss_B + \lambda \cdot loss_{topk} + loss_{AB}$$

where $loss_B$ represents the loss of the edge model $M_B$:

$$loss_B = \sum_{i=1}^{N} \left\| M_B(\Theta_B; x_i^B) - y_i \right\|^2 + \frac{\lambda}{2}\|\Theta_B\|^2$$

$loss_{topk}$ represents the neural pathway loss:

$$loss_{topk} = \sum_{l=1}^{L} NUPath_l(T, N)$$

and $loss_{AB}$ denotes the common loss, computed by $P_B$ from the feature data received from $P_A$; $\lambda$ denotes the adaptive adjustment coefficient serving as a partial factor of the neural pathway encryption. Here $\Theta_B$ represents the model parameters of the edge model $M_B$, $x_i^B$ denotes the $i$-th sample belonging to the sample set B, $y_i$ represents the label corresponding to $x_i^B$, $\|\cdot\|^2$ denotes the squared L2 norm, $i$ denotes the sample index, $N$ is the number of samples, $NUPath_l(T, N)$ represents the activation values of the $k$ maximally activated neurons of the $l$-th layer of the edge model (with $T$ the set of samples input each time and $N$ the set of neurons of each layer, as defined below), and $L$ represents the total number of layers of the edge model.
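Under the decomposition written above, the loss terms that $P_B$ can compute from its own forward output and the feature data received from $P_A$ might be assembled as in the following PyTorch sketch. The exact form of the common loss $loss_{AB}$ is an assumption consistent with the aggregated objective in Step 3, and $loss_{topk}$ comes from the fixed pathway described next (here stubbed to zero).

```python
import torch

def loss_terms(u_a, u_b, y_onehot, theta_b, lam):
    """Loss terms at P_B from its own output u_b and the feature data u_a
    received from P_A (assumed regression-style decomposition)."""
    loss_b = ((u_b - y_onehot) ** 2).sum() + 0.5 * lam * theta_b.pow(2).sum()
    loss_ab = 2.0 * (u_a * (u_b - y_onehot)).sum()   # assumed cross term
    return loss_b, loss_ab

u_a = torch.randn(8, 10)                  # features received from P_A
u_b = torch.randn(8, 10, requires_grad=True)
y = torch.nn.functional.one_hot(torch.randint(0, 10, (8,)), 10).float()
theta_b = torch.randn(100, requires_grad=True)

loss_b, loss_ab = loss_terms(u_a, u_b, y, theta_b, lam=0.01)
loss_topk = torch.tensor(0.0)             # pathway loss, see the sketch below
loss_sum = loss_b + 0.01 * loss_topk + loss_ab
```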
In the embodiment, a neural pathway is defined as a connected path that starts from any neuron in the input layer of the neural network, ends at any neuron in the output layer, follows the direction of the data's information flow, and passes through several neurons in the hidden layers. A neural pathway represents the connection relationships between neurons; when an input sample activates specific neurons of the model, the pathway formed by the neurons in the activated state is called an activated neural pathway.
When the neural pathway is fixed: during the training of the edge models, after each round of training, samples are randomly selected from the test set of the data set chosen in Step 1 and input into the model in training, and the maximum activation neural pathway of the model at that moment is obtained. Let $N = \{n_1, n_2, \ldots, n_m\}$ be the set of neurons of the deep learning model; let $T = \{x_1, x_2, \ldots, x_t\}$ be a set of inputs drawn from the test set; let $\varphi_l(x, n_i)$ be the function that, given an input sample $x$, returns the activation value of the neuron $n_i$ in the $l$-th layer for that input sample; and let $\mathrm{max}_k(\cdot)$ denote extracting the activation values of the $k$ neurons with the largest activation values in each layer. The maximum activation neural pathway is defined as follows:

$$NUPath_l(T, N) = \mathrm{max}_k\big(\{\varphi_l(x, n_i) \mid n_i \in N,\ x \in T\}\big), \qquad l = 1, \ldots, L$$
During training, the maximum activation neural pathway composed of the activation values of these maximally activated neurons is fixed, i.e., the activation values of these neurons are kept unchanged, and the activation values of the $k$ neurons in each neural layer are accumulated to form the pathway loss function.
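A minimal PyTorch sketch of this extraction is given below: forward hooks record the per-layer activations, the $k$ largest activation values in each layer stand in for $NUPath_l(T, N)$, and their accumulation over layers gives $loss_{topk}$. Hooking only the Linear and Conv2d layers and using their pre-activation outputs are simplifying assumptions.

```python
import torch
import torch.nn as nn

def pathway_loss(model, x, k=5):
    """Accumulate the top-k activation values of every layer (loss_topk)."""
    activations = []

    def hook(_module, _inp, out):
        activations.append(out.flatten(1))          # shape (batch, neurons)

    handles = [m.register_forward_hook(hook)
               for m in model.modules()
               if isinstance(m, (nn.Linear, nn.Conv2d))]
    model(x)
    for h in handles:
        h.remove()

    # NUPath_l(T, N): the k largest activation values in layer l
    topk_per_layer = [a.topk(k, dim=1).values for a in activations]
    return sum(t.sum() for t in topk_per_layer)

model = nn.Sequential(nn.Flatten(), nn.Linear(14 * 28, 64), nn.ReLU(),
                      nn.Linear(64, 10))
x = torch.randn(4, 1, 14, 28)            # half-images from the split data set
print(pathway_loss(model, x, k=5))
```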
Step 3, the server decrypts the loss-function masks uploaded by the edge terminals $P_A$ and $P_B$, aggregates the loss functions to obtain the gradient information, and returns the gradient information to the edge terminals $P_A$ and $P_B$ to update the network parameters of the edge models.
In this embodiment, after decrypting the loss-function masks uploaded by the edge terminals $P_A$ and $P_B$, the server aggregates the loss functions, solves the aggregated loss function to obtain the gradient information of $M_A$ and $M_B$, and returns the gradient information to the edge terminals $P_A$ and $P_B$. Specifically, the server uses stochastic gradient descent to solve the gradient information of the aggregated loss function. The loss function $Loss$ aggregated by the server is:

$$Loss = loss_B + \lambda \cdot loss_{topk} + loss_{AB} + loss_A$$

The gradient information of $M_A$ and $M_B$ is, respectively,

$$\frac{\partial Loss}{\partial \Theta_A} \quad \text{and} \quad \frac{\partial Loss}{\partial \Theta_B}$$

After receiving the gradient information returned by the server, the edge terminals $P_A$ and $P_B$ update the network parameters of their respective edge models $M_A$ and $M_B$ according to the gradient information and resume training on the basis of the updated network parameters.
Against model-stealing attacks in the vertical federated scenario, the vertical federated model-stealing defense method based on neural pathway feature extraction provided by the embodiment fixes and encrypts the neural pathway features during the training of the edge models, so that a malicious attacker cannot steal the model from the gradient and loss information transmitted by the edge models. Because the model information is encrypted and protected at the level of feature extraction, the privacy and security of the model are protected while the model training efficiency is improved.
The above-mentioned embodiments are intended to illustrate the technical solutions and advantages of the present invention, and it should be understood that the above-mentioned embodiments are only the most preferred embodiments of the present invention, and are not intended to limit the present invention, and any modifications, additions, equivalents, etc. made within the scope of the principles of the present invention should be included in the scope of the present invention.