Abstract
Federated learning is an emerging paradigm that enables multiple organizations to jointly train a model without revealing their private data. As an important variant, vertical federated learning (VFL) deals with cases in which the collaborating organizations own data on the same set of users but with disjoint features. VFL is generally regarded as more secure than horizontal federated learning. However, recent research (USENIX Security '22) reveals that label inference attacks are still possible in VFL, allowing an attacker to acquire the privately owned labels of other participants; even VFL constructed with model splitting (the VFL architecture with a stronger security guarantee) cannot escape them. To address this issue, we propose the dispersed training framework. It utilizes secret sharing to break the correlation between the bottom model and the training data. Consequently, even if the attacker receives the gradients during training, it cannot deduce the feature representation of the labels from the bottom model. In addition, we design a customized model aggregation method such that the shared models can be privately combined, and the linearity of the secret sharing scheme ensures that training accuracy is preserved. Theoretical and experimental analyses demonstrate the effectiveness and satisfactory performance of our framework.
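To make the underlying primitive concrete, the sketch below illustrates additive secret sharing with fixed-point encoding, the property the abstract appeals to: each share of a bottom-model output is uniformly random on its own, yet shares can be aggregated element-wise and still reconstruct the exact plaintext aggregate. This is only a minimal illustration under assumed parameters, not the paper's dispersed training implementation; the names (P, SCALE, share, reconstruct) are hypothetical.

```python
import numpy as np

# Illustrative additive secret sharing over a prime ring with fixed-point encoding.
P = 2**61 - 1      # ring modulus (fits comfortably in int64)
SCALE = 2**16      # fixed-point scaling factor


def encode(x):
    """Encode a real-valued array as fixed-point elements of Z_P."""
    return np.round(np.asarray(x) * SCALE).astype(np.int64) % P


def decode(x):
    """Map ring elements back to reals, using a centered representation."""
    x = np.asarray(x) % P
    x = np.where(x > P // 2, x - P, x)
    return x.astype(np.float64) / SCALE


def share(secret, n_parties=2, rng=None):
    """Split an encoded array into n additive shares that sum to the secret mod P."""
    if rng is None:
        rng = np.random.default_rng()
    shares = [rng.integers(0, P, size=secret.shape, dtype=np.int64)
              for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares


def reconstruct(shares):
    return sum(shares) % P


# A bottom-model output (e.g., a feature embedding) is split into shares before
# it leaves its owner; each share alone looks uniformly random.
emb_a = np.array([0.25, -1.5, 3.0])
emb_b = np.array([1.0, 0.5, -2.0])
a1, a2 = share(encode(emb_a))
b1, b2 = share(encode(emb_b))

# Linearity: aggregation can be performed share-wise, so the reconstructed result
# equals the plaintext aggregation and training accuracy is unaffected.
agg = decode(reconstruct([(a1 + b1) % P, (a2 + b2) % P]))
assert np.allclose(agg, emb_a + emb_b)
print(agg)   # [ 1.25 -1.    1.  ]
```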
Availability of Supporting Data
The data used to support the findings of this study are available from the second author upon request.
References
Voigt, P., Von dem Bussche, A.: The EU General Data Protection Regulation (GDPR): A Practical Guide, 1st edn. Springer International Publishing, Cham (2017)
Hoofnagle, C.J., van der Sloot, B., Borgesius, F.Z.: The European Union General Data Protection Regulation: what it is and what it means. Inform. Commun. Technol. Law 28(1), 65–98 (2019)
Chik, W.B.: The Singapore Personal Data Protection Act and an assessment of future trends in data privacy reform. Comput. Law Secur. Rev. 29(5), 554–575 (2013)
Shatz, S., Chylik, S.E.: The California Consumer Privacy Act of 2018: a sea change in the protection of California consumers. The Business Lawyer 75 (2020)
Hu, H., Salcic, Z., Sun, L., Dobbie, G., Yu, P.S., Zhang, X.: Membership inference attacks on machine learning: a survey. ACM Comput. Surv. (CSUR) 54(11s), 1–37 (2022)
Fu, C., Zhang, X., Ji, S., Chen, J., Wu, J., Guo, S., Zhou, J., Liu, A.X., Wang, T.: Label inference attacks against vertical federated learning. In: 31st USENIX Security Symposium (USENIX Security 22), Boston, MA (2022)
McMahan, B., Moore, E., Ramage, D., Hampson, S., y Arcas, B.A.: Communication-efficient learning of deep networks from decentralized data. In: Artificial Intelligence and Statistics, pp. 1273–1282. PMLR (2017)
Yang, Q., Liu, Y., Chen, T., Tong, Y.: Federated machine learning: concept and applications. ACM Trans. Intell. Syst. Technol. (TIST) 10(2), 1–19 (2019)
Liu, Y., Kang, Y., Xing, C., Chen, T., Yang, Q.: A secure federated transfer learning framework. IEEE Intell. Syst. 35(4), 70–82 (2020)
Vepakomma, P., Gupta, O., Swedish, T., Raskar, R.: Split learning for health: distributed deep learning without sharing raw patient data. arXiv:1812.00564 (2018)
Van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9(11) (2008)
Yuan, F., Chen, S., Liang, K., Xu, L.: Research on the coordination mechanism of traditional Chinese medicine medical record data standardization and characteristic protection under big data environment, vol. 1, 1st edn. Shandong People's Publishing House, Jinan (2021)
Chen, C., Huang, T.: Camdar-adv: generating adversarial patches on 3D object. Int. J. Intell. Syst. 36(3), 1441–1453 (2021)
Jiang, N., Jie, W., Li, J., Liu, X., Jin, D.: GATrust: a multi-aspect graph attention network model for trust assessment in OSNs. IEEE Transactions on Knowledge and Data Engineering (2022)
Yan, H., Chen, M., Hu, L., Jia, C.: Secure video retrieval using image query on an untrusted cloud. Appl. Soft Comput. 97, 106782 (2020)
Ai, S., Hong, S., Zheng, X., Wang, Y., Liu, X.: CSRT rumor spreading model based on complex network. Int. J. Intell. Syst. 36(5), 1903–1913 (2021)
Li, T., Wang, Z., Chen, Y., Li, C., Jia, Y., Yang, Y.: Is semi-selfish mining available without being detected? Int. J. Intell. Syst. (2021). https://doi.org/10.1002/int.22656
Zhang, X., Wang, T.: Elastic and reliable bandwidth reservation based on distributed traffic monitoring and control. IEEE Transactions on Parallel and Distributed Systems (2022)
Zhang, X., Wang, Y., Geng, G., Yu, J.: Delay-optimized multicast tree packing in software-defined networks. IEEE Transactions on Services Computing (2021)
Konečnỳ, J., McMahan, H.B., Ramage, D., Richtárik, P.: Federated optimization: distributed machine learning for on-device intelligence. arXiv:1610.02527 (2016)
Konečnỳ, J., McMahan, H.B., Yu, F.X., Richtárik, P., Suresh, A.T., Bacon, D.: Federated learning: strategies for improving communication efficiency. arXiv:1610.05492 (2016)
McMahan, H.B., Moore, E., Ramage, D., y Arcas, B.A.: Federated learning of deep networks using model averaging, vol. 2. arXiv:1602.05629 (2016)
Bonawitz, K., Ivanov, V., Kreuter, B., Marcedone, A., McMahan, H.B., Patel, S., Ramage, D., Segal, A., Seth, K.: Practical secure aggregation for privacy-preserving machine learning. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 1175–1191 (2017)
Shokri, R., Shmatikov, V.: Privacy-preserving deep learning. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp. 1310–1321 (2015)
Du, W., Atallah, M.J.: Privacy-preserving cooperative statistical analysis. In: Seventeenth Annual Computer Security Applications Conference, pp. 102–110. IEEE (2001)
Du, W., Han, Y.S., Chen, S.: Privacy-preserving multivariate statistical analysis: Linear regression and classification. In: Proceedings of the 2004 SIAM International Conference on Data Mining, pp. 222–233. SIAM (2004)
Sanil, A.P., Karr, A.F., Lin, X., Reiter, J.P.: Privacy preserving regression modelling via distributed computation. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 677–682 (2004)
Vaidya, J., Clifton, C.: Privacy preserving association rule mining in vertically partitioned data. In: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 639–644 (2002)
Wan, L., Ng, W.K., Han, S., Lee, V.C.: Privacy-preservation for gradient descent methods. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 775–783 (2007)
Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22(10), 1345–1359 (2009)
Tianqing, Z., Zhou, W., Ye, D., Cheng, Z., Li, J.: Resource allocation in IoT edge computing via concurrent federated reinforcement learning. IEEE Internet of Things Journal 9(2), 1414–1426 (2021)
Hu, L., Yan, H., Li, L., Pan, Z., Liu, X., Zhang, Z.: MHAT: an efficient model-heterogenous aggregation training scheme for federated learning. Inform. Sci. 560, 493–503 (2021)
Mo, K., Huang, T., Xiang, X.: Querying little is enough: Model inversion attack via latent information. In: International Conference on Machine Learning for Cyber Security, pp. 583–591. Springer (2020)
Ren, H., Huang, T., Yan, H.: Adversarial examples: attacks and defenses in the physical world. International Journal of Machine Learning and Cybernetics 12(11), 3325–3336 (2021)
Kuang, X., Zhang, M., Li, H., Zhao, G., Cao, H., Wu, Z., Wang, X.: Deepwaf: detecting web attacks based on cnn and lstm models. In: International Symposium on Cyberspace Safety and Security, pp. 121–136. Springer (2019)
Yan, H., Hu, L., Xiang, X., Liu, Z., Yuan, X.: PPCL: privacy-preserving collaborative learning for mitigating indirect information leakage. Inform. Sci. 548, 423–437 (2021)
Li, J., Hu, X., Xiong, P., Zhou, W., et al.: The dynamic privacy-preserving mechanisms for online dynamic social networks. IEEE Transactions on Knowledge and Data Engineering (2020)
Lu, Z., Liang, H., Zhao, M., Lv, Q., Liang, T., Wang, Y.: Label-only membership inference attacks on machine unlearning without dependence of posteriors. Int. J. Intell. Syst. 37(11), 9242–9441 (2022)
Melis, L., Song, C., De Cristofaro, E., Shmatikov, V.: Exploiting unintended feature leakage in collaborative learning. In: 2019 IEEE Symposium on Security and Privacy (SP), pp. 691–706. IEEE (2019)
Nasr, M., Shokri, R., Houmansadr, A.: Comprehensive privacy analysis of deep learning. In: Proceedings of the 2019 IEEE Symposium on Security and Privacy (SP), pp. 1–15 (2018)
Wei, K., Li, J., Ma, C., Ding, M., Wei, S., Wu, F., Chen, G., Ranbaduge, T.: Vertical federated learning: challenges, methodologies and experiments. arXiv:2202.04309 (2022)
Backes, M., Berrang, P., Humbert, M., Manoharan, P.: Membership privacy in microRNA-based studies. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 319–330 (2016)
Chen, D., Yu, N., Zhang, Y., Fritz, M.: GAN-leaks: a taxonomy of membership inference attacks against GANs. arXiv:1909.03935 (2019)
Pyrgelis, A., Troncoso, C., De Cristofaro, E.: Knock knock, who’s there? Membership inference on aggregate location data. arXiv:1708.06145 (2017)
Salem, A., Zhang, Y., Humbert, M., Berrang, P., Fritz, M., Backes, M.: ML-leaks: model and data independent membership inference attacks and defenses on machine learning models. arXiv:1806.01246 (2018)
Rassouli, B., Varasteh, M., Gunduz, D.: Privacy against inference attacks in vertical federated learning. arXiv:2207.11788 (2022)
Zhu, H., Wang, R., Jin, Y., Liang, K.: PIVODL: privacy-preserving vertical federated learning over distributed labels. IEEE Transactions on Artificial Intelligence (2021)
Han, X., Wang, L., Wu, J.: Data valuation for vertical federated learning: an information-theoretic approach. arXiv:2112.08364 (2021)
Abadi, M., Chu, A., Goodfellow, I., McMahan, H.B., Mironov, I., Talwar, K., Zhang, L.: Deep learning with differential privacy. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 308–318 (2016)
Chaudhuri, K., Monteleoni, C.: Privacy-preserving logistic regression. Advances in Neural Information Processing Systems 21 (2008)
Dwork, C.: Differential privacy: A survey of results. In: International Conference on Theory and Applications of Models of Computation, pp. 1–19. Springer (2008)
Song, S., Chaudhuri, K., Sarwate, A.D.: Stochastic gradient descent with differentially private updates. In: 2013 IEEE Global Conference on Signal and Information Processing, pp. 245–248. IEEE (2013)
Giacomelli, I., Jha, S., Joye, M., Page, C.D., Yoon, K.: Privacy-preserving ridge regression with only linearly-homomorphic encryption. Cryptology ePrint Archive (2017)
Hall, R., Fienberg, S.E., Nardi, Y.: Secure multiple linear regression based on homomorphic encryption. Journal of Official Statistics 27(4), 669–691 (2011)
Nikolaenko, V., Weinsberg, U., Ioannidis, S., Joye, M., Boneh, D., Taft, N.: Privacy-preserving ridge regression on hundreds of millions of records. In: 2013 IEEE Symposium on Security and Privacy, pp. 334–348. IEEE (2013)
Rivest, R.L., Adleman, L., Dertouzos, M.L., et al.: On data banks and privacy homomorphisms. Foundations of Secure Computation 4(11), 169–180 (1978)
Geyer, R.C., Klein, T., Nabi, M.: Differentially private federated learning: a client level perspective. arXiv:1712.07557 (2017)
Yuan, J., Yu, S.: Privacy preserving back-propagation neural network learning made practical with cloud computing. IEEE Transactions on Parallel & Distributed Systems 25(1), 212–221 (2014)
Zhang, Q., Yang, L.T., Chen, Z.: Privacy preserving deep computation model on cloud for big data feature learning. IEEE Trans. Comput. 65(5), 1351–1362 (2015)
Araki, T., Furukawa, J., Lindell, Y., Nof, A., Ohara, K.: High-throughput semi-honest secure three-party computation with an honest majority. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 805–817 (2016)
Furukawa, J., Lindell, Y., Nof, A., Weinstein, O.: High-throughput secure three-party computation for malicious adversaries and an honest majority. Cryptology ePrint Archive (2016)
Mohassel, P., Rosulek, M., Zhang, Y.: Fast and secure three-party computation: The garbled circuit approach. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp. 591–602 (2015)
Zhao, C., Zhao, S., Zhao, M., Chen, Z., Gao, C.-Z., Li, H., Tan, Y.-a.: Secure multi-party computation: theory, practice and applications. Inform. Sci. 476, 357–372 (2019)
Kilbertus, N., Gascón, A., Kusner, M., Veale, M., Gummadi, K., Weller, A.: Blind justice: Fairness with encrypted sensitive attributes. In: International Conference on Machine Learning, pp. 2630–2639, PMLR (2018)
Mohassel, P., Rindal, P.: Aby3: A mixed protocol framework for machine learning. In: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pp. 35–52 (2018)
Arivazhagan, M.G., Aggarwal, V., Singh, A.K., Choudhary, S.: Federated learning with personalization layers. arXiv:1912.00818 (2019)
Jebreel, N.M., Domingo-Ferrer, J., Blanco-Justicia, A., Sanchez, D.: Enhanced security and privacy via fragmented federated learning. arXiv:2207.05978 (2022)
Li, Q., He, B., Song, D.: Model-contrastive federated learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10713–10722 (2021)
Le, P.H., Ranellucci, S., Gordon, S.D.: Two-party private set intersection with an untrusted third party. In: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pp. 2403–2420 (2019)
Acknowledgements
This study is supported by the National Natural Science Foundation of China (Grant Nos. 62072273, 72111530206, 61962009, 61873117, 61832012, 61771231, 61771289); the Major Basic Research Projects of the Natural Science Foundation of Shandong Province of China (ZR2019ZD10, ZR2018ZC0438); the Natural Science Foundation of Shandong Province (ZR2019MF062); the Shandong University Science and Technology Program Project (J18A326); the Guangxi Key Laboratory of Cryptography and Information Security (No. GCIS202112); the Major Scientific and Technological Special Project of Guizhou Province (20183001); the Foundation of the Guizhou Provincial Key Laboratory of Public Big Data (No. 2019BD-KFJJ009); the Talent Project of Guizhou Big Data Academy; and the Guizhou Provincial Key Laboratory of Public Big Data ([2018]01).
Author information
Authors and Affiliations
School of Computer Science, Qufu Normal University, 276800, Rizhao, Shandong, China
Yilei Wang, Qingzhe Lv, Yuhong Sun, Lingkai Ran & Tao Li
Institute of Artificial Intelligence and Blockchain, Guangzhou University, 510700, Guangzhou, Guangdong, China
Yilei Wang & Huang Zhang
School of Data Science and Engineering, East China Normal University, 200062, Shanghai, China
Minghao Zhao
State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, 550025, Guiyang, Guizhou, China
Tao Li
Contributions
All authors contributed to the study conception and design. Yilei Wang put forward the main idea; Yilei Wang, Minghao Zhao and Qingzhe Lv wrote the main manuscript text; Qingzhe Lv, Huang Zhang and Yuhong Sun wrote the main experimental code; Tao Li revised the manuscript text; and Lingkai Ran searched for the required literature. All authors reviewed the manuscript.
Corresponding author
Correspondence to Tao Li.
Ethics declarations
Ethical Approval and Consent to Participate
The authors guarantee that this manuscript is an original work. This manuscript has not been published or presented elsewhere in part or in entirety and is not under consideration by another journal. We have read and understood your journal’s policies, and we believe that neither the manuscript nor the study violates any of these. All authors have seen and approved the final version of the submitted manuscript.
Consent for Publication
All authors have checked the manuscript and have agreed to the submission.
Human and Animal Ethics
The authors declare that this study does not involve human participants or animals.
Competing Interests
The authors have no competing interests to declare that are relevant to the content of this manuscript.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article belongs to the Topical Collection: Special Issue on Privacy and Security in Machine Learning
Guest Editors: Jin Li, Francesco Palmieri and Changyu Dong
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, Y., Lv, Q., Zhang, H. et al. Beyond model splitting: Preventing label inference attacks in vertical federated learning with dispersed training. World Wide Web 26, 2691–2707 (2023). https://doi.org/10.1007/s11280-023-01159-x