Movatterモバイル変換


[0]ホーム

URL:


Skip to main content
Springer Nature Link
Log in

Beyond model splitting: Preventing label inference attacks in vertical federated learning with dispersed training

  • Published:
World Wide Web Aims and scope Submit manuscript

Abstract

Federated learning is an emerging paradigm that enables multiple organizations to jointly train a model without revealing their private data. As an important variant,vertical federated learning (VFL) deals with cases in which collaborating organizations own data of the same set of users but with disjoint features. It is generally regarded that VFL is more secure than horizontal federated learning. However, recent research (USENIX Security’22) reveals that it is still possible to conduct label inference attacks in VFL, in which attacker can acquire privately owned labels of other participants; even VFL constructed with model splitting (the kind of VFL architecture with higher security guarantee) cannot escape it. To solve this issue, in this paper, we propose the dispersed training framework. It utilizes secret sharing to break the correlations between the bottom model and the training data. Accordingly, even if the attacker receives the gradients in the training phase, he is incapable to deduce the feature representation of labels from the bottom model. Besides, we design a customized model aggregation method such that the shared model can be privately combined, and the linearity of secret sharing schemes ensures the training accuracy to be preserved. Theoretical and experimental analyses indicate the satisfactory performance and effectiveness of our framework.

This is a preview of subscription content,log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (Japan)

Instant access to the full article PDF.

Figure 1
Algorithm 1
Figure 2
Figure 3
Figure 4

Similar content being viewed by others

Availability of Supporting Data

The data used to support the findings of this study are available from the second author upon request.

References

  1. Voigt, P., Von dem Bussche, A.: The EU general data protection regulation (GDPR. A Practical Guide, 1st Ed., Cham: Springer International Publishing10(3152676), 10–5555 (2017)

  2. Hoofnagle, C.J., van der Sloot, B., Borgesius, F.Z.: The european union general data protection regulation: what it is and what it means. Inform. Commun. Technol. Law28(1), 65–98 (2019)

    Article  Google Scholar 

  3. Chik, W.B.: The singapore personal data protection act and an assessment of future trends in data privacy reform. Comput. Law Secur. Rev.29(5), 554–575 (2013)

    Article  Google Scholar 

  4. Shatz, S., Chylik, S.E.: The california consumer privacy act of 2018: A sea change in the protection of california consumers. The Business Lawyer75 (2020)

  5. Hu, H., Salcic, Z., Sun, L., Dobbie, G., Yu, P.S., Zhang, X.: Membership inference attacks on machine learning: A survey. ACM Comput. Surv. (CSUR)54(11s), 1–37 (2022)

    Article  Google Scholar 

  6. Fu, C., Zhang, X., Ji, S., Chen, J., Wu, J., Guo, S., Zhou, J., Liu, A.X. Wang, T.: Label inference attacks against vertical federated learning. In: 31st USENIX Security Symposium (USENIX Security 22), Boston, MA (2022)

  7. McMahan, B., Moore, E., Ramage, D., Hampson, S., y Arcas, B.A.: Communication-efficient learning of deep networks from decentralized data. In: Artificial Intelligence and Statistics, pp. 1273–1282, PMLR (2017)

  8. Yang, Q., Liu, Y., Chen, T., Tong, Y.: Federated machine learning: Concept and applications. ACM Trans. Intell. Syst. Technol. (TIST)10(2), 1–19 (2019)

    Article  Google Scholar 

  9. Liu, Y., Kang, Y., Xing, C., Chen, T., Yang, Q.: A secure federated transfer learning framework. IEEE Intell. Syst.35(4), 70–82 (2020)

    Article  Google Scholar 

  10. Vepakomma, P., Gupta, O., Swedish, T., Raskar, R.: Split learning for health: Distributed deep learning without sharing raw patient data,arXiv:1812.00564 (2018)

  11. Van der Maaten, L., Hinton, G.: Visualizing data using t-sne. J. Mach. Learn. Res.9(11) (2008)

  12. Yuan, F., Chen, S., Liang, K., Xu, L.: Research on the coordination mechanism of traditional Chinese medicine medical record data standardization and characteristic protection under big data environment, vol. 1 of1. No.517 Shungong Road, Shizhong District, Jinan, Shandong Province, China: Shandong:Shandong People’s Publishing House, 1 ed., (2021)

  13. Chen, C., Huang, T.: Camdar-adv: generating adversarial patches on 3d object. Int. J. Intell. Syst.36(3), 1441–1453 (2021)

    Article MathSciNet  Google Scholar 

  14. Jiang, N., Jie, W., Li, J., Liu, X., Jin, D.: Gatrust: A multi-aspect graph attention network model for trust assessment in osns. IEEE Transactions on Knowledge and Data Engineering (2022)

  15. Yan, H., Chen, M., Hu, L., Jia, C.: Secure video retrieval using image query on an untrusted cloud. Appl. Soft Comput.97, 106782 (2020)

    Article  Google Scholar 

  16. Ai, S., Hong, S., Zheng, X., Wang, Y., Liu, X.: Csrt rumor spreading model based on complex network. Int. J. Intell. Syst.36(5), 1903–1913 (2021)

    Article  Google Scholar 

  17. Li, T., Wang, Z., Chen, Y., Li, C., Jia, Y., Yang, Y.: Is semi-selfish mining available without being detected? Int. J. Intell. Syst. (2021).https://doi.org/10.1002/int.22656

    Article  Google Scholar 

  18. Li, T., Wang, Z., Chen, Y., Li, C., Jia, Y., Yang, Y.: Is semi-selfish mining available without being detected?. International Journal of Intelligent Systems (2021)

  19. Zhang, X., Wang, T.: Elastic and reliable bandwidth reservation based on distributed traffic monitoring and control. IEEE Transactions on Parallel and Distributed Systems (2022)

  20. Zhang, X., Wang, Y., Geng, G., Yu, J., Delay-optimized multicast tree packing in software-defined networks. IEEE Transactions on Services Computing (2021)

  21. Konečnỳ, J., McMahan, H.B., Ramage, D., Richtárik, P.: Federated optimization: Distributed machine learning for on-device intelligence.arXiv:1610.02527 (2016)

  22. Konečnỳ, J., McMahan, H.B., Yu, F.X., Richtárik, P., Suresh, A.T., Bacon, D.: Federated learning: Strategies for improving communication efficiency.arXiv:1610.05492 (2016)

  23. McMahan, H.B., Moore, E., Ramage, D., y Arcas, B. A.: Federated learning of deep networks using model averaging. vol. 2,arXiv:1602.05629 (2016)

  24. Bonawitz, K., Ivanov, V., Kreuter, B., Marcedone, A., McMahan, H.B., Patel, S., Ramage, D., Segal, A., Seth, K.: Practical secure aggregation for privacy-preserving machine learning. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 1175–1191 (2017)

  25. Shokri, R., Shmatikov, V.: Privacy-preserving deep learning. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp. 1310–1321 (2015)

  26. Du, W., Atallah, M.J.: Privacy-preserving cooperative statistical analysis. In: Seventeenth Annual Computer Security Applications Conference, pp. 102–110. IEEE (2001)

  27. Du, W., Han, Y.S., Chen, S.: Privacy-preserving multivariate statistical analysis: Linear regression and classification. In: Proceedings of the 2004 SIAM International Conference on Data Mining, pp. 222–233. SIAM (2004)

  28. Sanil, A.P., Karr, A.F., Lin, X., Reiter, J.P.: Privacy preserving regression modelling via distributed computation. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 677–682 (2004)

  29. Vaidya, J., Clifton, C.: Privacy preserving association rule mining in vertically partitioned data. In: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 639–644 (2002)

  30. Wan, L., Ng, W.K., Han, S., Lee, V.C.: Privacy-preservation for gradient descent methods. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 775–783 (2007)

  31. Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Trans. Knowledge Data Eng22(10), 1345–1359 (2009)

    Article  Google Scholar 

  32. Tianqing, Z., Zhou, W., Ye, D., Cheng, Z., Li, J.: Resource allocation in iot edge computing via concurrent federated reinforcement learning. IEEE Internet of Things Journal9(2), 1414–1426 (2021)

    Article  Google Scholar 

  33. Hu, L., Yan, H., Li, L., Pan, Z., Liu, X., Zhang, Z.: Mhat: an efficient model-heterogenous aggregation training scheme for federated learning. Inform. Sci.560, 493–503 (2021)

    Article MathSciNet  Google Scholar 

  34. Mo, K., Huang, T., Xiang, X.: Querying little is enough: Model inversion attack via latent information. In: International Conference on Machine Learning for Cyber Security, pp. 583–591. Springer (2020)

  35. Ren, H., Huang, T., Yan, H.: Adversarial examples: attacks and defenses in the physical world. International Journal of Machine Learning and Cybernetics12(11), 3325–3336 (2021)

    Article  Google Scholar 

  36. Kuang, X., Zhang, M., Li, H., Zhao, G., Cao, H., Wu, Z., Wang, X.: Deepwaf: detecting web attacks based on cnn and lstm models. In: International Symposium on Cyberspace Safety and Security, pp. 121–136. Springer (2019)

  37. Yan, H., Hu, L., Xiang, X., Liu, Z., Yuan, X.: Ppcl: Privacy-preserving collaborative learning for mitigating indirect information leakage. Inform. Sci.548, 423–437 (2021)

    Article MathSciNet  Google Scholar 

  38. Li, J., Hu, X., Xiong, P., Zhou, W., et al.: The dynamic privacy-preserving mechanisms for online dynamic social networks. IEEE Transactions on Knowledge and Data Engineering (2020)

  39. Lu, Z., Liang, H., Zhao, M., Lv, Q., Liang, T., Wang, Y.: Label-only membership inference attacks on machine unlearning without dependence of posteriors. Int. J. Intell. Syst.37(11), 9242–9441 (2022)

    Article  Google Scholar 

  40. Melis, L., Song, C., De Cristofaro, E., Shmatikov, V.: Exploiting unintended feature leakage in collaborative learning. In: 2019 IEEE Symposium on Security and Privacy (SP), pp. 691–706. IEEE (2019)

  41. Nasr, M., Shokri, R., Houmansadr, A.: Comprehensive privacy analysis of deep learning. In: Proceedings of the 2019 IEEE Symposium on Security and Privacy (SP), pp. 1–15 (2018)

  42. Wei, K., Li, J., Ma, C., Ding, M., Wei, S., Wu, F., Chen, G., Ranbaduge, T.: Vertical federated learning: Challenges, methodologies and experiments.arXiv:2202.04309 (2022)

  43. Backes, M., Berrang, P., Humbert, M., Manoharan, P.: Membership privacy in microrna-based studies. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 319–330, 2016

  44. Chen, D., Yu, N., Zhang, Y., Fritz, M.: Gan-leaks: A taxonomy of membership inference attacks against gans.arXiv:1909.03935 (2019)

  45. Pyrgelis, A., Troncoso, C., De Cristofaro, E.: Knock knock, who’s there? membership inference on aggregate location data,arXiv:1708.06145 (2017)

  46. Salem, A., Zhang, Y., Humbert, M., Berrang, P., Fritz, M., Backes, M.: Ml-leaks: Model and data independent membership inference attacks and defenses on machine learning models.arXiv:1806.01246 (2018)

  47. Rassouli, B., Varasteh, M., Gunduz, D.: Privacy against inference attacks in vertical federated learning.arXiv:2207.11788 (2022)

  48. Zhu, H., Wang, R., Jin, Y., Liang, K.: Pivodl: Privacy-preserving vertical federated learning over distributed labels. IEEE Transactions on Artificial Intelligence (2021)

  49. Han, X., Wang, L., Wu, J.: Data valuation for vertical federated learning: An information-theoretic approach.arXiv:2112.08364 (2021)

  50. Abadi, M., Chu, A., Goodfellow, I., McMahan, H.B., Mironov, I., Talwar, K., Zhang, L.: Deep learning with differential privacy. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 308–318 (2016)

  51. Chaudhuri, K., Monteleoni, C.: Privacy-preserving logistic regression. Advances in Neural Information Processing Systems,21 (2008)

  52. Dwork, C.: Differential privacy: A survey of results. In: International Conference on Theory and Applications of Models of Computation, pp. 1–19. Springer (2008)

  53. Song, S., Chaudhuri, K., Sarwate, A.D.: Stochastic gradient descent with differentially private updates. In: 2013 IEEE Global Conference on Signal and Information Processing, pp. 245–248. IEEE (2013)

  54. Giacomelli, I., Jha, S. Joye, M., Page, C.D., Yoon, K.: Privacy-preserving ridge regression with only linearly-homomorphic encryption. Cryptology ePrint Archive (2017)

  55. Hall, R., Fienberg, S.E., Nardi, Y.: Secure multiple linear regression based on homomorphic encryption. Journal of Official Statistics27(4), 669–691 (2011)

    Google Scholar 

  56. Nikolaenko, V., Weinsberg, U., Ioannidis, S., Joye, M., Boneh, D., Taft, N.: Privacy-preserving ridge regression on hundreds of millions of records. In: 2013 IEEE Symposium on Security and Privacy, pp. 334–348. IEEE (2013)

  57. Rivest, R.L., Adleman, L., Dertouzos, M.L., et al.: On data banks and privacy homomorphisms. Foundations of Secure Computation4(11), 169–180 (1978)

    MathSciNet  Google Scholar 

  58. Geyer, R.C., Klein, T., Nabi, M.: Differentially private federated learning: A client level perspective.arXiv:1712.07557 (2017)

  59. Yuan, J., Yu, S.: Privacy preserving back-propagation neural network learning made practical with cloud computing. IEEE Transactions on Parallel & Distributed Systems25(01), 212–221 (2014)

    Article  Google Scholar 

  60. Zhang, Q., Yang, L.T., Chen, Z.: Privacy preserving deep computation model on cloud for big data feature learning. IEEE Trans. Comput.65(5), 1351–1362 (2015)

    Article MathSciNet MATH  Google Scholar 

  61. Araki, T., Furukawa, J., Lindell, Y., Nof, A., Ohara, K.: High-throughput semi-honest secure three-party computation with an honest majority. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 805–817 (2016)

  62. Furukawa, J., Lindell, Y., Nof, A., Weinstein, O.: High-throughput secure three-party computation for malicious adversaries and an honest majority. Cryptology ePrint Archive (2016)

  63. Mohassel, P., Rosulek, M., Zhang, Y.: Fast and secure three-party computation: The garbled circuit approach. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp. 591–602 (2015)

  64. Zhao, C., Zhao, S., Zhao, M., Chen, Z., Gao, C.-Z., Li, H., Tan, Y.-a.: Secure multi-party computation: theory, practice and applications. Inform. Sci.476, 357–372 (2019)

  65. Kilbertus, N., Gascón, A., Kusner, M., Veale, M., Gummadi, K., Weller, A.: Blind justice: Fairness with encrypted sensitive attributes. In: International Conference on Machine Learning, pp. 2630–2639, PMLR (2018)

  66. Mohassel, P., Rindal, P.: Aby3: A mixed protocol framework for machine learning. In: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pp. 35–52 (2018)

  67. Arivazhagan, M.G., Aggarwal, V., Singh, A.K., Choudhary, S.: Federated learning with personalization layers.arXiv:1912.00818 (2019)

  68. Jebreel, N.M., Domingo-Ferrer, J., Blanco-Justicia, A., Sanchez, D.: Enhanced security and privacy via fragmented federated learning.arXiv:2207.05978 (2022)

  69. Li, Q., He, B., Song, D.: Model-contrastive federated learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10713–10722 (2021)

  70. Le, P.H., Ranellucci, S., Gordon, S.D.: Two-party private set intersection with an untrusted third party. In: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pp. 2403–2420 (2019)

Download references

Acknowledgements

This study is supported by the Foundation of National Natural Science Foundation of China (Grant No.: 62072273, 72111530206, 61962009, 61873117, 61832012, 61771231, 61771289); The Major Basic Research Project of Natural Science Foundation of Shandong Province of China (ZR2019ZD10); Natural Science Foundation of Shandong Province (ZR2019MF062); Shandong University Science and Technology Program Project (J18A326); Guangxi Key Laboratory of Cryptography and Information Security (No: GCIS202112); The Major Basic Research Project of Natural Science Foundation of Shandong Province of China (ZR2018ZC0438); Major Scientific and Technological Special Project of Guizhou Province (20183001), Foundation of Guizhou Provincial Key Laboratory of Public Big Data (No. 2019BD-KFJJ009), Talent project of Guizhou Big Data Academy. Guizhou Provincial Key Laboratory of Public Big Data. ([2018]01).

Funding

This study is supported by the Foundation of National Natural Science Foundation of China (Grant No.: 62072273, 72111530206, 61962009, 61873117, 61832012, 61771231, 61771289); The Major Basic Research Project of Natural Science Foundation of Shandong Province of China (ZR2019ZD10); Natural Science Foundation of Shandong Province (ZR2019MF062); Shandong University Science and Technology Program Project (J18A326); Guangxi Key Laboratory of Cryptography and Information Security (No: GCIS202112); The Major Basic Research Project of Natural Science Foundation of Shandong Province of China (ZR2018ZC0438); Major Scientific and Technological Special Project of Guizhou Province (20183001), Foundation of Guizhou Provincial Key Laboratory of Public Big Data (No. 2019BD-KFJJ009), Talent project of Guizhou Big Data Academy. Guizhou Provincial Key Laboratory of Public Big Data. ([2018]01).

Author information

Authors and Affiliations

  1. School of Computer Science, Qufu Normal University, 276800, Rizhao, Shandong, China

    Yilei Wang, Qingzhe Lv, Yuhong Sun, Lingkai Ran & Tao Li

  2. Institute of Artificial Intelligence and Blockchain, Guangzhou University, 510700, Guangzhou, Guangdong, China

    Yilei Wang & Huang Zhang

  3. School of Data Science and Engeneering, East China Normal University, 200062, Shanghai, China

    Minghao Zhao

  4. State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, 550025, Guiyang, Guizhou, China

    Tao Li

Authors
  1. Yilei Wang

    You can also search for this author inPubMed Google Scholar

  2. Qingzhe Lv

    You can also search for this author inPubMed Google Scholar

  3. Huang Zhang

    You can also search for this author inPubMed Google Scholar

  4. Minghao Zhao

    You can also search for this author inPubMed Google Scholar

  5. Yuhong Sun

    You can also search for this author inPubMed Google Scholar

  6. Lingkai Ran

    You can also search for this author inPubMed Google Scholar

  7. Tao Li

    You can also search for this author inPubMed Google Scholar

Contributions

All authors contributed to the study conception and design. Yilei Wang put forward the main idea, Yilei Wang, Minghao Zhao and Qingzhe Lv wrote the main manuscript text, Qingzhe Lv, Huang Zhang and Yuhong Sun wrote the main experimental code, Tao Li revised the manuscript text, Lingkai Ran searched for the required literature. All authors reviewed the manuscript.

Corresponding author

Correspondence toTao Li.

Ethics declarations

Ethical Approval and Consent to Participate

The authors guarantee that this manuscript is an original work. This manuscript has not been published or presented elsewhere in part or in entirety and is not under consideration by another journal. We have read and understood your journal’s policies, and we believe that neither the manuscript nor the study violates any of these. All authors have seen and approved the final version of the submitted manuscript.

Consent for Publication

All authors have checked the manuscript and have agreed to the submission

Human and Animal Ethics

The authors declare that this study does not involve human participants or animals.

Competing Interests

The authors have no competing interests to declare that are relevant to the content of this manuscript

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection:Special Issue on Privacy and Security in Machine Learning

Guest Editors: Jin Li, Francesco Palmieri and Changyu Dong

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Wang, Y., Lv, Q., Zhang, H.et al. Beyond model splitting: Preventing label inference attacks in vertical federated learning with dispersed training.World Wide Web26, 2691–2707 (2023). https://doi.org/10.1007/s11280-023-01159-x

Download citation

Keywords

Associated Content

Part of a collection:

Special Issue on Privacy and Security in Machine Learning

Access this article

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (Japan)

Instant access to the full article PDF.

Advertisement


[8]ページ先頭

©2009-2025 Movatter.jp