
Privacy-Preserving Data Sharing by Integrating Perturbed Distance Matrices

  • Original Research
  • Published:
SN Computer Science

Abstract

Collecting large amounts of data helps machine learning produce less biased models. In many cases, however, similar data are distributed among organizations, and integrating them is difficult owing to privacy and cost concerns. Integrating such distributed data without handing over the original data leads to the concept of data collaboration, which combines data held by different organizations in a secure manner. We propose a method that shares a distance matrix of the original data, obtained using data common to the organizations, in order to learn neighbor information of the original data. Specifically, the proposed method robustly integrates distributed data, achieving quality as good as that of the connected raw data, even when each organization holds only a small amount of data and the data bias is large. The method is also applicable to data contaminated by noise. To demonstrate its effectiveness, we performed a classification task on open biological data divided into several pieces and found that classification on the divided data was as precise as when all data were available. Finally, we show that, as a by-product, the method's robustness against noise improves the anonymity of the original data.
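
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details are not given in this preview. It assumes that "common data" means a set of anchor points visible to every organization, that each organization shares only a noise-perturbed matrix of Euclidean distances from its private samples to those anchors, and that a plain k-nearest-neighbor vote stands in for the classifier. All names (`pairwise_dist`, `perturb`, `knn_predict`, `make_party`), the Gaussian noise model, and the synthetic data are hypothetical.

```python
import numpy as np


def pairwise_dist(A, B):
    """Euclidean distance from every row of A to every row of B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def perturb(D, scale, rng):
    """Add Gaussian noise to a distance matrix (assumed perturbation model)."""
    return D + rng.normal(0.0, scale, size=D.shape)


def knn_predict(train_feats, train_labels, test_feats, k=3):
    """Majority vote among the k nearest training rows."""
    d = pairwise_dist(test_feats, train_feats)
    nearest = np.argsort(d, axis=1)[:, :k]
    votes = train_labels[nearest]
    return np.array([np.bincount(v).argmax() for v in votes])


rng = np.random.default_rng(0)
anchors = rng.normal(size=(20, 5))  # assumed common data, visible to all parties


def make_party(n):
    """Synthetic private data: two classes separated by a mean shift."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 5)) + 4.0 * y[:, None]
    return X, y


X_a, y_a = make_party(30)  # organization A (raw data never leaves A)
X_b, y_b = make_party(30)  # organization B (raw data never leaves B)

# Each party shares only its perturbed distance-to-anchor matrix.
shared_a = perturb(pairwise_dist(X_a, anchors), 0.1, rng)
shared_b = perturb(pairwise_dist(X_b, anchors), 0.1, rng)

# The integrator stacks the shared matrices and uses them as features.
features = np.vstack([shared_a, shared_b])
labels = np.concatenate([y_a, y_b])
```

In this sketch the integrator sees only distances to the anchors, never the raw feature vectors, and a new sample is classified by sending its own perturbed distance row to `knn_predict`. Whether the published method uses the distance matrix this way is an assumption.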



Acknowledgements

The present study was supported in part by the New Energy and Industrial Technology Development Organization (NEDO) and by the Japan Society for the Promotion of Science (JSPS), Grants-in-Aid for Scientific Research Nos. 19K12198, 17H03280 and JST MIRAI JPMJMI19B.

Author information

Authors and Affiliations

  1. Division of Policy and Planning Sciences, Faculty of Engineering, Information and Systems, University of Tsukuba, 1-1-1 Tennoudai, Tsukuba, Ibaraki, Japan

    Hanten Chang

  2. Faculty of Engineering, Information and Systems, University of Tsukuba, 1-1-1 Tennoudai, Tsukuba, Ibaraki, Japan

    Hiroyasu Ando

Corresponding author

Correspondence to Hanten Chang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Privacy, Data Protection and Digital Identity” guest edited by Fernando Boavida, Andrea Praitano and Georgios V. Lioudakis.

