Anonymization library for python. Protect the privacy of individuals.

glassonion1/anonypy

Anonymization library for Python. AnonyPy provides the following privacy-preserving anonymization techniques:

  • K Anonymity
  • L Diversity
  • T Closeness

The Anonymization method

  • Anonymization aims to make each individual record indistinguishable from the other records in its group, using generalization and suppression.
  • Turning a dataset into a k-anonymous (and possibly l-diverse or t-close) dataset is hard: finding the optimal partition into k-anonymous groups is an NP-hard problem.
  • AnonyPy uses the "Mondrian" algorithm, which partitions the original data into smaller and smaller groups.
  • The algorithm assumes that all attributes have been converted into numerical or categorical values and that the "span" of a given attribute Xi can be measured.
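The partitioning idea described above can be sketched in plain Python. This is a simplified illustration, not AnonyPy's implementation: it handles numeric attributes only, takes the span to be max minus min, and splits at the median of the widest attribute. The helper names `span`, `mondrian`, and `generalize` are hypothetical; the sample values are `col1` and `col5` from the usage example in this README.

```python
from statistics import median


def span(rows, col):
    """Span of a numeric column: the range of values it takes in `rows`."""
    values = [r[col] for r in rows]
    return max(values) - min(values)


def mondrian(rows, cols, k):
    """Recursively split `rows` at the median of the widest column,
    accepting a split only if both halves keep at least k rows."""
    # Try columns from the widest span to the narrowest.
    for col in sorted(cols, key=lambda c: span(rows, c), reverse=True):
        m = median(r[col] for r in rows)
        left = [r for r in rows if r[col] < m]
        right = [r for r in rows if r[col] >= m]
        if len(left) >= k and len(right) >= k:
            return mondrian(left, cols, k) + mondrian(right, cols, k)
    # No column allows a valid split: this group is a final partition.
    return [rows]


def generalize(partition, cols):
    """Replace each column by the (min, max) range observed in the partition."""
    return {c: (min(r[c] for r in partition), max(r[c] for r in partition))
            for c in cols}


rows = [
    {"col1": 6, "col5": 20}, {"col1": 6, "col5": 30}, {"col1": 8, "col5": 50},
    {"col1": 8, "col5": 45}, {"col1": 8, "col5": 35}, {"col1": 4, "col5": 20},
    {"col1": 4, "col5": 20}, {"col1": 2, "col5": 22}, {"col1": 2, "col5": 32},
]

partitions = mondrian(rows, ["col1", "col5"], k=2)
for part in partitions:
    print(len(part), generalize(part, ["col1", "col5"]))
```

Each split is accepted only when both halves still contain at least k rows, so every final partition satisfies k-anonymity by construction. A fuller version would typically normalize spans across attributes and define a span for categorical columns as well.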

Install

$ pip install anonypy

Usage

import anonypy
import pandas as pd

data = [
    [6, "1", "test1", "x", 20],
    [6, "1", "test1", "x", 30],
    [8, "2", "test2", "x", 50],
    [8, "2", "test3", "w", 45],
    [8, "1", "test2", "y", 35],
    [4, "2", "test3", "y", 20],
    [4, "1", "test3", "y", 20],
    [2, "1", "test3", "z", 22],
    [2, "2", "test3", "y", 32],
]

columns = ["col1", "col2", "col3", "col4", "col5"]
categorical = set(("col2", "col3", "col4"))

df = pd.DataFrame(data=data, columns=columns)

# Mark the categorical columns so they are not treated as numeric.
for name in categorical:
    df[name] = df[name].astype("category")

# Columns to generalize, and the sensitive column to keep as-is.
feature_columns = ["col1", "col2", "col3"]
sensitive_column = "col4"

p = anonypy.Preserver(df, feature_columns, sensitive_column)
rows = p.anonymize_k_anonymity(k=2)

dfn = pd.DataFrame(rows)
print(dfn)

Original data

   col1 col2   col3 col4  col5
0     6    1  test1    x    20
1     6    1  test1    x    30
2     8    2  test2    x    50
3     8    2  test3    w    45
4     8    1  test2    y    35
5     4    2  test3    y    20
6     4    1  test3    y    20
7     2    1  test3    z    22
8     2    2  test3    y    32

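The anonymized data is shown below (2-anonymity is guaranteed). Each row reports how many original records share the given generalized feature values and sensitive value, so the counts for one combination of col1, col2, and col3 always sum to at least 2.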

  col1 col2         col3 col4  count
0  2-4    2        test3    y      2
1  2-4    1        test3    y      1
2  2-4    1        test3    z      1
3  6-8    1  test1,test2    x      2
4  6-8    1  test1,test2    y      1
5    8    2  test3,test2    w      1
6    8    2  test3,test2    x      1
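To make the guarantee concrete, the table above can be checked by hand. The following sketch uses only the standard library and copies the output rows verbatim. The helpers `group_sizes`, `is_k_anonymous`, and `is_l_diverse` are hypothetical names for this illustration and are not part of AnonyPy. The l-diversity check uses the simple "distinct" definition, where each group must contain at least l different sensitive values.

```python
from collections import defaultdict

# Rows of the anonymized table: (col1, col2, col3, col4, count).
anonymized = [
    ("2-4", "2", "test3", "y", 2),
    ("2-4", "1", "test3", "y", 1),
    ("2-4", "1", "test3", "z", 1),
    ("6-8", "1", "test1,test2", "x", 2),
    ("6-8", "1", "test1,test2", "y", 1),
    ("8", "2", "test3,test2", "w", 1),
    ("8", "2", "test3,test2", "x", 1),
]


def group_sizes(rows):
    """Total number of original records per combination of feature values."""
    sizes = defaultdict(int)
    for col1, col2, col3, _sensitive, count in rows:
        sizes[(col1, col2, col3)] += count
    return dict(sizes)


def is_k_anonymous(rows, k):
    """True if every feature-value group covers at least k records."""
    return min(group_sizes(rows).values()) >= k


def is_l_diverse(rows, l):
    """True if every group contains at least l distinct sensitive values."""
    values = defaultdict(set)
    for col1, col2, col3, sensitive, _count in rows:
        values[(col1, col2, col3)].add(sensitive)
    return min(len(v) for v in values.values()) >= l


print(group_sizes(anonymized))
print("2-anonymous:", is_k_anonymous(anonymized, 2))
print("2-diverse:", is_l_diverse(anonymized, 2))
```

The first group holds two records that both have the sensitive value y, so this output is 2-anonymous but not 2-diverse. That gap is what the stricter l-diversity and t-closeness techniques are meant to address.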

Publish to PyPI

$ python -m pip install hatchling wheel twine
$ python -m build --wheel .
$ python -m twine upload dist/*
