Transform JSON output from Senzing SDK for use with graph technologies, semantics, and downstream LLM integration


senzing-garage/sz-semantics

If you are beginning your journey with [Senzing], please start with [Senzing Quick Start guides].

You are in the [Senzing Garage] where projects are "tinkered" on. Although this GitHub repository may help you understand an approach to using Senzing, it's not considered to be "production ready" and is not considered to be part of the Senzing product. Heck, it may not even be appropriate for your application of Senzing!

Transform JSON output from the Senzing SDK for use with graph technologies, semantics, and downstream LLM integration.

Install

This library uses poetry for demos:

poetry update

Otherwise, to use the library:

pip install sz_semantics

For the gRPC server, if you don't already have Senzing and its gRPC server installed, pull the latest Docker container:

docker pull senzing/serve-grpc:latest

Usage: Masking PII

Mask the PII values within Senzing JSON output with tokens which can be substituted back later. For example, mask PII values before calling a remote service (such as an LLM-based chat), then unmask the returned text after the roundtrip, to maintain data privacy.

import json
from sz_semantics import Mask

data: dict = { "ENTITY_NAME": "Robert Smith" }

sz_mask: Mask = Mask()
masked_data: dict = sz_mask.mask_data(data)

masked_text: str = json.dumps(masked_data)
print(masked_text)

unmasked: str = sz_mask.unmask_text(masked_text)
print(unmasked)

For an example, run the demo1.py script with a data file which captures Senzing JSON output:

poetry run python3 demo1.py data/get.json

The two lists Mask.KNOWN_KEYS and Mask.MASKED_KEYS enumerate, respectively:

  • keys for known elements which do not require masking
  • keys for PII elements which require masking

Any other keys encountered will be masked by default and reported as warnings in the logging. Adjust these lists as needed for a given use case.
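To make the key policy above concrete, here is a minimal sketch of token-based masking written from scratch. It is not the sz_semantics implementation: the class ToyMask, its token format, and the contents of the two key sets are all assumptions made for illustration. It passes known keys through, masks listed PII keys, and masks any other string value by default while logging a warning.

```python
import itertools
import json
import logging
import typing

logger = logging.getLogger(__name__)

# Hypothetical key sets for this sketch only; the library's own
# Mask.KNOWN_KEYS and Mask.MASKED_KEYS may hold different entries.
KNOWN_KEYS: typing.Set[str] = {"ENTITY_ID", "DATA_SOURCE"}
MASKED_KEYS: typing.Set[str] = {"ENTITY_NAME", "PHONE_NUMBER"}


class ToyMask:
    """Illustrative stand-in, not the sz_semantics Mask class."""

    def __init__(self) -> None:
        self.tokens: typing.Dict[str, str] = {}  # token -> original value
        self.seen: typing.Dict[str, str] = {}    # original value -> token
        self.counter = itertools.count()

    def _token_for(self, key: typing.Optional[str], value: str) -> str:
        # reuse the same token when a value repeats
        if value not in self.seen:
            token = f"{key}_{next(self.counter)}"
            self.seen[value] = token
            self.tokens[token] = value
        return self.seen[value]

    def mask_data(self, data: typing.Any, key: typing.Optional[str] = None) -> typing.Any:
        if isinstance(data, dict):
            return {k: self.mask_data(v, k) for k, v in data.items()}
        if isinstance(data, list):
            return [self.mask_data(v, key) for v in data]
        # simplification: only string values are masked in this sketch
        if key in KNOWN_KEYS or not isinstance(data, str):
            return data
        if key not in MASKED_KEYS:
            logger.warning("unknown key %s masked by default", key)
        return self._token_for(key, data)

    def unmask_text(self, text: str) -> str:
        # replace longer tokens first so X_10 is not clobbered by X_1
        for token in sorted(self.tokens, key=len, reverse=True):
            text = text.replace(token, self.tokens[token])
        return text


data = {"ENTITY_ID": 1, "ENTITY_NAME": "Robert Smith", "NICKNAME": "Bob"}
toy_mask = ToyMask()
masked_text = json.dumps(toy_mask.mask_data(data))
roundtrip = toy_mask.unmask_text(masked_text)
```

In this sketch NICKNAME appears in neither set, so its value is masked and a warning is logged, which mirrors the default behavior described above.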

For work with large numbers of entities, subclass KeyValueStore to provide a distributed key-value store (other than the Python built-in dict default) to use for scale-out.
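As a rough illustration of swapping the built-in dict for an external store, the sketch below implements a dict-like mapping backed by SQLite from the standard library. Everything here is an assumption: SqliteStore is a hypothetical class, SQLite merely stands in for a genuinely distributed store, and the methods which a KeyValueStore subclass must actually override should be checked against the library source.

```python
import sqlite3
import typing
from collections.abc import MutableMapping


class SqliteStore(MutableMapping):
    """Hypothetical dict-like store; not the library's KeyValueStore API."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)"
        )

    def __getitem__(self, key: str) -> str:
        row = self.conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]

    def __setitem__(self, key: str, value: str) -> None:
        # the connection context manager commits on success
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
            )

    def __delitem__(self, key: str) -> None:
        with self.conn:
            cur = self.conn.execute("DELETE FROM kv WHERE k = ?", (key,))
        if cur.rowcount == 0:
            raise KeyError(key)

    def __iter__(self) -> typing.Iterator[str]:
        rows = self.conn.execute("SELECT k FROM kv ORDER BY k").fetchall()
        return iter([row[0] for row in rows])

    def __len__(self) -> int:
        return self.conn.execute("SELECT COUNT(*) FROM kv").fetchone()[0]


store = SqliteStore()
store["TOKEN_0"] = "Robert Smith"
store["TOKEN_1"] = "Bob"
del store["TOKEN_1"]
```

Because the class derives from MutableMapping, membership tests, get(), and items() come for free once the five methods above exist, so code written against a plain dict can use it unchanged.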

Usage: gRPC Client/Server

To use SzClient to simplify access to the Senzing SDK, first launch the serve-grpc container and run it in the background:

docker run -it --publish 8261:8261 --rm senzing/serve-grpc

For example code which runs entity resolution on the "truthset" collection of datasets:

import pathlib
import tomllib
import typing

from sz_semantics import SzClient

with open(pathlib.Path("config.toml"), mode = "rb") as fp:
    config: dict = tomllib.load(fp)

data_sources: typing.Dict[ str, str ] = {
    "CUSTOMERS": "data/truth/customers.json",
    "WATCHLIST": "data/truth/watchlist.json",
    "REFERENCE": "data/truth/reference.json",
}

sz: SzClient = SzClient(config, data_sources)
sz.entity_resolution(data_sources)

for ent_json in sz.sz_engine.export_json_entity_report_iterator():
    print(ent_json)

For a demo of running entity resolution on the "truthset", run the demo2.py script:

poetry run python3 demo2.py

This produces the export.json file which is JSONL representing the results of a "get entity" call on each resolved entity.
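Since the export is JSONL, each line can be parsed on its own. The sketch below counts the records behind each resolved entity. It assumes that each line holds a RESOLVED_ENTITY object with ENTITY_ID and RECORDS fields, as in typical Senzing "get entity" output; the sample entities, record IDs, and helper names are made up for illustration.

```python
import io
import json
import typing

# Made-up sample in the assumed "get entity" shape; real exports
# carry many more fields per entity.
SAMPLE_ENTITIES: typing.List[dict] = [
    {"RESOLVED_ENTITY": {
        "ENTITY_ID": 1,
        "ENTITY_NAME": "Robert Smith",
        "RECORDS": [
            {"DATA_SOURCE": "CUSTOMERS", "RECORD_ID": "1001"},
            {"DATA_SOURCE": "CUSTOMERS", "RECORD_ID": "1002"},
        ],
    }},
    {"RESOLVED_ENTITY": {
        "ENTITY_ID": 2,
        "ENTITY_NAME": "Jane Doe",
        "RECORDS": [{"DATA_SOURCE": "WATCHLIST", "RECORD_ID": "2001"}],
    }},
]
SAMPLE_JSONL: str = "\n".join(json.dumps(ent) for ent in SAMPLE_ENTITIES) + "\n"


def iter_entities(fp: typing.TextIO) -> typing.Iterator[dict]:
    """Yield one parsed JSON object per non-empty line."""
    for line in fp:
        line = line.strip()
        if line:
            yield json.loads(line)


def records_per_entity(fp: typing.TextIO) -> typing.Dict[int, int]:
    """Map each entity ID to the number of records resolved into it."""
    counts: typing.Dict[int, int] = {}
    for ent in iter_entities(fp):
        resolved = ent["RESOLVED_ENTITY"]
        counts[resolved["ENTITY_ID"]] = len(resolved.get("RECORDS", []))
    return counts


# With a real export, pass an open file handle instead of the in-memory sample.
counts = records_per_entity(io.StringIO(SAMPLE_JSONL))
```

Reading line by line keeps memory use flat regardless of how many entities the export contains.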

Note: to show the redo processing, be sure to restart the container each time before re-running the demo2.py script -- although the entity resolution results will be the same even without a container restart.

Usage: Semantic Representation

Starting with a small SKOS-based taxonomy in the domain.ttl file, parse the Senzing entity resolution (ER) results to generate an RDFlib semantic graph.

In other words, generate the "backbone" for constructing an Entity Resolved Knowledge Graph, as a core component of a semantic layer.

The example code below serializes the thesaurus generated from Senzing ER results as "thesaurus.ttl" combined with the Senzing taxonomy definitions, which can be used for constructing knowledge graphs:

import pathlib

from sz_semantics import Thesaurus

thesaurus: Thesaurus = Thesaurus()
thesaurus.load_source(Thesaurus.DOMAIN_TTL)

export_path: pathlib.Path = pathlib.Path("data/truth/export.json")

with open(export_path, "r", encoding = "utf-8") as fp_json:
    for line in fp_json:
        for rdf_frag in thesaurus.parse_iter(line, language = "en"):
            thesaurus.load_source_text(
                Thesaurus.RDF_PREAMBLE + rdf_frag,
                format = "turtle",
            )

thesaurus_path: pathlib.Path = pathlib.Path("thesaurus.ttl")
thesaurus.save_source(thesaurus_path, format = "turtle")

For an example, run the demo3.py script to process the JSON file data/truth/export.json which captures Senzing ER exported results:

poetry run python3 demo3.py data/truth/export.json

Check the resulting RDF definitions in the generated thesaurus.ttl file.

License and Copyright

Source code for sz_semantics plus any logo, documentation, and examples have an Apache license which is succinct and simplifies use in commercial applications.

All materials herein are Copyright © 2025 Senzing, Inc.

Kudos to @brianmacy, @jbutcher21, @docktermj, @cj2001, @jesstalisman-ia, and the kind folks at GraphGeeks for their support.
