This repository was archived by the owner on Nov 30, 2022. It is now read-only.

babylonhealth/hmrb (public archive)

License

Apache-2.0 license

Note: This repository is no longer actively maintained by Babylon Health. For any issues and further releases please check bodak/hmrb.

Hammurabi [hmrb] 🏺

Upholds the law for sequences.

1. Installation

To begin, simply install the package from PyPI:

$ pip install hmrb

2. Documentation

Documentation is available at https://hmrb.readthedocs.io. Instructions to build and run locally:

$ pip install -r doc_requirements.txt
$ pip install -e .
$ make docs
$ make html

3. Definitions

Hammurabi works as a rule engine to parse input using a defined set of rules. It uses a simple and readable syntax to define complex rules to handle phrase matching.

The engine takes as input any type of sequence of units with associated attributes. Our current use case is language annotation, but we expect it to work equally well on a variety of complex sequence tasks (time-series, logging).

The attributes do not have to be consistent across all units or between the units and the grammar. The lack of an attribute is simply considered a non-match.
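The "missing attribute is a non-match" behaviour can be pictured with a small standalone sketch. It does not use hmrb and is not its implementation; `unit_matches` is a hypothetical helper that compares one unit against a set of key-value constraints.

```python
def unit_matches(unit, constraints):
    """Return True if every constrained attribute is present with the required value."""
    # An attribute that is absent from the unit can never satisfy a constraint.
    return all(key in unit and unit[key] == value
               for key, value in constraints.items())


units = [
    {"orth": "head", "lemma": "head", "pos": "NOUN"},
    {"orth": "hurts", "lemma": "hurt"},  # this unit carries no "pos" attribute
]

print(unit_matches(units[0], {"lemma": "head", "pos": "NOUN"}))  # True
print(unit_matches(units[1], {"lemma": "hurt", "pos": "VERB"}))  # False: "pos" is missing
print(unit_matches(units[1], {"lemma": "hurt"}))                 # True
```

Units may carry extra attributes that no rule mentions; only the attributes named in the constraints are checked.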

Features:

Attribute level rule definitions using key-value pairs
Efficient matching of sequences using hash tables with no limit on length
Support for nested boolean expressions and wildcard operators similar to regular expressions
Variables can be side-loaded and reused throughout different rule sets
User-defined rule-level callback functions triggered by a match
Labels to tag and retrieve matched sequence segments

3.1 Writing Rules

Rules are defined in a custom syntax. The syntax was defined with the aim of keeping it simple to read, but expressive at the same time.

The basic components are Law and Var. Both Law and Var declare a sequence of attributes. However, while a Law can be matched on its own, a Var defines a sequence that is likely to be reused (a.k.a. a macro) within Laws or other Vars. Since a Var is never matched on its own, it requires a name and only exists as part of a rule body.

The example below shows a fictional case of capturing strings such as "head is hurting" or "head hurts". Note that the variable is_hurting cannot match "is hurting" on its own, because a Var is only matched as part of a Law.

    Var is_hurting:
    (
        optional (lemma: "be")
        (lemma: "hurt")
    )
    Law:
        - package: "headache"
        - callback: "mark_headache"
        - junk_attribute: "some string"
    (
        (lemma: "head", pos: "NOUN")
        $is_hurting
    )
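To make the roles of `optional` and the `$is_hurting` reference concrete, here is a minimal pure-Python sketch of the same rule. It is an illustration only, not how hmrb parses or matches rules: the `(optional, constraints)` pair representation and the helpers `unit_matches` and `match_at` are assumptions made for this example.

```python
def unit_matches(unit, constraints):
    """True if the unit has every constrained attribute with the required value."""
    return all(k in unit and unit[k] == v for k, v in constraints.items())


def match_at(units, pattern, start=0):
    """Return the end index of a match of `pattern` starting at `start`, or None."""
    if not pattern:
        return start
    (optional, constraints), rest = pattern[0], pattern[1:]
    # First try to consume one unit with the current pattern step.
    if start < len(units) and unit_matches(units[start], constraints):
        end = match_at(units, rest, start + 1)
        if end is not None:
            return end
    # An optional step may also be skipped without consuming a unit.
    if optional:
        return match_at(units, rest, start)
    return None


# Hand-written stand-ins for the Var and the Law body shown above.
is_hurting = [(True, {"lemma": "be"}), (False, {"lemma": "hurt"})]
headache = [(False, {"lemma": "head", "pos": "NOUN"})] + is_hurting  # "$is_hurting" spliced in

head_hurts = [
    {"lemma": "head", "pos": "NOUN"},
    {"lemma": "hurt", "pos": "VERB"},
]
head_is_hurting = [
    {"lemma": "head", "pos": "NOUN"},
    {"lemma": "be", "pos": "AUX"},
    {"lemma": "hurt", "pos": "VERB"},
]

print(match_at(head_hurts, headache))       # 2
print(match_at(head_is_hurting, headache))  # 3
print(match_at(head_hurts[1:], headache))   # None
```

Splicing the Var into the Law body mirrors the macro-like reuse described above: the Var contributes steps to a Law but is never tried as a rule by itself.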

3.2 Input format

Hammurabi requires a sequence of attribute dictionaries as input. It will attempt to find matching rules in the given input. The most widely-used input format is a simple JSON list of dictionaries:

    [
        {"orth": "My", "lemma": "my", "pos": "PRON"},
        {"orth": "head", "lemma": "head", "pos": "NOUN"},
        {"orth": "hurts", "lemma": "hurt", "pos": "VERB"}
    ]
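If the input arrives as JSON text, it can be turned into the list of dictionaries with the standard library. The sketch below is one way to do that; `load_units` is a hypothetical helper, not part of hmrb, and the shape check is an extra precaution added for this example.

```python
import json

raw = """[
    {"orth": "My", "lemma": "my", "pos": "PRON"},
    {"orth": "head", "lemma": "head", "pos": "NOUN"},
    {"orth": "hurts", "lemma": "hurt", "pos": "VERB"}
]"""


def load_units(text):
    """Parse JSON text and check that it is a list of attribute dictionaries."""
    units = json.loads(text)
    if not isinstance(units, list) or not all(isinstance(u, dict) for u in units):
        raise ValueError("expected a JSON list of objects")
    return units


input_ = load_units(raw)
print(len(input_), input_[1]["lemma"])  # 3 head
```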

3.3 Callbacks, labels and data

When a rule matches an input, the following information is returned as a "match": the original input, a slice representing the span it was triggered on, and all the data (labels, callback function and attributes) based on the matched rule. There are two ways to act upon these matches. You can delegate the execution of the callback function to hammurabi, or you can do the execution yourself. The former is done by passing the input to the __call__ method, which executes callback functions right after the matches are returned. However, this has a slight drawback: your callback functions need to adhere to a specific signature to allow them to be called correctly from inside hammurabi.

    # callback function called from inside hammurabi
    def mark_headache(input_, slice_, data):
        print(f'I am acting on span "{input_[slice_]}" with data "{data}".')

Callback functions are passed down as a mapping from their string alias, i.e. the name used to refer to them in the callback attribute of a law in the rule grammar, to the function itself.

    callbacks = {'mark_headache': mark_headache}
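The alias-to-function mapping can be exercised without the engine. The following sketch dispatches callbacks by alias for a list of matches. The `((start, end), data)` match shape and the `dispatch` helper are assumptions for illustration; the objects hmrb actually returns may be structured differently.

```python
def mark_headache(input_, slice_, data):
    print(f'I am acting on span "{input_[slice_]}" with data "{data}".')


callbacks = {"mark_headache": mark_headache}


def dispatch(matches, callbacks, input_):
    """Call the function registered under each match's "callback" alias."""
    called = []
    for (start, end), data in matches:
        func = callbacks.get(data.get("callback"))
        if func is None:
            continue  # no function registered for this alias
        func(input_, slice(start, end), data)
        called.append(data["callback"])
    return called


# Hypothetical matches, written by hand for this example.
input_ = ["My", "head", "hurts"]
matches = [
    ((1, 3), {"package": "headache", "callback": "mark_headache"}),
    ((0, 1), {"package": "other", "callback": "not_registered"}),
]
called = dispatch(matches, callbacks, input_)
```

Keeping the lookup in a plain dictionary means the grammar only ever refers to callbacks by name, so the same rule file can be paired with different functions.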

4. Usage

4.1 Worked-out example with callbacks

The rule engine is initialized through a Core instance. We can pass various optional objects to the constructor of Core (callbacks, sets) that we intend to later use in our rules.

The Core.load method adds rules to the engine. It is possible to load multiple rule files sequentially.

The Core library usage pattern allows the user to either get the matches and act on them in a different place through the use of the match method, or to pass a callback mapping and allow hammurabi to execute the callbacks through the use of the __call__ method.

    grammar = """
    Var is_hurting:
    (
        optional (lemma: "be")
        (lemma: "hurt")
    )
    Law:
        - package: "headache"
        - callback: "mark_headache"
        - junk_attribute: "some string"
    (
        (lemma: "head", pos: "NOUN")
        $is_hurting
    )
    """

    input_ = [
        {"orth": "My", "lemma": "my", "pos": "PRON"},
        {"orth": "head", "lemma": "head", "pos": "NOUN"},
        {"orth": "hurts", "lemma": "hurt", "pos": "VERB"},
    ]

    # Library use case
    # (mark_headache and callbacks are the ones defined in section 3.3)
    from hmrb.core import Core

    spans = [(start, input_[start:]) for start in range(len(input_))]

    hmb_ext = Core()
    hmb_ext.load(grammar)

    # external execution
    for span, data in hmb_ext._match(spans):
        print("External execution!!!")
        slice_ = slice(span[0], span[1])
        callbacks[data[0]["callback"]](input_, slice_, data)
    # External execution!!!
    # I am acting on span "head hurts" with data
    # "{
    #      'package': 'headache',
    #      'callback': 'mark_headache',
    #      'junk_attribute': 'some string'
    # }"

    # internal execution
    hmb_int = Core(callbacks={"mark_headache": mark_headache})
    hmb_int.load(grammar)
    hmb_int(input_)
    #  I am acting on span "head hurts" with data
    #  "{
    #       'package': 'headache',
    #       'callback': 'mark_headache',
    #       'junk_attribute': 'some string'
    #  }"

You can find this worked-out example under examples/readme.py.
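The `spans` list built in the example pairs every start offset with the remainder of the input from that offset. The standalone snippet below only prints that structure. Reading it as "one matching attempt per start position" is an interpretation, not something stated here.

```python
input_ = [
    {"orth": "My", "lemma": "my", "pos": "PRON"},
    {"orth": "head", "lemma": "head", "pos": "NOUN"},
    {"orth": "hurts", "lemma": "hurt", "pos": "VERB"},
]

# Same comprehension as in the worked-out example above.
spans = [(start, input_[start:]) for start in range(len(input_))]

for start, suffix in spans:
    print(start, [unit["orth"] for unit in suffix])
# 0 ['My', 'head', 'hurts']
# 1 ['head', 'hurts']
# 2 ['hurts']
```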

4.2 spaCy component example (NLP)

The spaCy component class SpacyCore extends the internal execution shown above to allow the use of hammurabi in spaCy natural language processing pipelines. Optionally, a function (jsonify) can be passed into SpacyCore to convert the Token objects to JSON.

    import spacy
    from hmrb.core import SpacyCore

    # This will be used to turn a span (subsequence) of a spaCy document object
    # into a list of dictionaries input representation.
    def jsonify(span):
        jsn = []
        for token in span:
            jsn.append({
                'orth': token.orth_,
                'lemma': token.lemma_,
                'pos': token.pos_,
                'tag': token.tag_
            })
        return jsn

    hmb = SpacyCore(callbacks={'mark_headache': mark_headache},
                    map_doc=jsonify,
                    sort_length=True)
    hmb.load(grammar)
    nlp = spacy.load('en_core_web_sm')
    # Passing the component object directly is the spaCy v2 calling convention.
    nlp.add_pipe(hmb, last=True)
    nlp('My head hurts')
    #  I am acting on span "head hurts" with data
    #  "{
    #       'package': 'headache',
    #       'callback': 'mark_headache',
    #       'junk_attribute': 'some string'
    #  }"
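The conversion function can be tried without installing spaCy or a language model. The sketch below feeds `jsonify` stand-in objects that expose the same attribute names as spaCy's `Token` (`orth_`, `lemma_`, `pos_`, `tag_`). The `SimpleNamespace` tokens and their values are made up for this example.

```python
from types import SimpleNamespace


def jsonify(span):
    """Turn token-like objects into the list-of-dictionaries input format."""
    return [
        {"orth": t.orth_, "lemma": t.lemma_, "pos": t.pos_, "tag": t.tag_}
        for t in span
    ]


# Stand-ins for spaCy tokens; a real pipeline would supply Token objects.
fake_span = [
    SimpleNamespace(orth_="head", lemma_="head", pos_="NOUN", tag_="NN"),
    SimpleNamespace(orth_="hurts", lemma_="hurt", pos_="VERB", tag_="VBZ"),
]

units = jsonify(fake_span)
print(units[1]["lemma"])  # hurt
```

Only the attributes that the rules refer to need to be copied over; anything a rule asks for that is missing from the dictionaries simply fails to match.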

5. Tests & debugging

To run tests use (this includes setting the correct HASH_SEED):

$ make tests

To display additional information for debugging purposes, use the DEBUG=1 environment variable.

$ DEBUG=1 python example.py

6. Maintainers

Kristian Boda

Sasho Savkov

Maria Lehl


Releases: 6 (latest: Hammurabi v1.2.1, Jan 25, 2022)
