Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up

An Earley algorithm implementation which uses task parallelism.

License

NotificationsYou must be signed in to change notification settings

jfeser/earley

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

We present the LATE algorithm, an asynchronous variant of the Earley algorithm for parsing context free grammars. The Earley algorithm is naturally task based, but is difficult to parallelize because of dependencies between the tasks. We present the LATE algorithm, which uses additional datastructures to maintain information about the state of the parse so that work items may be processed in any order. This property allows the LATE algorithm to be sped up using task parallelism. We show that the LATE algorithm can achieve a 120x speedup over the Earley algorithm on a natural language task.

https://arxiv.org/abs/1807.05642

Building

  1. InstallThreading Building Blocks (TBB) 2018 Update 3.
  2. SetTBB_PREFIX in the Makefile to the correct installation prefix for TBB. It defaults to/opt/intel/tbb.
  3. SetCXX in the Makefile to your preferred compiler. It defaults toclang++-5.0.
  4. Runmake

All the relevant datasets exist in the repository, but the weak scaling benchmark inputs can be regenerated by runninggen-weak-scaling.sh.

Running

The primary interface to the parsing algorithms is a tool calledparse. It parses the input and outputs timing information in CSV format.

Usage: parse [-n N] [-p PARSER] GRAMMAR CORPUSArguments:  -n N        The number of threads. If not provided, jobs will be run from 1 to              # cores threads.  -p PARSER   The parsing algorithm to use. Must be one of: earley_serial, late_serial,               late_parallel. If not provided, all implementations will be run.  GRAMMAR     The grammar to parse.  CORPUS      The input sentence. Expected to be a space separated list of tokens.

ambiguate.py is a helper program which takes a seed grammar and generates a more ambiguous grammar which parses the same language.

Usage: ambiguate.py GRAMMAR NArguments:  GRAMMAR     The grammar to transform.  N           The number of times to replicate each nonterminal.

Running benchmarks

Run the benchmarks withrun-bench.sh. Results will be generated in theresults/ folder.Analysis.ipynb can be used to generate the plots.

About

An Earley algorithm implementation which uses task parallelism.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

[8]ページ先頭

©2009-2025 Movatter.jp