ELEVANT: Entity Linking Evaluation and Analysis Tool
ELEVANT is a tool that helps you evaluate, analyse and compare entity linking systems in detail. You can explore a demo instance of the ELEVANT web app at https://elevant.cs.uni-freiburg.de/. If you are using ELEVANT for your research, please cite our paper "ELEVANT: A Fully Automatic Fine-Grained Entity Linking Evaluation and Analysis Tool".
For the ELEVANT instance of the EMNLP 2023 paper "A Fair and In-Depth Evaluation of Existing End-to-End Entity Linking Systems", see https://elevant.cs.uni-freiburg.de/emnlp2023.
We summarized the most important information and instructions in this README. For further information, please check our Wiki. For a quick setup guide without lengthy explanations, see Quick Start.
Get the code, and build and start the docker container:
git clone https://github.com/ad-freiburg/elevant.git .
docker build -t elevant .
docker run -it -p 8000:8000 \
    -v <data_directory>:/data \
    -v $(pwd)/evaluation-results/:/home/evaluation-results \
    -v $(pwd)/benchmarks/:/home/benchmarks \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v $(pwd)/wikidata-types/:/home/wikidata-types \
    -e WIKIDATA_TYPES_PATH=$(pwd) elevant
where <data_directory> is the directory in which the required data files will be stored. What these data files are and how they are generated is explained in section Get the Data. Make sure you can read from and write to all directories that are being mounted as volumes from within the docker container (i.e. your <data_directory>, evaluation-results and benchmarks), for example (if security is not an issue) by giving all users read and write permissions to the directories in question with:
chmod a+rw -R <data_directory> evaluation-results/ benchmarks/ wikidata-types/
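For example, if <data_directory> is the hypothetical path /local/data/elevant, the permissions command would be:
chmod a+rw -R /local/data/elevant evaluation-results/ benchmarks/ wikidata-types/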
All the following commands should be run inside the docker container. If you want to use the system without docker, follow the instructions in Setup without Docker before continuing with the next section.
Note: If you want to use a custom knowledge base instead of Wikidata/Wikipedia/DBpedia, you can skip this step and instead follow the instructions in Using a Custom Knowledge Base.
For linking entities in text or evaluating the output of a linker, our system needs information about entities and mention texts, e.g. entity names, aliases, popularity scores, types, the frequency with which a mention is linked to a certain article in Wikipedia, etc. This information is stored in and read from several files. Since these files are too large to upload to GitHub, you can either download them from our servers (fast) or build them yourself (slow, RAM intensive, but the resulting files will be based on recent Wikidata and Wikipedia dumps).
To download the files from our servers, simply run
make download_all
This will automatically run make download_wikidata_mappings, make download_wikipedia_mappings and make download_entity_types_mapping, which will download the compressed files, extract them and move them to the correct location. See Mapping Files for a description of the files downloaded in these steps.
NOTE: This will overwrite existing Wikidata and Wikipedia mappings in your <data_directory>, so make sure this is what you want to do.
If you would rather build the mappings yourself, you can run make generate_all to generate all required files, or alternatively replace only a specific download command by the corresponding generate command. See Data Generation for more details.
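For example, to download the Wikidata mappings and the entity types mapping but build the Wikipedia mappings yourself, the calls could look like the following (assuming the generate targets mirror the names of the download targets listed above; check the Makefile or Data Generation for the exact target names):
make download_wikidata_mappings download_entity_types_mapping
make generate_wikipedia_mappings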
To start the evaluation web app, run
make start_webapp
You can then access the web app at http://0.0.0.0:8000/.
The evaluation results table contains one row for each experiment. In ELEVANT, an experiment is a run of a particular entity linker with particular linker settings on a particular benchmark. We already added a few experiments, including oracle predictions (perfect linking results generated from the ground truth), so you can start exploring the web app right away. The section Add a Benchmark explains how you can add more benchmarks and the section Add an Experiment explains how you can add more experiments yourself.
See Evaluation Web App for a detailed overview of the web app's features.
You can easily add a benchmark if you have a benchmark file that is in the JSONL format ELEVANT uses internally, in the common NLP Interchange Format (NIF), in the IOB-based format used by Hoffart et al. for their AIDA/CoNLL benchmark, or in a very simple JSONL format. Benchmarks in other formats have to be converted into one of these formats first.
To add a benchmark, simply run
python3 add_benchmark.py <benchmark_name> -bfile <benchmark_file> -bformat <ours|nif|aida-conll|simple-jsonl>
This converts the <benchmark_file> into our JSONL format (if it is not in this format already), annotates ground truth labels with their Wikidata label and Wikidata types, and writes the result to the file benchmarks/<benchmark_name>.benchmark.jsonl.
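For example, the following command (using a hypothetical benchmark name and file path) adds a benchmark that is available in the simple JSONL format:
python3 add_benchmark.py my-benchmark -bfile /data/my_benchmark.jsonl -bformat simple-jsonl
This writes the converted benchmark to benchmarks/my-benchmark.benchmark.jsonl.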
The benchmark can now be linked with a linker of your choice using the link_benchmark.py script with the parameter -b <benchmark_name>. See section Add an Experiment for details on how to link a benchmark and the supported formats.
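For example, to link the hypothetical benchmark added above with ELEVANT's baseline linker (see the next section and Linkers for the available linkers):
python3 link_benchmark.py Baseline -l baseline -b my-benchmark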
See How To Add A Benchmark for more details on adding a benchmark, including a description of the supported file formats.
Many popular entity linking benchmarks are already included in ELEVANT and can be used with ELEVANT's scripts out of the box. See Benchmarks for a list of these benchmarks.
You can add an experiment, i.e. a row in the table for a particular benchmark, in two steps: 1) link the benchmark articles and 2) evaluate the linking results. Both steps are explained in the following two sections.
To link the articles of a benchmark with a single linker configuration, use the script link_benchmark.py:
python3 link_benchmark.py <experiment_name> -l <linker_name> -b <benchmark_name>
The linking results will be written to evaluation-results/<linker_name>/<adjusted_experiment_name>.<benchmark_name>.linked_articles.jsonl, where <adjusted_experiment_name> is <experiment_name> in lowercase and with all characters other than [a-z0-9-] replaced by _. For example
python3 link_benchmark.py Baseline -l baseline -b kore50
will create the file evaluation-results/baseline/baseline.kore50.linked_articles.jsonl. The result file contains one article as JSON object per line. Each JSON object contains benchmark article information such as the article title, text, and ground truth labels, as well as the entity mentions produced by the specified linker. <experiment_name> is the name that will be displayed in the first column of the evaluation results table in the web app.
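To illustrate the name adjustment described above with a hypothetical experiment name containing spaces and upper-case letters,
python3 link_benchmark.py "My Baseline v2" -l baseline -b kore50
would write its results to evaluation-results/baseline/my_baseline_v2.kore50.linked_articles.jsonl, while "My Baseline v2" is displayed in the web app.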
Alternatively, you can use a NIF API as is used by GERBIL to link benchmark articles. For this, you need to provide the URL of the NIF API endpoint as follows:
python3 link_benchmark.py <experiment_name> -api <api_url> -pname <linker_name> -b <benchmark_name>
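For example, if a NIF-based linker is served at a hypothetical local endpoint, the call could look like this:
python3 link_benchmark.py "My NIF Linker" -api http://localhost:8080/nif -pname my-nif-linker -b kore50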
See Linking Benchmark Articles for more information on how to link benchmark articles, including information on how you can transform your existing linking result files into our format, and instructions for how to link multiple benchmarks using multiple linkers with a single command.
See Linkers for a list of linkers that can be used out of the box with ELEVANT. These include, for example, ReFinED, OpenAI's GPT (you'll need an OpenAI API key for that), REL, TagMe (you'll need an access token for that, which can be obtained easily and free of cost) and DBpedia Spotlight.
To evaluate a linker's predictions, use the script evaluate.py:
python3 evaluate.py <path_to_linking_result_file>
This will print precision, recall and F1 scores and create two new files, where the .linked_articles.jsonl file extension is replaced by .eval_cases.jsonl and .eval_results.json respectively. For example
python3 evaluate.py evaluation-results/baseline/baseline.kore50.linked_articles.jsonl
will create the files evaluation-results/baseline/baseline.kore50.eval_cases.jsonl and evaluation-results/baseline/baseline.kore50.eval_results.json. The eval_cases file contains information about each true positive, false positive and false negative case. The eval_results file contains the scores that are shown in the web app's evaluation results table.
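If you want a quick look at the resulting scores outside the web app, you can pretty-print the JSON file, for example with Python's built-in json.tool module:
python3 -m json.tool evaluation-results/baseline/baseline.kore50.eval_results.json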
In the web app, simply reload the page (you might have to disable caching) and the experiment will show up as a row in the evaluation results table for the corresponding benchmark.
See Evaluating Linking Results for instructions on how to evaluate multiple linking results with a single command.
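Independently of that built-in mechanism, a simple shell loop is one way to evaluate several existing result files, e.g. all linking results for the kore50 benchmark (a sketch, assuming the file naming scheme described above):
for f in evaluation-results/*/*.kore50.linked_articles.jsonl; do python3 evaluate.py "$f"; done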
If you want to remove an experiment from the web app, simply (re)move the corresponding .linked_articles.jsonl, .eval_cases.jsonl and .eval_results.json files from the evaluation-results/<linker_name>/ directory and reload the web app (again disabling caching).
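For example, to set aside the baseline experiment on kore50 created above (assuming no other experiments share that file name prefix):
mkdir -p removed-experiments
mv evaluation-results/baseline/baseline.kore50.* removed-experiments/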