# Using A Custom Knowledge Base
This section explains the steps you need to take if you want to use ELEVANT to evaluate linking results for linkers and benchmarks that link to a custom knowledge base or ontology.

Note that some features are not available when using a custom knowledge base. For example, some error categories like metonyms, demonyms (which might not even make sense for your knowledge base) and rare errors cannot be evaluated separately.
Instead of downloading the data files using the `make download-all` command, perform the following steps within the docker container to set up ELEVANT for your custom KB:
Remove all subdirectories in `evaluation-results/` and all contents of the `benchmarks/` directory:

```
rm -r evaluation-results/*
rm benchmarks/*
```

The evaluation results and benchmarks contained in these folders by default are targeted at Wikidata / Wikipedia / DBpedia.
Run the Python script `scripts/extract_custom_mappings.py` to extract the necessary name and type mappings from your KB. For this script to work, your KB must be in the turtle (ttl) format.

```
python3 scripts/extract_custom_mappings.py <custom_kb_in_ttl_format> --name_predicate <predicate_for_entity_name> --type_predicate <predicate_for_entity_type>
```

By default, the predicate used to extract the entity name is `http://www.w3.org/2004/02/skos/core#prefLabel` and the predicate used to extract the entity type is `http://www.w3.org/2000/01/rdf-schema#subClassOf`.
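For illustration, here is a minimal turtle fragment that these default predicates would apply to; the `ex:` IRIs and labels are invented placeholders (loosely modeled on the EMMO example below), not part of ELEVANT or any real ontology:

```
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.org/kb#> .

# An entity with a human-readable name (skos:prefLabel)
# and a type (rdfs:subClassOf):
ex:GlassTransitionTemperature
    skos:prefLabel "Glass Transition Temperature" ;
    rdfs:subClassOf ex:MechanicalProperty .

ex:MechanicalProperty
    skos:prefLabel "Mechanical Property" .
```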
This will create three tsv files in `<data_directory>/custom-mappings/`:

- `entity_to_name.tsv`, where the first column contains the entity URI and the second column contains the entity name, e.g.

      http://emmo.info/emmo/domain/fatigue#EMMO_c502dbc5-3a11-50c7-baf5-f3ef9c4fe636	Glass Transition Temperature

- `entity_to_types.tsv`, where the first column contains the entity URI and all further columns contain URIs of types of this entity, e.g.

      http://emmo.info/emmo/domain/fatigue#EMMO_6e8610b1-1717-53ff-a2ac-3d48950773fc	http://emmo.info/emmo/domain/fatigue#EMMO_15a16e99-19cb-5d5e-84d0-b74029837f28	http://emmo.info/emmo/domain/fatigue#EMMO_cfe4071d-224e-5ae9-abe5-083dc57ee6f9

- `whitelist_types.tsv`, where the first column contains the URI of an entity type and the second column contains the name for the entity type, e.g.

      http://emmo.info/emmo/domain/fatigue#EMMO_15a16e99-19cb-5d5e-84d0-b74029837f28	Mechanical Property

In the web app you will then be able to see evaluation results for each of these whitelist types individually. You can manually filter the set of whitelist types (this is especially important if you have a lot of entity types, e.g. > 50, because then your web app will become cluttered), but then make sure to only include types in the `entity_to_types.tsv` file that are included in this whitelist.
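These mapping files are plain tab-separated text, so they are straightforward to generate or post-process yourself. As a minimal sketch, the following Python snippet writes the example entries from this section; the local `custom-mappings` directory is a stand-in for `<data_directory>/custom-mappings/`:

```python
import csv
import os

# Stand-in for <data_directory>/custom-mappings/
mapping_dir = "custom-mappings"
os.makedirs(mapping_dir, exist_ok=True)

prefix = "http://emmo.info/emmo/domain/fatigue#"

# entity URI -> entity name
entity_to_name = [
    (prefix + "EMMO_c502dbc5-3a11-50c7-baf5-f3ef9c4fe636",
     "Glass Transition Temperature"),
]
# entity URI followed by one or more type URIs
entity_to_types = [
    (prefix + "EMMO_6e8610b1-1717-53ff-a2ac-3d48950773fc",
     prefix + "EMMO_15a16e99-19cb-5d5e-84d0-b74029837f28",
     prefix + "EMMO_cfe4071d-224e-5ae9-abe5-083dc57ee6f9"),
]
# type URI -> type name
whitelist_types = [
    (prefix + "EMMO_15a16e99-19cb-5d5e-84d0-b74029837f28",
     "Mechanical Property"),
]

for filename, rows in [("entity_to_name.tsv", entity_to_name),
                       ("entity_to_types.tsv", entity_to_types),
                       ("whitelist_types.tsv", whitelist_types)]:
    with open(os.path.join(mapping_dir, filename), "w", newline="") as f:
        csv.writer(f, delimiter="\t").writerows(rows)
```

The same pattern can be used in reverse (reading with `csv.reader(f, delimiter="\t")`) to filter `entity_to_types.tsv` against a manually reduced whitelist.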
If you don't have your knowledge base in ttl format or can't use the script for other reasons, it is enough to create the three tsv files mentioned above yourself and move them to the directory `<data_directory>/custom-mappings/`. A real-world example of how to do this is given in Using A Custom Knowledge Base Example.

To add a benchmark that links mentions to your custom knowledge base, run the `add_benchmark.py` script with the option `-c` (for custom KB). The supported benchmark formats for custom KB benchmarks are `nif` and `simple-jsonl`, e.g.

```
python3 add_benchmark.py <benchmark_name> -bfile <benchmark_file> -bformat <nif|simple-jsonl> -c
```

See How To Add A Benchmark for more detailed information on adding a benchmark and the benchmark formats.
To add linking results for such a benchmark to ELEVANT, run the `link_benchmark.py` script with the option `-c`. The supported linking results formats for custom KB linking results are `nif` and `simple-jsonl`, e.g.

```
python3 link_benchmark.py <experiment_name> -pfile <linking_results_file> -pformat <nif|simple-jsonl> -b <benchmark_name> -c
```

See How To Add An Experiment for more detailed information on adding linking results to ELEVANT and the linking results formats.
To evaluate the linking results, run the `evaluate.py` script with the option `-c`, e.g.

```
python3 evaluate.py <linking_result_file> -c
```

where `<linking_result_file>` is the file generated in the previous step. See Evaluating Linking Results for more detailed information.

Before you start the web app for the first time, run the following command in the `evaluation-webapp` directory:

```
ln -s <data-directory>/custom-mappings/whitelist_types.tsv whitelist_types.tsv
```
You can then start the web app with

```
make start-webapp
```

and inspect your linking results at http://0.0.0.0:8000/.