# LUKE -- Language Understanding with Knowledge-based Embeddings
LUKE (Language Understanding with Knowledge-based Embeddings) is a new pretrained contextualized representation of words and entities based on the transformer architecture. It was proposed in our paper *LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention*. It achieves state-of-the-art results on important NLP benchmarks including SQuAD v1.1 (extractive question answering), CoNLL-2003 (named entity recognition), ReCoRD (cloze-style question answering), TACRED (relation classification), and Open Entity (entity typing).
This repository contains the source code to pretrain the model and fine-tune it to solve downstream tasks.
## News

### November 9, 2022: The large version of LUKE-Japanese is available
The large version of LUKE-Japanese is available on the Hugging Face Model Hub as `studio-ousia/luke-japanese-large` and `studio-ousia/luke-japanese-large-lite`.
This model achieves state-of-the-art results on three datasets in JGLUE.
| Model | MARC-ja (acc) | JSTS (Pearson/Spearman) | JNLI (acc) | JCommonsenseQA (acc) |
| --- | --- | --- | --- | --- |
| **LUKE Japanese large** | **0.965** | **0.932/0.902** | **0.927** | 0.893 |
| *Baselines:* | | | | |
| Tohoku BERT large | 0.955 | 0.913/0.872 | 0.900 | 0.816 |
| Waseda RoBERTa large (seq128) | 0.954 | 0.930/0.896 | 0.924 | **0.907** |
| Waseda RoBERTa large (seq512) | 0.961 | 0.926/0.892 | 0.926 | 0.891 |
| XLM RoBERTa large | 0.964 | 0.918/0.884 | 0.919 | 0.840 |
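As a minimal sketch (the model name is the one listed in the released models table later in this README, and the entity span below is illustrative), the model can be loaded with the `transformers` library; the `Auto` classes resolve the concrete LUKE model and tokenizer classes from the configuration stored on the Hub, and the Japanese tokenizer is sentencepiece-based, so the `sentencepiece` package must be installed:

```python
from transformers import AutoModel, AutoTokenizer

# The Auto classes resolve the concrete model/tokenizer classes from the Hub config.
tokenizer = AutoTokenizer.from_pretrained("studio-ousia/luke-japanese-large")
model = AutoModel.from_pretrained("studio-ousia/luke-japanese-large")

text = "森鷗外は明治時代の小説家です。"
entity_spans = [(0, 3)]  # character-based span corresponding to "森鷗外" (illustrative example)

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
outputs = model(**inputs)
entity_representation = outputs.entity_last_hidden_state  # contextualized entity representation
```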
### October 27, 2022: The Japanese version of LUKE is available
The Japanese version of LUKE is now available on the Hugging Face Model Hub as `studio-ousia/luke-japanese-base` and `studio-ousia/luke-japanese-base-lite`.
This model outperforms other base-sized models on four datasets in JGLUE.
| Model | MARC-ja (acc) | JSTS (Pearson/Spearman) | JNLI (acc) | JCommonsenseQA (acc) |
| --- | --- | --- | --- | --- |
| **LUKE Japanese base** | **0.965** | **0.916/0.877** | **0.912** | **0.842** |
| *Baselines:* | | | | |
| Tohoku BERT base | 0.958 | 0.909/0.868 | 0.899 | 0.808 |
| NICT BERT base | 0.958 | 0.910/0.871 | 0.902 | 0.823 |
| Waseda RoBERTa base | 0.962 | 0.913/0.873 | 0.895 | 0.840 |
| XLM RoBERTa base | 0.961 | 0.877/0.831 | 0.893 | 0.687 |
### April 13, 2022: The mLUKE fine-tuning code is available
The example code has been updated. It is now based on allennlp and transformers, and you can reproduce the experiments in the LUKE and mLUKE papers with this implementation. For details, please see the `README.md` under each example directory. The older code used in the LUKE paper has been moved to `examples/legacy`.
### April 13, 2022: The detailed instructions for pretraining LUKE models are available
For those interested in pretraining LUKE models, we explain how to prepare datasets and run the pretraining code in `pretraining.md`.
### November 24, 2021: Entity disambiguation example is available
The example code of entity disambiguation based on LUKE has been added to this repository. This model was originally proposed in our paper and achieved state-of-the-art results on five standard entity disambiguation datasets: AIDA-CoNLL, MSNBC, AQUAINT, ACE2004, and WNED-WIKI. For further details, please refer to `examples/entity_disambiguation`.
### August 3, 2021: New example code based on Hugging Face Transformers and AllenNLP is available
New fine-tuning examples of three downstream tasks, i.e., NER, relation classification, and entity typing, have been added to LUKE. These examples are developed based on Hugging Face Transformers and AllenNLP. The fine-tuning models are defined using simple AllenNLP Jsonnet config files! The example code is available in `examples`.
### May 5, 2021: LUKE is added to Hugging Face Transformers
LUKE has been added to the master branch of the Hugging Face Transformers library. You can now solve entity-related tasks (e.g., named entity recognition, relation classification, and entity typing) easily using this library.

For example, the LUKE-large model fine-tuned on the TACRED dataset can be used as follows:
```python
from transformers import LukeTokenizer, LukeForEntityPairClassification

model = LukeForEntityPairClassification.from_pretrained("studio-ousia/luke-large-finetuned-tacred")
tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-large-finetuned-tacred")

text = "Beyoncé lives in Los Angeles."
entity_spans = [(0, 7), (17, 28)]  # character-based entity spans corresponding to "Beyoncé" and "Los Angeles"

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
outputs = model(**inputs)
logits = outputs.logits
predicted_class_idx = int(logits[0].argmax())
print("Predicted class:", model.config.id2label[predicted_class_idx])
# Predicted class: per:cities_of_residence
```
We also provide the following three Colab notebooks that show how to reproduce our experimental results on the CoNLL-2003, TACRED, and Open Entity datasets using the library:
- Reproducing experimental results of LUKE on CoNLL-2003 Using Hugging Face Transformers
- Reproducing experimental results of LUKE on TACRED Using Hugging Face Transformers
- Reproducing experimental results of LUKE on Open Entity Using Hugging Face Transformers
Please refer to the official documentation for further details.
### November 5, 2021: LUKE-500K (base) model
We released LUKE-500K (base), a new pretrained LUKE model that is smaller than the existing LUKE-500K (large). The experimental results of LUKE-500K (base) and LUKE-500K (large) on SQuAD v1.1 and CoNLL-2003 are as follows:
| Task | Dataset | Metric | LUKE-500K (base) | LUKE-500K (large) |
| --- | --- | --- | --- | --- |
| Extractive Question Answering | SQuAD v1.1 | EM/F1 | 86.1/92.3 | 90.2/95.4 |
| Named Entity Recognition | CoNLL-2003 | F1 | 93.3 | 94.3 |
We tuned only the batch size and learning rate in the experiments based on LUKE-500K (base).
## Comparison with State-of-the-Art

LUKE outperforms the previous state-of-the-art methods on five important NLP tasks:
| Task | Dataset | Metric | LUKE-500K (large) | Previous SOTA |
| --- | --- | --- | --- | --- |
| Extractive Question Answering | SQuAD v1.1 | EM/F1 | 90.2/95.4 | 89.9/95.1 (Yang et al., 2019) |
| Named Entity Recognition | CoNLL-2003 | F1 | 94.3 | 93.5 (Baevski et al., 2019) |
| Cloze-style Question Answering | ReCoRD | EM/F1 | 90.6/91.2 | 83.1/83.7 (Li et al., 2019) |
| Relation Classification | TACRED | F1 | 72.7 | 72.0 (Wang et al., 2020) |
| Fine-grained Entity Typing | Open Entity | F1 | 78.2 | 77.6 (Wang et al., 2020) |
These numbers are reported in our EMNLP 2020 paper.
## Installation

LUKE can be installed using Poetry:
```bash
poetry install

# If you want to run pretraining for LUKE
poetry install --extras "pretraining opennlp"

# If you want to run pretraining for mLUKE
poetry install --extras "pretraining icu"
```
The virtual environment automatically created by Poetry can be activated with `poetry shell`.
### A note on installing torch
The PyTorch version installed via `poetry install` does not necessarily match your hardware. In such a case, see the official PyTorch site and reinstall the correct version with the `pip` command.
```bash
poetry run pip3 uninstall torch torchvision torchaudio

# Example for Linux with CUDA 11.3
poetry run pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
```
## Released Models

Our pretrained models can be used with the `transformers` library. The model documentation can be found at the following links: LUKE and mLUKE.

Currently, the following models are available on the Hugging Face Model Hub.
| Name | model_name | Entity Vocab Size | Params |
| --- | --- | --- | --- |
| LUKE (base) | `studio-ousia/luke-base` | 500K | 253 M |
| LUKE (large) | `studio-ousia/luke-large` | 500K | 484 M |
| mLUKE (base) | `studio-ousia/mluke-base` | 1.2M | 586 M |
| mLUKE (large) | `studio-ousia/mluke-large` | 1.2M | 868 M |
| LUKE Japanese (base) | `studio-ousia/luke-japanese-base` | 570K | 281 M |
| LUKE Japanese (large) | `studio-ousia/luke-japanese-large` | 570K | 562 M |
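As a minimal usage sketch (using the `studio-ousia/luke-base` name from the table above, and the same example sentence as the TACRED snippet earlier), a pretrained model can be loaded to compute contextualized word and entity representations:

```python
from transformers import LukeModel, LukeTokenizer

tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-base")
model = LukeModel.from_pretrained("studio-ousia/luke-base")

text = "Beyoncé lives in Los Angeles."
entity_spans = [(0, 7), (17, 28)]  # character-based entity spans for "Beyoncé" and "Los Angeles"

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
outputs = model(**inputs)

word_representations = outputs.last_hidden_state           # one vector per word token
entity_representations = outputs.entity_last_hidden_state  # one vector per entity span
```

Task-specific heads such as `LukeForEntityPairClassification` (used in the TACRED example above) are documented in the Hugging Face LUKE documentation.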
The entity embeddings cause a large memory footprint because they contain all the Wikipedia entities used in pretraining. However, some downstream tasks (e.g., entity typing, named entity recognition, and relation classification) only need special entity embeddings such as `[MASK]`. Also, you may want to use only the word representations.

With such use cases in mind, we have uploaded lite models that contain only the special entity embeddings. These models perform exactly the same as the full models but have far fewer parameters, which enables fine-tuning on small GPUs.
| Name | model_name | Params |
| --- | --- | --- |
| LUKE (base) | `studio-ousia/luke-base-lite` | 125 M |
| LUKE (large) | `studio-ousia/luke-large-lite` | 356 M |
| mLUKE (base) | `studio-ousia/mluke-base-lite` | 279 M |
| mLUKE (large) | `studio-ousia/mluke-large-lite` | 561 M |
| LUKE Japanese (base) | `studio-ousia/luke-japanese-base-lite` | 134 M |
| LUKE Japanese (large) | `studio-ousia/luke-japanese-large-lite` | 415 M |
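As a brief sketch (using the `studio-ousia/luke-base-lite` name from the table above), a lite model loads through exactly the same API as the full model:

```python
from transformers import LukeModel, LukeTokenizer

# Same API as the full model; only the special entity embeddings (e.g., [MASK])
# are included, which keeps the download and memory footprint small.
tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-base-lite")
model = LukeModel.from_pretrained("studio-ousia/luke-base-lite")
```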
## Fine-tuning

We release the fine-tuning code based on allennlp and transformers under `examples`. You can run fine-tuning experiments very easily with the pre-defined config files and the `allennlp train` command. For details and example commands for each task, please see the task directory under `examples`.
## Pretraining

The detailed instructions for pretraining LUKE models can be found in `pretraining.md`.
## Citation

If you use LUKE in your work, please cite the original paper:
```bibtex
@inproceedings{yamada-etal-2020-luke,
    title = "{LUKE}: Deep Contextualized Entity Representations with Entity-aware Self-attention",
    author = "Yamada, Ikuya and Asai, Akari and Shindo, Hiroyuki and Takeda, Hideaki and Matsumoto, Yuji",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    year = "2020",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.emnlp-main.523",
    doi = "10.18653/v1/2020.emnlp-main.523",
}
```
For mLUKE, please cite this paper:
```bibtex
@inproceedings{ri-etal-2022-mluke,
    title = "m{LUKE}: {T}he Power of Entity Representations in Multilingual Pretrained Language Models",
    author = "Ri, Ryokan and Yamada, Ikuya and Tsuruoka, Yoshimasa",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    year = "2022",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.505",
}
```