sbintuitions/JMTEB

The evaluation scripts of JMTEB (Japanese Massive Text Embedding Benchmark)

JMTEB is a benchmark for evaluating Japanese text embedding models. It consists of 5 tasks.

This repository provides an easy-to-use evaluation script for JMTEB.

The JMTEB leaderboard is here. If you would like to submit your model, please refer to the submission guideline.

Quick start

git clone git@github.com:sbintuitions/JMTEB
cd JMTEB
poetry install
poetry run pytest tests

The following command evaluates the specified model on all the tasks in JMTEB.

poetry run python -m jmteb \
  --embedder SentenceBertEmbedder \
  --embedder.model_name_or_path "<model_name_or_path>" \
  --save_dir "output/<model_name_or_path>"
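As a concrete illustration, the command below fills the placeholder with a publicly available Japanese embedding model from Hugging Face; the model name is only a hypothetical example and not part of JMTEB itself.

# Hypothetical example: any model path usable by SentenceBertEmbedder can be substituted here.
poetry run python -m jmteb \
  --embedder SentenceBertEmbedder \
  --embedder.model_name_or_path "cl-nagoya/sup-simcse-ja-base" \
  --save_dir "output/cl-nagoya/sup-simcse-ja-base"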

Note

To guarantee the robustness of the evaluation, a validation dataset is required for hyperparameter tuning. For a dataset that doesn't have a validation set, we use the test set as the validation set.

By default, the evaluation tasks are read from src/jmteb/configs/jmteb.jsonnet. If you want to evaluate the model on a specific task, you can specify the task via the --evaluators option with the task config.

poetry run python -m jmteb \
  --evaluators "src/configs/tasks/jsts.jsonnet" \
  --embedder SentenceBertEmbedder \
  --embedder.model_name_or_path "<model_name_or_path>" \
  --save_dir "output/<model_name_or_path>"

Note

Some tasks (e.g., AmazonReviewClassification in classification, JAQKET and Mr.TyDi-ja in retrieval, esci in reranking) are time-consuming and memory-consuming. Heavy retrieval tasks take hours to encode their large corpora and use a lot of memory to store the resulting vectors. If you want to exclude them, add --eval_exclude "['amazon_review_classification', 'mrtydi', 'jaqket', 'esci']". Similarly, you can use --eval_include to include only the evaluation datasets you want.
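For example, a full run that skips these heavy datasets could look like the following sketch, which simply appends --eval_exclude to the base command shown above (the dataset list is taken verbatim from this note):

poetry run python -m jmteb \
  --embedder SentenceBertEmbedder \
  --embedder.model_name_or_path "<model_name_or_path>" \
  --save_dir "output/<model_name_or_path>" \
  --eval_exclude "['amazon_review_classification', 'mrtydi', 'jaqket', 'esci']"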

Note

If you want to log model predictions to further analyze the performance of your model, use --log_predictions true to enable all evaluators to log predictions. You can also control whether to log predictions in the config of each evaluator.
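For instance, prediction logging for a single-task run could be enabled as in the following sketch, which assumes the flag is appended to the single-task command shown earlier:

poetry run python -m jmteb \
  --evaluators "src/configs/tasks/jsts.jsonnet" \
  --embedder SentenceBertEmbedder \
  --embedder.model_name_or_path "<model_name_or_path>" \
  --save_dir "output/<model_name_or_path>" \
  --log_predictions true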

Multi-GPU support

There are two ways to enable multi-GPU evaluation.

  • New class DataParallelSentenceBertEmbedder (here).
poetry run python -m jmteb \
  --evaluators "src/configs/tasks/jsts.jsonnet" \
  --embedder DataParallelSentenceBertEmbedder \
  --embedder.model_name_or_path "<model_name_or_path>" \
  --save_dir "output/<model_name_or_path>"
  • With torchrun, TransformersEmbedder supports multi-GPU encoding. For example:
MODEL_NAME=<model_name_or_path>
MODEL_KWARGS="\{\'torch_dtype\':\'torch.bfloat16\'\}"
torchrun \
    --nproc_per_node=$GPUS_PER_NODE --nnodes=1 \
    src/jmteb/__main__.py --embedder TransformersEmbedder \
    --embedder.model_name_or_path ${MODEL_NAME} \
    --embedder.pooling_mode cls \
    --embedder.batch_size 4096 \
    --embedder.model_kwargs ${MODEL_KWARGS} \
    --embedder.max_seq_length 512 \
    --save_dir "output/${MODEL_NAME}" \
    --evaluators src/jmteb/configs/jmteb.jsonnet

Note that the batch size here is the global batch size (per_device_batch_size × n_gpu).
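For example, with --nproc_per_node=8 and --embedder.batch_size 4096, each GPU encodes 4096 / 8 = 512 sentences per batch.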
