
BRIGHT

Website | Paper | Data (42k downloads)


[Figure: Overview of the BRIGHT benchmark]

📢 Updates

💾 Installation

On your local machine, we recommend first creating a virtual environment:

conda create -n bright python=3.10
conda activate bright
git clone https://github.com/xlang-ai/BRIGHT
cd BRIGHT
conda install -n bright -c conda-forge openjdk=22
sudo dpkg -i
pip install -r requirements.txt

That will create the environment bright with all the required packages installed.

🤗 Data

BRIGHT comprises 12 diverse datasets, spanning biology, economics, robotics, math, code, and more. The queries can be long StackExchange posts or math and code questions. The documents can be blogs, news articles, reports, etc. See the Huggingface page for more details.
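For a quick look at the data outside of run.py, the splits can be loaded with the Hugging Face datasets library. The snippet below is a minimal sketch, assuming the data is published as xlang-ai/BRIGHT with an "examples" and a "documents" configuration, each split by task name, and that each example exposes query and gold_ids fields (these field names are assumptions, not taken from the repository):

# Minimal sketch, assuming the xlang-ai/BRIGHT dataset layout described above.
from datasets import load_dataset

examples = load_dataset("xlang-ai/BRIGHT", "examples")["biology"]    # queries + gold document ids
documents = load_dataset("xlang-ai/BRIGHT", "documents")["biology"]  # retrieval corpus

print(examples[0]["query"])     # assumed field name
print(examples[0]["gold_ids"])  # assumed field name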

📊 Evaluation

We evaluate 13 representative retrieval models of diverse sizes and architectures. Run the following command to get results:

python run.py --task {task} --model {model}
  • --task: the task/dataset to evaluate. It can be one of biology, earth_science, economics, psychology, robotics, stackoverflow, sustainable_living, leetcode, pony, aops, theoremqa,
  • --model: the model to evaluate. The current implementation supports bm25, cohere, e5, google, grit, inst-l, inst-xl, openai, qwen, sbert, sf, voyage, and bge.
    Optional:
  • --long_context: whether to evaluate in the long-context setting; defaults to False
  • --query_max_length: the maximum length for the query
  • --doc_max_length: the maximum length for the document
  • --encode_batch_size: the encoding batch size
  • --output_dir: the directory to output results
  • --cache_dir: the directory to cache document embeddings
  • --config_dir: the directory of instruction configurations
  • --checkpoint: the specific checkpoint to use
  • --key: key for proprietary models
  • --debug: whether to turn on the debug mode and load only a few documents
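Putting the options together, an example evaluation run might look like the following. The directory names here are placeholders, and the --key value is a placeholder for your own API key when evaluating a proprietary model:

python run.py --task biology --model bm25 --output_dir outputs --cache_dir cache
python run.py --task economics --model openai --key YOUR_API_KEY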

🔍 Add a custom model?

It is easy to evaluate custom models on BRIGHT. Just implement the following function in retrievers.py and add it to the mapping RETRIEVAL_FUNCS:

def retrieval_model_function_name(queries, query_ids, documents, doc_ids, excluded_ids, **kwargs):
    ...
    return scores

where scores is in the format:

{"query_id_1": {"doc_id_1": score_1,"doc_id_2": score_2,    ..."doc_id_n": socre_n  },  ..."query_id_m": {"doc_id_1": score_1,"doc_id_2": score_2,    ..."doc_id_n": socre_n  }}

❓Bugs or questions?

If you have any questions related to the code or the paper, feel free to email Hongjin (hjsu@cs.hku.hk), Howard (hyen@cs.princeton.edu), or Mengzhou (mengzhou@cs.princeton.edu). Please describe the problem in detail so we can help you better and more quickly.

Citation

If you find our work helpful, please cite us:

@misc{BRIGHT,
  title={BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval},
  author={Su, Hongjin and Yen, Howard and Xia, Mengzhou and Shi, Weijia and Muennighoff, Niklas and Wang, Han-yu and Liu, Haisu and Shi, Quan and Siegel, Zachary S and Tang, Michael and Sun, Ruoxi and Yoon, Jinsung and Arik, Sercan O and Chen, Danqi and Yu, Tao},
  url={https://arxiv.org/abs/2407.12883},
  year={2024},
}
