
AutoLLM/ArxivDigest


ArXiv Digest and Personalized Recommendations using Large Language Models

License

MIT license

Folders and files

.github/workflows
readme_images
src
.env.template
.gitignore
LICENSE
README.md
advanced_usage.md
config.yaml
requirements.txt


ArXiv Digest and Personalized Recommendations using Large Language Models.

This repo aims to provide a better daily digest for newly published arXiv papers based on your own research interests and natural-language descriptions, using relevancy ratings from GPT.

You can try it out on Hugging Face using your own OpenAI API key.

You can also create a daily subscription pipeline to email you the results.


🔍 What this repo does

Staying up to date on arXiv papers can take a considerable amount of time, with hundreds of new papers to filter through each day. There is an official daily digest service, but large categories like cs.AI still have 50-100 papers a day. Determining whether these papers are relevant and important to you means reading through each title and abstract, which is time-consuming.

This repository offers a method to curate a daily digest, sorted by relevance, using large language models. These models are conditioned based on your personal research interests, which are described in natural language.
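As a rough illustration of what conditioning a model on natural-language interests could look like, here is a minimal sketch that combines an interest statement with paper titles and abstracts into a single prompt. The function name `build_prompt`, the prompt wording, and the placeholder papers are all assumptions for illustration; the repository's actual prompt may differ.

```python
def build_prompt(interest: str, papers: list[dict]) -> str:
    """Combine an interest statement with paper abstracts into one prompt.

    Hypothetical helper: the wording below is illustrative, not the
    prompt used by this repository.
    """
    lines = [
        "You are helping a researcher triage new arXiv papers.",
        "The researcher describes their interests as follows:",
        interest.strip(),
        "",
        "Rate each paper below for relevance on a scale of 1-10.",
        "",
    ]
    # Number the papers so scores in the reply can be matched back to them.
    for i, paper in enumerate(papers, start=1):
        lines.append(f"{i}. Title: {paper['title']}")
        lines.append(f"   Abstract: {paper['abstract']}")
    return "\n".join(lines)


# Placeholder data, not real papers.
papers = [
    {"title": "Placeholder paper A", "abstract": "Placeholder abstract A."},
    {"title": "Placeholder paper B", "abstract": "Placeholder abstract B."},
]
prompt = build_prompt("Large language model pretraining", papers)
print(prompt)
```

The resulting string would then be sent to the language model of your choice.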

You modify the configuration file config.yaml with an arXiv Subject, some set of Categories, and a natural language statement about the type of papers you are interested in.
The code pulls all the abstracts for papers in those categories and ranks how relevant they are to your interest on a scale of 1-10 using gpt-3.5-turbo-16k.
The code then emits an HTML digest listing all the relevant papers, and optionally emails it to you using SendGrid. You will need to have a SendGrid account with an API key for this functionality to work.
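The ranking and digest steps above can be sketched as follows. This is a simplified illustration under stated assumptions: it assumes the model reply is a JSON list with one score per paper, and the relevance cutoff of 7, the function names `rank_papers` and `render_digest`, and the HTML layout are hypothetical rather than taken from the repository.

```python
import html
import json


def rank_papers(papers, scores, threshold=7):
    """Attach scores, drop papers below the cutoff, sort by descending score.

    The threshold of 7 is an assumed default, not the repository's value.
    """
    scored = [dict(paper, score=score) for paper, score in zip(papers, scores)]
    kept = [paper for paper in scored if paper["score"] >= threshold]
    return sorted(kept, key=lambda paper: paper["score"], reverse=True)


def render_digest(ranked):
    """Render the ranked papers as a small HTML ordered list."""
    items = [
        f"<li>{html.escape(paper['title'])} (relevance {paper['score']}/10)</li>"
        for paper in ranked
    ]
    return "<ol>\n" + "\n".join(items) + "\n</ol>"


# Placeholder papers and a hypothetical model reply (one score per paper).
papers = [{"title": "Paper A"}, {"title": "Paper B <draft>"}, {"title": "Paper C"}]
reply = "[4, 9, 7]"

ranked = rank_papers(papers, json.loads(reply))
digest = render_digest(ranked)
print(digest)
```

Escaping titles with `html.escape` keeps stray characters in a title from breaking the generated HTML.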

Testing it out with Hugging Face:

We provide a demo at https://huggingface.co/spaces/AutoLLM/ArxivDigest. Simply enter your OpenAI API key and then fill in the configuration on the right. Note that we do not store your key.

You can also send yourself an email of the digest by creating a SendGrid account and API key.

Some examples of results:

Digest Configuration:

Subject/Topic: Computer Science
Categories: Artificial Intelligence, Computation and Language
Interest:
- Large language model pretraining and finetunings
- Multimodal machine learning
- Do not care about specific application, for example, information extraction, summarization, etc.
- Not interested in paper focus on specific languages, e.g., Arabic, Chinese, etc.

Result:

Digest Configuration:

Subject/Topic: Quantitative Finance
Interest: "making lots of money"

Result:

💡 Usage

Running as a GitHub Action using SendGrid (Recommended)

The recommended way to get started using this repository is to:

Fork the repository
Modify config.yaml and merge the changes into your main branch.
Set the following secrets (under Settings, Secrets and variables, Repository secrets). See Advanced Usage for more details on how to create and get OpenAI and SendGrid API keys:
- OPENAI_API_KEY From OpenAI
- SENDGRID_API_KEY From SendGrid
- FROM_EMAIL This value must match the email you used to create the SendGrid API Key.
- TO_EMAIL
Manually trigger the action or wait until the scheduled action takes place.
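Before triggering the action, it can help to confirm that all four values are present. The sketch below is a hypothetical pre-flight check, assuming the secrets listed above are exposed to the script as environment variables with the same names; the helper `missing_secrets` is not part of the repository.

```python
import os

# Names taken from the list of secrets above.
REQUIRED = ("OPENAI_API_KEY", "SENDGRID_API_KEY", "FROM_EMAIL", "TO_EMAIL")


def missing_secrets(env=None):
    """Return the required names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


# Example with placeholder values: two of the four secrets are set.
example_env = {"OPENAI_API_KEY": "sk-placeholder", "TO_EMAIL": "me@example.com"}
missing = missing_secrets(example_env)
print("Missing:", missing)
```

Calling `missing_secrets()` with no argument checks the real process environment instead of the example mapping.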

See Advanced Usage for more details, including step-by-step images, further customization, and alternate usage.

Running with a user interface

To locally run the same UI as the Hugging Face space:

Install the requirements in src/requirements.txt as well as gradio.
Run python src/app.py and go to the local URL. From there you will be able to preview the papers from today, as well as the generated digests.
If you want to use a .env file for your secrets, you can copy .env.template to .env and then set the environment variables in .env.

Note: These files may be hidden by default in some operating systems due to the dot prefix.
The .env file is one of the files in .gitignore, so git does not track it and it will not be uploaded to the repository.
Do not edit the original .env.template with your keys or your email address, since .env.template is tracked by git and editing it might cause you to commit your secrets.

WARNING: Do not edit and commit your .env.template with your personal keys or email address! Doing so may expose these to the world!
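To show how values in a .env file end up as environment variables, here is a minimal standard-library sketch. It is an illustration only: the repository may load the file differently (for example with a dedicated library), and the helpers `parse_env_text` and `load_env_file` are hypothetical names that handle only simple `KEY=value` lines.

```python
import os


def parse_env_text(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path=".env"):
    """Load variables from `path` without overriding ones already set."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as handle:
        values = parse_env_text(handle.read())
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


# Placeholder contents, not real credentials.
sample = '# secrets\nOPENAI_API_KEY="sk-placeholder"\nTO_EMAIL=me@example.com\n'
parsed = parse_env_text(sample)
print(parsed)
```

Using `os.environ.setdefault` means values already set in your shell take precedence over those in the file.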

✅ Roadmap

Support personalized paper recommendation using LLM.
Send emails for daily digest.
Implement a ranking factor to prioritize content from specific authors.
Support open-source models, e.g., LLaMA, Vicuna, MPT etc.
Fine-tune an open-source model to better support paper ranking and stay updated with the latest research concepts.

💁 Extending and Contributing

You may (and are encouraged to) modify the code in this repository to suit your personal needs. If you think your modifications would be in any way useful to others, please submit a pull request.

These types of modifications include things like changes to the prompt, different language models, or additional ways for the digest to be delivered to you.
