Fixing block size for Mistral-7B. #141


Open
Artyom17 wants to merge 3 commits into meta-pytorch:main from SesameAILabs:art/fix-mistral

Conversation

@Artyom17
Contributor

According to Mistral's paper, the block size for Mistral-7B should be 8192 (refs: https://arxiv.org/pdf/2310.06825.pdf, https://huggingface.co/docs/transformers/en/model_doc/mistral), but it is currently set to the default value (2048).
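A minimal sketch of the kind of change being described, assuming a gpt-fast-style `ModelArgs` config registry; every field name and value here other than `block_size=8192` is illustrative and not taken from the PR diff:

```python
# Illustrative sketch only, assuming a gpt-fast-style config registry;
# fields other than block_size are placeholders, not the PR's actual diff.
from dataclasses import dataclass


@dataclass
class ModelArgs:
    block_size: int = 2048  # default context length; Mistral-7B previously inherited this
    n_layer: int = 32
    n_head: int = 32
    dim: int = 4096
    vocab_size: int = 32000


transformer_configs = {
    # Set the Mistral-7B block size to 8192, per the Mistral paper / HF docs,
    # instead of falling back to the 2048 default above.
    "Mistral-7B": dict(block_size=8192, n_layer=32, n_head=32, dim=4096),
}

mistral_args = ModelArgs(**transformer_configs["Mistral-7B"])
assert mistral_args.block_size == 8192
```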

@facebook-github-bot added the CLA Signed label on Mar 19, 2024
@Artyom17
Contributor, Author

It also saves some memory on the 'freq_cis' tensor when a large block_size is used with a relatively small max_seq_length.
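A rough illustration of that memory point, under my own assumptions rather than the PR's code: the generic RoPE helper below shows how a frequency cache sized by max_seq_length allocates far fewer positions than one sized by an 8192-token block_size.

```python
# Rough illustration only; precompute_freqs_cis here is a generic RoPE helper,
# not the repository's actual implementation.
import torch


def precompute_freqs_cis(seq_len: int, head_dim: int, base: float = 10000.0) -> torch.Tensor:
    # One complex rotation per (position, frequency pair): shape [seq_len, head_dim // 2].
    freqs = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    angles = torch.outer(torch.arange(seq_len).float(), freqs)
    return torch.polar(torch.ones_like(angles), angles)


block_size, max_seq_length, head_dim = 8192, 2048, 128

sized_by_block = precompute_freqs_cis(block_size, head_dim)    # 8192 x 64 complex64
sized_by_seq = precompute_freqs_cis(max_seq_length, head_dim)  # 2048 x 64 complex64

# The cache sized by max_seq_length is 4x smaller in this example (~1 MB vs ~4.2 MB).
print(sized_by_block.nelement() * sized_by_block.element_size())
print(sized_by_seq.nelement() * sized_by_seq.element_size())
```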


Reviewers

No reviews

Assignees

No one assigned

Labels

CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)

Projects

None yet

Milestone

No milestone

Development


2 participants

@Artyom17, @facebook-github-bot
