Separate embedding kwargs into init kwargs and encode kwargs #1555


Conversation

tomaarsen
Contributor

Resolves #1169

Hello!

Pull Request overview

  • Separate embedding kwargs into init kwargs and encode kwargs
  • Introduces support for custom code models via trust_remote_code (e.g. pgml.embed trust_remote_code #1169)
  • Introduces support for private models via token (previously only possible via an environment variable, which FYI is still the recommended approach for security)
  • Introduces support for Matryoshka models such as this Vietnamese one, which was trained such that embeddings can be truncated to smaller sizes with minimal performance loss & much faster retrieval, via truncate_dim.
  • Introduces advanced loading support via model_kwargs/tokenizer_kwargs/config_kwargs. The first is the most useful for inference, e.g. allowing models to be loaded in lower precision for faster inference: model_kwargs={"torch_dtype": "bfloat16"}. See the usage sketch after this list.
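
To make the new init kwargs concrete, here is a minimal usage sketch against Sentence Transformers v3 directly (not via pgml.embed); the model name is a placeholder and, as noted below, I haven't run this snippet:

```python
from sentence_transformers import SentenceTransformer

# Placeholder model name; truncate_dim and model_kwargs require
# sentence-transformers >= 3.0.0.
model = SentenceTransformer(
    "my-org/my-matryoshka-model",
    trust_remote_code=True,                    # allow custom modeling code from the Hub
    truncate_dim=256,                          # keep only the first 256 embedding dimensions
    model_kwargs={"torch_dtype": "bfloat16"},  # load the weights in lower precision
)
embeddings = model.encode(["An example sentence."])
print(embeddings.shape)  # (1, 256) because of truncate_dim
```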

Details

This PR splits kwargs in pgml.embed into two types of kwargs: those for model = SentenceTransformer(..., **kwargs) and those for model.encode(..., **kwargs). This is currently done using a simple filter that checks for kwargs that are only (e.g. trust_remote_code) or primarily (e.g. device) relevant for the initialization.
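
For illustration, the split boils down to something like the following simplified sketch (this is not the exact code in this PR, and the key set here is illustrative, not exhaustive):

```python
# Simplified sketch of the init/encode kwargs split; not the PR's actual code.
# Kwargs whose names appear in INIT_KWARGS go to the SentenceTransformer
# constructor, everything else goes to model.encode().
INIT_KWARGS = {
    "device", "trust_remote_code", "token", "truncate_dim",
    "model_kwargs", "tokenizer_kwargs", "config_kwargs",
}

def split_kwargs(kwargs: dict) -> tuple[dict, dict]:
    init_kwargs = {k: v for k, v in kwargs.items() if k in INIT_KWARGS}
    encode_kwargs = {k: v for k, v in kwargs.items() if k not in INIT_KWARGS}
    return init_kwargs, encode_kwargs

init_kwargs, encode_kwargs = split_kwargs({"trust_remote_code": True, "batch_size": 32})
# init_kwargs   -> {"trust_remote_code": True}
# encode_kwargs -> {"batch_size": 32}
```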

I want to give a big preface that I have not tested this (!). I'm afraid my bandwidth is a bit too limited this week for that. Another note is that model_kwargs/tokenizer_kwargs/config_kwargs and truncate_dim were only introduced in Sentence Transformers v3.0.0, whereas this project still seems to be on v2.7. (FYI: ST v3.0 does not introduce breaking changes for inference, so upgrading should be safe.)

  • Tom Aarsen

@montanalow self-requested a review on July 12, 2024 14:41
@montanalow force-pushed the sentence_transformers_init_kwargs branch 2 times, most recently from 470f2d3 to 18be006 on July 12, 2024 14:58
@montanalow force-pushed the sentence_transformers_init_kwargs branch from 18be006 to 465f38d on July 12, 2024 14:59
@montanalow
Contributor

Thanks for the PR. I've added our embedding tests to CI, since we generally don't run the whole transformers suite due to model download times. Confirmed that the trust_remote_code flag now works as expected.

@montanalow merged commit debd9ae into postgresml:master on Jul 12, 2024
@tomaarsen
Contributor, Author

Excellent, thank you for merging & writing some simple tests.

  • Tom Aarsen

