Add `Dense` layer in `2_Dense/` modules #660


Open

alvarobartt wants to merge 15 commits into `main` from `add-dense`

Conversation

alvarobartt (Member) commented on Jun 26, 2025 (edited)

What does this PR do?

This PR adds support for `2_Dense/` modules, since some models, e.g. https://huggingface.co/sentence-transformers/LaBSE, require the extra `Dense` module, i.e. an extra `Linear` layer on top of the pooled embeddings, when generating the embeddings.

To that end, this PR introduces the `DenseLayer` trait, implements it for `Dense`, and adds `DenseConfig`; these are modules with essentially a single `Linear` layer, pulling their configuration from `2_Dense/config.json` and their weights from `2_Dense/model.safetensors`.
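For intuition, what a `2_Dense/` module computes can be sketched in NumPy (a toy illustration, not this PR's Rust implementation; the sizes and values below are made up, and the Tanh activation mirrors what LaBSE's `2_Dense/config.json` specifies):

```python
# Hedged sketch: a 2_Dense/ module is conceptually a single Linear layer
# (weights normally loaded from 2_Dense/model.safetensors, shapes from
# 2_Dense/config.json) applied to the pooled embedding, plus an activation.
import numpy as np

rng = np.random.default_rng(0)
in_features, out_features = 4, 3  # toy dimensions, not LaBSE's real sizes

weight = rng.standard_normal((out_features, in_features)).astype(np.float32)
bias = np.zeros(out_features, dtype=np.float32)

pooled = rng.standard_normal(in_features).astype(np.float32)  # pooled embedding
dense_out = np.tanh(weight @ pooled + bias)  # Linear + Tanh activation

assert dense_out.shape == (out_features,)
assert np.all(np.abs(dense_out) <= 1.0)  # tanh output is bounded
```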

Note

The `2_Dense/` module is only required when generating embeddings, meaning it only applies to the `Embedding` model type; the `Reranker` and `Classifier` types are not affected by this addition, and neither are the `rank` or `predict` methods of the given backend.

This PR solves the issue recently reported at https://discuss.huggingface.co/t/inference-result-not-aligned-with-local-version-of-same-model-and-revision/160514.

Additionally, this PR also fixes a shape mismatch produced when performing matrix multiplication of 2D tensors on Metal devices, due to the `candle` Metal kernels expecting the tensors to be contiguous. The error only seems to arise on Metal for 2D tensors, whereas e.g. 3D tensors work fine without calling `.contiguous()` (which is expensive, as it needs to clone the tensor).
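The contiguity requirement here is analogous to NumPy's C-contiguity: a transposed array is a strided view, and materializing it as contiguous memory costs a copy (a NumPy analogy, not the `candle` code):

```python
# NumPy analogy for the Metal contiguity issue: a transposed array is a
# non-contiguous view, and making it contiguous requires a real copy, which
# is why calling .contiguous() unconditionally would be expensive.
import numpy as np

a = np.ones((4, 8), dtype=np.float32)
b = a.T                      # strided view, no copy
assert not b.flags["C_CONTIGUOUS"]

c = np.ascontiguousarray(b)  # explicit copy, akin to Tensor::contiguous()
assert c.flags["C_CONTIGUOUS"]
assert c.flags["OWNDATA"]    # c owns fresh memory; the copy was real
```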

Reproduce

To ensure that the implementation works and produces correct results, i.e. that `allclose`-style checks pass and the cosine similarity is 1.0 (or as close as possible), the following test has been run:

  1. Deploy Text Embeddings Inference (TEI), e.g.:

cargo run --release --features candle,http --no-default-features -- --model-id sentence-transformers/LaBSE --dtype float16

  2. Once it's running, run the following Python script (requires `torch`, `transformers`, `sentence-transformers`, `accelerate` and `numpy`):
import numpy as np
import requests
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "sentence-transformers/LaBSE",
    model_kwargs={
        "torch_dtype": "float16",
        "device_map": "mps",
    },
)
out_py = model.encode(
    "What is Deep Learning?",
    normalize_embeddings=True,
    convert_to_numpy=True,
)

response = requests.post(
    "http://localhost:3000/embed",
    json={
        "inputs": "What is Deep Learning?",
        "normalize": True,
    },
)
response.raise_for_status()

out = response.json()[0]
out_http = np.array(out, dtype=np.float16)

print(f"Embeddings are close: {np.allclose(out_py, out_http, atol=1e-3, rtol=1e-4)=}")

def cosine_similarity(x: np.ndarray, y: np.ndarray) -> float:
    dot_product = np.dot(x, y)
    norm_x = np.linalg.norm(x)
    norm_y = np.linalg.norm(y)
    return dot_product / (norm_x * norm_y)

print(f"The similarity score is: {cosine_similarity(out_py, out_http)=}")

It should produce the following on any combination of device (CPU, MPS, CUDA) and dtype (float32, float16):

Embeddings are close: np.allclose(out_py, out_http, atol=1e-3, rtol=1e-4)=True
The similarity score is: cosine_similarity(out_py, out_http)=np.float16(1.0)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@Narsil

Apparently, `candle` expects the tensors to be contiguous on Metal when performing 2D matrix multiplication
alvarobartt requested a review from Narsil on Jun 26, 2025 16:26
@kozistr (Contributor) commented:

@alvarobartt Hi! Just for your reference (you may already be aware): the Stella v5 model uses an `Identity` layer as its activation function for `2_Dense`! See `2_Dense/config.json`.
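A hedged sketch of how the `activation_function` field in `2_Dense/config.json` could be dispatched (the class paths below match what sentence-transformers writes for these models, but the mapping itself is illustrative, not this PR's Rust code):

```python
# Illustrative dispatch on the activation_function field of 2_Dense/config.json:
# LaBSE's config names a Tanh activation, while Stella v5's names Identity.
import numpy as np

ACTIVATIONS = {
    "torch.nn.modules.activation.Tanh": np.tanh,
    "torch.nn.modules.linear.Identity": lambda x: x,
}

stella_cfg = {"activation_function": "torch.nn.modules.linear.Identity"}
act = ACTIVATIONS[stella_cfg["activation_function"]]

x = np.array([0.5, -1.0], dtype=np.float32)
assert np.array_equal(act(x), x)  # Identity leaves the embedding untouched
```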


@Narsil (Collaborator) left a comment:

Looks good, I think we can simplify the parsing part a bit.

If `--dense-path` was not allowed, that would prevent users from using other `Dense` layers when available, as per e.g. https://huggingface.co/NovaSearch/stella_en_400M_v5, which contains different directories for different `Dense` layers with different output vector dimensionality, as `2_Dense_<dims>/`.
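One possible shape for that selection logic, as a hypothetical Python sketch (`resolve_dense_dir` and its fallback behavior are illustrative only, not the PR's actual `--dense-path` handling):

```python
# Hypothetical sketch of resolving which Dense directory to load: honor an
# explicit --dense-path-style request if present, otherwise fall back to the
# default 2_Dense/ when the repository ships one. Names are illustrative.
def resolve_dense_dir(available, requested=None):
    if requested is not None:
        return requested if requested in available else None
    return "2_Dense" if "2_Dense" in available else None

# LaBSE-style repo: the single 2_Dense/ module is picked by default.
assert resolve_dense_dir(["2_Dense"]) == "2_Dense"
# Stella-style repo: dimension-specific folders must be requested explicitly.
assert resolve_dense_dir(["2_Dense_1024", "2_Dense_2048"]) is None
assert resolve_dense_dir(["2_Dense_1024", "2_Dense_2048"], "2_Dense_1024") == "2_Dense_1024"
```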
alvarobartt changed the title from "Add `Dense`, `DenseLayer` and `DenseConfig` to handle `2_Dense/`" to "Add `Dense` layer in `2_Dense/` modules" on Jul 2, 2025

alvarobartt marked this pull request as ready for review on Jul 3, 2025 08:57
Reviewers

Narsil left review comments. At least 1 approving review is required to merge this pull request.

3 participants: @alvarobartt, @kozistr, @Narsil
