Add similarity service for gRPC #346


Open
kozistr wants to merge 3 commits into huggingface:main from kozistr:feature/similarity-endpoint-for-grpc

Conversation

kozistr
Contributor

What does this PR do?

Implement the Similarity service for gRPC.

  • I named the field distances, following the naming used here. similarities might be a more fitting name.
  • SimilarityStreamRequest accepts only one pair (source_sentence & sentence), in the same manner as RerankStreamRequest.
  • I'm not sure my implementation of the SimilarityStream rpc is correct. For now, the closure similarity_inner just infers source_sentence and sentence sequentially. There may be a more efficient approach. Any feedback is welcome : )
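As a rough illustration of the sequential-versus-concurrent question raised in the last bullet, here is a minimal Python sketch (not the Rust code in this PR). It assumes the score is cosine similarity, which the PR description does not state, and it uses a hypothetical `toy_embed` coroutine as a stand-in for the model backend. The only point is that the two embedding calls for a pair can be scheduled together instead of awaited one after the other.

```python
import asyncio
import math
from typing import Awaitable, Callable, List, Sequence

# Hypothetical type for an async embedding function (not a TEI API).
Embedder = Callable[[str], Awaitable[Sequence[float]]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


async def similarity_pair(embed: Embedder, source_sentence: str, sentence: str) -> float:
    """Score one (source_sentence, sentence) pair, as one stream request would carry."""
    # Schedule both embedding calls at once rather than awaiting them in sequence.
    source_vec, sentence_vec = await asyncio.gather(
        embed(source_sentence), embed(sentence)
    )
    return cosine_similarity(source_vec, sentence_vec)


async def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for the backend: counts of a few letters."""
    await asyncio.sleep(0)  # yield control, as a real backend call would
    return [float(text.lower().count(c)) for c in "aeilnr"]


if __name__ == "__main__":
    score = asyncio.run(
        similarity_pair(toy_embed, "What is Deep Learning", "What is Machine Learning")
    )
    print(f"{score:.4f}")
```

Whether this helps in the actual server depends on how the backend queues and batches requests, so treat it as a starting point for discussion rather than a recommendation.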

Logs

Server

2024-07-16T13:46:57.285050Z  INFO text_embeddings_router: router/src/main.rs:175: Args { model_id: "./mul*********-**-**rge", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: false, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "0.0.0.0", port: 8080, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: None, payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2024-07-16T13:46:57.737392Z  INFO text_embeddings_router: router/src/lib.rs:199: Maximum number of tokens per request: 512
2024-07-16T13:46:57.741260Z  INFO text_embeddings_core::tokenization: core/src/tokenization.rs:28: Starting 4 tokenization workers
2024-07-16T13:46:58.614438Z  INFO text_embeddings_router: router/src/lib.rs:241: Starting model backend
2024-07-16T13:47:29.929738Z  WARN text_embeddings_router: router/src/lib.rs:267: Backend does not support a batch size > 8
2024-07-16T13:47:29.929764Z  WARN text_embeddings_router: router/src/lib.rs:268: forcing `max_batch_requests=8`
2024-07-16T13:47:29.932608Z  INFO text_embeddings_router::grpc::server: router/src/grpc/server.rs:1810: Serving Prometheus metrics: 0.0.0.0:9000
2024-07-16T13:47:29.938856Z  INFO text_embeddings_router::grpc::server: router/src/grpc/server.rs:1954: Starting gRPC server: 0.0.0.0:8080
2024-07-16T13:47:29.938884Z  INFO text_embeddings_router::grpc::server: router/src/grpc/server.rs:1955: Ready
2024-07-16T13:58:42.515709Z  INFO similarity{compute_chars=75 compute_tokens=21 total_time="104.540284ms" tokenization_time="151.525µs" queue_time="294.65µs" inference_time="103.940484ms"}: text_embeddings_router::grpc::server: router/src/grpc/server.rs:1507: Success

Client

$ grpcurl -d '{"source_sentence": "What is Deep Learning", "sentences": ["What is Machine Learning", "asdf", "hello"]}' -plaintext 0.0.0.0:8080 tei.v1.Similarity/Similarity
{
  "distances": [
    0.927782,
    0.7332565,
    0.7520622
  ],
  "metadata": {
    "computeChars": 75,
    "computeTokens": 21,
    "totalTimeNs": "104551084",
    "tokenizationTimeNs": "151525",
    "queueTimeNs": "294650",
    "inferenceTimeNs": "103940484"
  }
}
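To show how a client might consume this response, here is a small Python sketch that pairs each returned score with its input sentence and converts the nanosecond timings to milliseconds. It works on the JSON printed above rather than calling the server; the helper names `rank_sentences` and `ns_to_ms` are made up for illustration. It assumes that higher values mean more similar, which is what the sample output suggests despite the field being named distances.

```python
import json

# Request payload and response body copied from the grpcurl call above.
request = {
    "source_sentence": "What is Deep Learning",
    "sentences": ["What is Machine Learning", "asdf", "hello"],
}
response_text = """
{
  "distances": [0.927782, 0.7332565, 0.7520622],
  "metadata": {
    "computeChars": 75,
    "computeTokens": 21,
    "totalTimeNs": "104551084",
    "tokenizationTimeNs": "151525",
    "queueTimeNs": "294650",
    "inferenceTimeNs": "103940484"
  }
}
"""
response = json.loads(response_text)


def rank_sentences(sentences, scores):
    """Pair each sentence with its score and sort from highest to lowest score."""
    if len(sentences) != len(scores):
        raise ValueError("expected one score per sentence")
    return sorted(zip(sentences, scores), key=lambda pair: pair[1], reverse=True)


def ns_to_ms(value: str) -> float:
    """The *Ns fields appear as JSON strings in the output; convert them to ms."""
    return int(value) / 1_000_000


ranked = rank_sentences(request["sentences"], response["distances"])
timings_ms = {
    key: ns_to_ms(value)
    for key, value in response["metadata"].items()
    if key.endswith("Ns")
}

if __name__ == "__main__":
    for sentence, score in ranked:
        print(f"{score:.4f}  {sentence}")
    print(timings_ms)
```

The scores come back in the same order as the input sentences, so the pairing relies only on position.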

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@OlivierDehaene OR @Narsil
