MinishLab

We're a two-person (@pringled and @stephantul) open-source lab, with a focus on Natural Language Processing.

We believe that if you make models fast enough, you unlock new possibilities.

Using our software, you can:

  • Embed the entire English Wikipedia in 5 minutes
  • Classify tens of thousands of documents per second on a CPU
  • Approximately deduplicate extremely large datasets in minutes
  • Build the fastest RAG application in the world
  • Easily evaluate which ANN algorithm works best for your data

Our projects:

  • model2vec: tiny static embedding models with state-of-the-art performance.
  • potion: the best small models in the world. 100-500x faster than a sentence-transformer, and almost as good.
  • vicinity: consistent interfaces to many approximate nearest neighbor algorithms.
  • semhash: lightning-fast, highly accurate semantic deduplication and filtering for your text datasets.
  • model2vec-rs: a Rust port of model2vec.
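To make the terms "static embedding" and "semantic deduplication" concrete, here is a toy sketch in plain Python. It is not the model2vec or semhash implementation or API: the vocabulary, the vectors, the `embed` and `deduplicate` helpers, and the 0.95 threshold are all hypothetical and chosen only for illustration. It shows the general idea of embedding a text by averaging fixed per-token vectors, then dropping texts whose embedding is nearly identical to one already kept.

```python
import math

# Hypothetical toy vocabulary: each token maps to a fixed 3-dimensional vector.
# Real static embedding models ship a far larger lookup table; these numbers
# are made up for illustration.
TOKEN_VECTORS = {
    "fast": [0.9, 0.1, 0.0],
    "quick": [0.8, 0.2, 0.0],
    "model": [0.1, 0.9, 0.1],
    "models": [0.1, 0.8, 0.2],
    "banana": [0.0, 0.1, 0.9],
}
DIM = 3


def embed(text):
    """Embed text as the mean of its known token vectors (zeros if none)."""
    vectors = [TOKEN_VECTORS[t] for t in text.lower().split() if t in TOKEN_VECTORS]
    if not vectors:
        return [0.0] * DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def deduplicate(texts, threshold=0.95):
    """Greedily keep a text only if it is below the threshold to every kept text."""
    kept, kept_vectors = [], []
    for text in texts:
        vector = embed(text)
        if all(cosine(vector, other) < threshold for other in kept_vectors):
            kept.append(text)
            kept_vectors.append(vector)
    return kept


docs = ["fast model", "quick models", "banana"]
unique_docs = deduplicate(docs)
print(unique_docs)
```

Because embedding here is only a table lookup and an average, no neural network runs at inference time, which is where the speed of static models comes from. The pairwise comparison in this sketch is quadratic; at scale it would be replaced by an approximate nearest neighbor index.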

You can also find us on:

Pinned

  1. model2vec (Public)

    Fast State-of-the-Art Static Embeddings

    Python, 1.8k stars, 93 forks

  2. semhash (Public)

    Fast Semantic Text Deduplication & Filtering

    Python, 761 stars, 44 forks

  3. vicinity (Public)

    Lightweight Nearest Neighbors with Flexible Backends

    Python, 294 stars, 8 forks

  4. tokenlearn (Public)

    Pre-train Static Word Embeddings

    Python, 84 stars, 8 forks

  5. model2vec-rs (Public)

    Official Rust Implementation of Model2Vec

    Rust, 122 stars, 5 forks

Repositories

Showing 10 of 10 repositories


