@MinishLab

The Minish Lab

Solving big problems with small models

We're a two-person open-source company (@pringled and @stephantul) focused on Natural Language Processing.

We believe that if you make models fast enough, you unlock new possibilities.

Using our software, you can:

  • Embed the entire English Wikipedia in 5 minutes
  • Classify tens of thousands of documents per second on CPU
  • Approximately deduplicate extremely large datasets in minutes
  • Build the fastest RAG application in the world
  • Easily evaluate which ANN algorithm works best for your data

Our projects:

  • model2vec: make tiny models that are still really, really good.
  • potion: the best small model in the world. 100-500x faster than a sentence-transformer, and almost as good.
  • vicinity: consistent interfaces to many approximate nearest neighbor algorithms.
  • semhash: lightning-fast, super accurate, approximate deduplication for your text datasets.
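
To make the ideas behind static embeddings and approximate deduplication more concrete, here is a minimal, standard-library-only sketch. It is not the API of model2vec, semhash, or vicinity: the `TOY_VECTORS` table, the function names, and the 0.9 threshold are all hypothetical, and the brute-force comparison stands in for the nearest-neighbor search a real tool would presumably use.

```python
import math

# Hypothetical toy lookup table (token -> vector). A real static embedding
# model learns these vectors; the values below are invented for illustration.
TOY_VECTORS = {
    "fast": [1.0, 0.0, 0.0],
    "quick": [0.9, 0.1, 0.0],
    "model": [0.0, 1.0, 0.0],
    "models": [0.0, 0.9, 0.1],
    "slow": [-1.0, 0.0, 0.0],
    "cat": [0.0, 0.0, 1.0],
}


def embed(text):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    vectors = [TOY_VECTORS[t] for t in text.lower().split() if t in TOY_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def deduplicate(texts, threshold=0.9):
    """Keep a text only if it is below the threshold against every kept text."""
    kept, kept_vectors = [], []
    for text in texts:
        vec = embed(text)
        if all(cosine(vec, other) < threshold for other in kept_vectors):
            kept.append(text)
            kept_vectors.append(vec)
    return kept


print(deduplicate(["fast model", "quick models", "slow cat"]))
```

Because embedding is just a table lookup and an average, it needs no neural network forward pass, which is the general reason static embeddings can run quickly on CPU. The pairwise loop here is quadratic in the number of texts, so it only illustrates the idea at toy scale.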

You can also find us on:

Pinned

  1. model2vec Public

    Fast State-of-the-Art Static Embeddings

    Python 1.1k 49

  2. semhash Public

    Fast Semantic Text Deduplication

    Python 580 25

  3. vicinity Public

    Lightweight Nearest Neighbors with Flexible Backends

    Python 259 8

  4. tokenlearn Public

    Pre-train Static Word Embeddings

    Python 49 3

Repositories

Showing 8 of 8 repositories
  • tokenlearn Public

    Pre-train Static Word Embeddings

    Python 49 MIT 3 2 1 Updated Mar 7, 2025
  • vicinity Public

    Lightweight Nearest Neighbors with Flexible Backends

    Python 259 MIT 8 1 0 Updated Mar 2, 2025
  • model2vec Public

    Fast State-of-the-Art Static Embeddings

    Python 1,100 MIT 49 4 3 Updated Mar 2, 2025
  • semhash Public

    Fast Semantic Text Deduplication

    Python 580 MIT 25 1 2 Updated Feb 28, 2025
  • .github Public

    Readme

    0 0 0 0 Updated Feb 15, 2025
  • minishlab.github.io
    SCSS 0 MIT 0 0 0 Updated Feb 6, 2025
  • watertemplate Public template

    Template

    Makefile 2 MIT 1 0 0 Updated Dec 9, 2024
  • evaluation Public

    Code to evaluate performance for embeddings

    Python 10 MIT 0 0 0 Updated Sep 25, 2024
