Using BERTopic at Hugging Face

BERTopic is a topic modeling framework that leverages 🤗 transformers and c-TF-IDF to create dense clusters allowing for easily interpretable topics whilst keeping important words in the topic descriptions.

BERTopic supports all kinds of topic modeling techniques:

Guided, Supervised, Semi-supervised
Manual, Multi-topic distributions, Hierarchical
Class-based, Dynamic, Online/Incremental
Multimodal, Multi-aspect, Text Generation/LLM
Zero-shot (new!), Merge Models (new!), Seed Words (new!)

Exploring BERTopic on the Hub

You can find BERTopic models by filtering on the left of the models page.

BERTopic models hosted on the Hub have a model card with useful information about the models. Thanks to BERTopic's Hugging Face Hub integration, you can load BERTopic models with a few lines of code. You can also deploy these models using Inference Endpoints.

Installation

To get started, you can follow the BERTopic installation guide. You can also use the following one-line install through pip:

pip install bertopic

Using Existing Models

All BERTopic models can easily be loaded from the Hub:

from bertopic import BERTopic

topic_model = BERTopic.load("MaartenGr/BERTopic_Wikipedia")

Once loaded, you can use BERTopic’s features to predict the topics for new instances:

# transform returns one topic per input document, so take the first entry
topics, probs = topic_model.transform("This is an incredible movie!")
topic_model.topic_labels_[topics[0]]

Which gives us the following topic:

64_rating_rated_cinematography_film
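The label above appears to follow the pattern of a topic ID followed by the topic's top keywords, joined by underscores. As a minimal sketch, assuming that default-style format and keywords that contain no underscores themselves, a label can be split back into its parts. The helper name `parse_topic_label` is hypothetical and not part of BERTopic:

```python
def parse_topic_label(label):
    """Split a default-style label into (topic_id, keywords).

    Assumes the form "<id>_<word>_<word>_..."; custom labels may not match.
    """
    topic_id, _, rest = label.partition("_")
    keywords = rest.split("_") if rest else []
    return int(topic_id), keywords


topic_id, keywords = parse_topic_label("64_rating_rated_cinematography_film")
print(topic_id)
print(keywords)
```

If a model uses custom labels, this parsing will not apply, and the keywords are better read from the model itself.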

Sharing Models

When you have created a BERTopic model, you can easily share it with others through the Hugging Face Hub. To do so, we can make use of the push_to_hf_hub function that allows us to directly push the model to the Hugging Face Hub:

from bertopic import BERTopic

# Train model
topic_model = BERTopic().fit(my_docs)

# Push to Hugging Face Hub
topic_model.push_to_hf_hub(
    repo_id="MaartenGr/BERTopic_ArXiv",
    save_ctfidf=True
)

Note that the saved model does not include the dimensionality reduction and clustering algorithms. Those are removed since they are only necessary to train the model and find relevant topics. Inference is done through a straightforward cosine similarity between the topic and document embeddings. This not only speeds up the model but allows us to have a tiny BERTopic model that we can work with.
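To make the inference step concrete, here is a minimal sketch of assigning a document to its most similar topic by cosine similarity. It is not BERTopic's actual implementation: the function names and the toy 3-dimensional embeddings are made up for illustration, and real embeddings come from the model's embedding backend.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def assign_topic(doc_embedding, topic_embeddings):
    """Return (topic_id, similarity) for the most similar topic."""
    scores = {
        topic_id: cosine_similarity(doc_embedding, embedding)
        for topic_id, embedding in topic_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy embeddings (hypothetical values, for illustration only)
topic_embeddings = {
    0: [1.0, 0.0, 0.0],
    1: [0.0, 1.0, 0.0],
    2: [0.0, 0.6, 0.8],
}
doc_embedding = [0.1, 0.5, 0.9]

best_topic, score = assign_topic(doc_embedding, topic_embeddings)
print(best_topic, round(score, 3))
```

Because only the topic embeddings are needed for this comparison, the dimensionality reduction and clustering steps play no part at prediction time.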
