huggingface/optimum

🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy-to-use hardware optimization tools

Optimum is an extension of Transformers 🤖 Diffusers 🧨 TIMM 🖼️ and Sentence-Transformers 🤗, providing a set of optimization tools and enabling maximum efficiency to train and run models on targeted hardware, while keeping things easy to use.

Installation

Optimum can be installed using pip as follows:

python -m pip install optimum

If you'd like to use the accelerator-specific features of Optimum, you can check the documentation and install the required dependencies according to the table below:

| Accelerator | Installation |
| --- | --- |
| ONNX | `pip install --upgrade --upgrade-strategy eager optimum[onnx]` |
| ONNX Runtime | `pip install --upgrade --upgrade-strategy eager optimum[onnxruntime]` |
| ONNX Runtime GPU | `pip install --upgrade --upgrade-strategy eager optimum[onnxruntime-gpu]` |
| Intel Neural Compressor | `pip install --upgrade --upgrade-strategy eager optimum[neural-compressor]` |
| OpenVINO | `pip install --upgrade --upgrade-strategy eager optimum[openvino]` |
| IPEX | `pip install --upgrade --upgrade-strategy eager optimum[ipex]` |
| NVIDIA TensorRT-LLM | `docker run -it --gpus all --ipc host huggingface/optimum-nvidia` |
| AMD Instinct GPUs and Ryzen AI NPU | `pip install --upgrade --upgrade-strategy eager optimum[amd]` |
| AWS Trainium & Inferentia | `pip install --upgrade --upgrade-strategy eager optimum[neuronx]` |
| Intel Gaudi Accelerators (HPU) | `pip install --upgrade --upgrade-strategy eager optimum[habana]` |
| FuriosaAI | `pip install --upgrade --upgrade-strategy eager optimum[furiosa]` |

The `--upgrade --upgrade-strategy eager` option is needed to ensure the different packages are upgraded to their latest possible versions.

To install from source:

python -m pip install git+https://github.com/huggingface/optimum.git

For the accelerator-specific features, install `optimum[accelerator_type]` from the repository instead, for example:

python -m pip install optimum[onnxruntime]@git+https://github.com/huggingface/optimum.git

Accelerated Inference

Optimum provides multiple tools to export and run optimized models on various ecosystems:

  • ONNX /ONNX Runtime, one of the most popular open formats for model export, and a high-performance inference engine for deployment.
  • OpenVINO, a toolkit for optimizing, quantizing and deploying deep learning models on Intel hardware.
  • ExecuTorch, PyTorch’s native solution for on-device inference across mobile and edge devices.
  • Intel Gaudi Accelerators enabling optimal performance on first-gen Gaudi, Gaudi2 and Gaudi3.
  • AWS Inferentia for accelerated inference on Inf2 and Inf1 instances.
  • NVIDIA TensorRT-LLM.

The export and optimizations can be done both programmatically and from the command line.

ONNX + ONNX Runtime

🚨🚨🚨 The ONNX integration was moved to optimum-onnx, so make sure to follow its installation instructions 🚨🚨🚨

Before you begin, make sure you have all the necessary libraries installed:

pip install --upgrade --upgrade-strategy eager optimum[onnx]

It is possible to export Transformers, Diffusers, Sentence Transformers and TIMM models to the ONNX format and easily perform graph optimization as well as quantization.

For more information on the ONNX export, please check the documentation.

Once the model is exported to the ONNX format, we provide Python classes enabling you to run the exported ONNX model seamlessly using ONNX Runtime as the backend.

For this, make sure you have ONNX Runtime installed; for more information, check out the installation instructions.

More details on how to run ONNX models with the `ORTModelForXXX` classes can be found here.

Intel (OpenVINO + Neural Compressor + IPEX)

Before you begin, make sure you have all the necessary libraries installed.

You can find more information on the different integrations in our documentation and in the examples of optimum-intel.

ExecuTorch

Before you begin, make sure you have all the necessary libraries installed:

pip install optimum-executorch@git+https://github.com/huggingface/optimum-executorch.git

Users can export Transformers models to ExecuTorch and run inference on edge devices within PyTorch's ecosystem.

For more information about exporting Transformers models to ExecuTorch, please check the documentation for Optimum-ExecuTorch.

Quanto

Quanto is a PyTorch quantization backend which allows you to quantize a model either using the Python API or the `optimum-cli`.

You can see more details and examples in the Quanto repository.

Accelerated training

Optimum provides wrappers around the original Transformers `Trainer` to enable training on powerful hardware easily. We support many providers:

Intel Gaudi Accelerators

Before you begin, make sure you have all the necessary libraries installed:

pip install --upgrade --upgrade-strategy eager optimum[habana]

You can find examples in the documentation and in the examples.

AWS Trainium

Before you begin, make sure you have all the necessary libraries installed:

pip install --upgrade --upgrade-strategy eager optimum[neuronx]

You can find examples in the documentation and in the tutorials.

