AlekseyKorshuk/optimum-transformers


Accelerated NLP pipelines for fast inference 🚀 on CPU and GPU. Built with 🤗 Transformers, Optimum and ONNX Runtime.

Installation:

With PyPI:

pip install optimum-transformers

Or directly from GitHub:

pip install git+https://github.com/AlekseyKorshuk/optimum-transformers

Usage:

The pipeline API is similar to transformers' `pipeline`, with just a few differences, which are explained below.

Just provide the path/URL to the model, and it will download the model if needed from the hub, automatically create the ONNX graph, and run inference.

```python
from optimum_transformers import pipeline

# Initialize a pipeline by passing the task name and
# set use_onnx to True (the default value is also True)
nlp = pipeline("sentiment-analysis", use_onnx=True)
nlp("Transformers and onnx runtime is an awesome combo!")
# [{'label': 'POSITIVE', 'score': 0.999721109867096}]
```

Or provide a different model using the `model` argument.

```python
from optimum_transformers import pipeline

nlp = pipeline("question-answering", model="deepset/roberta-base-squad2", use_onnx=True)
nlp(question="What is ONNX Runtime?",
    context="ONNX Runtime is a highly performant single inference engine for multiple platforms and hardware")
# {'answer': 'highly performant single inference engine for multiple platforms and hardware',
#  'end': 94, 'score': 0.751201868057251, 'start': 18}
```
```python
from optimum_transformers import pipeline

nlp = pipeline("ner", model="mys/electra-base-turkish-cased-ner", use_onnx=True,
               optimize=True, grouped_entities=True)
nlp("adana kebap ülkemizin önemli lezzetlerinden biridir.")
# [{'entity_group': 'B-food', 'score': 0.869149774312973, 'word': 'adana kebap'}]
```

Set `use_onnx` to `False` for standard torch inference. Set `optimize` to `True` to quantize with ONNX (this requires `use_onnx=True`).

Supported pipelines

You can create `Pipeline` objects for the following downstream tasks:

  • `feature-extraction`: Generates a tensor representation for the input sequence.
  • `ner` and `token-classification`: Generates a named-entity mapping for each word in the input sequence.
  • `sentiment-analysis`: Gives the polarity (positive/negative) of the whole input sequence. Can be used for any text classification model.
  • `question-answering`: Given some context and a question referring to the context, extracts the answer to the question from the context.
  • `text-classification`: Classifies sequences according to a given number of classes learned in training.
  • `zero-shot-classification`: Classifies sequences according to a given set of classes supplied at runtime.
  • `fill-mask`: Masks tokens in a sequence with a masking token and prompts the model to fill the mask with an appropriate token.
  • `text-generation`: Generates text continuing the provided prompt.

Calling the pipeline for the first time loads the model, creates the ONNX graph, and caches it for future use. Because of this, the first load will take some time. Subsequent calls with the same model load the ONNX graph automatically from the cache.
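The cache-on-first-call behavior can be illustrated in miniature with the standard library alone. This is only a sketch: `functools.lru_cache` stands in for the library's on-disk ONNX graph cache, and `time.sleep` fakes the export cost; none of this is optimum-transformers' actual mechanism.

```python
import time
from functools import lru_cache

@lru_cache(maxsize=None)
def load_model(name: str) -> str:
    # Stand-in for downloading the model and exporting the ONNX graph.
    time.sleep(0.2)  # simulate the expensive first-time export
    return f"onnx-graph-for-{name}"

start = time.perf_counter()
load_model("sentiment-analysis")
first = time.perf_counter() - start

start = time.perf_counter()
load_model("sentiment-analysis")
cached = time.perf_counter() - start

print(f"first load: {first:.3f}s, cached: {cached:.6f}s")
assert cached < first  # the second call skips the export entirely
```

The same trade-off applies to the real pipelines: you pay the export cost once per model, then every later call is pure inference.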

Benchmarks

Note: For some reason, ONNX is slow in Colab notebooks, so you won't notice any speed-up there. Benchmark it on your own hardware.

Check out our benchmarking example.

For detailed benchmarks and other information refer to this blog post and notebook.

Note: These results were collected on my local machine. If you have a high-performance machine to benchmark on, please contact me.
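A minimal sketch of how you might time a pipeline yourself. The `benchmark` helper and the `toy_workload` function below are hypothetical stand-ins, not part of the library; in practice you would pass your actual `pipeline(...)` callables (one with `use_onnx=True`, one with `use_onnx=False`) and compare the means.

```python
import statistics
import time

def benchmark(fn, *args, repeats: int = 10) -> float:
    """Return the mean wall-clock time of fn(*args) over several runs."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        times.append(time.perf_counter() - start)
    return statistics.mean(times)

# Placeholder workload standing in for a real pipeline call.
def toy_workload(text):
    return sum(ord(c) for c in text)

mean_s = benchmark(toy_workload, "Transformers and onnx runtime is an awesome combo!")
print(f"mean latency: {mean_s * 1e6:.1f} µs")
```

Run the warm-up call before timing (see the caching note above), so the one-time ONNX export does not skew the mean.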

Benchmark plots are available for the following pipelines:

  • `sentiment-analysis`
  • `zero-shot-classification`
  • `token-classification`
  • `question-answering`
  • `fill-mask`

About

Built by Aleksey Korshuk


🚀 If you want to contribute to this project or create something cool together, contact me!

Star this repository!
