Movatterモバイル変換


[0]ホーム

URL:


Skip to main contentLinkMenuExpand(external link)DocumentSearchCopyCopied

ONNX Runtime generate() API

Note: this API is in preview and is subject to change.

Run generative AI models with ONNX Runtime.

See the source code here:https://github.com/microsoft/onnxruntime-genai

This library provides the generative AI loop for ONNX models, including tokenization and other pre-processing, inference with ONNX Runtime, logits processing, search and sampling, and KV cache management.

Users can call a high levelgenerate() method, or run each iteration of the model in a loop, generating one token at a time, and optionally updating generation parameters inside the loop.

It has support for greedy/beam search and TopP, TopK sampling to generate token sequences and built-in logits processing like repetition penalties. You can also easily add custom scoring.

Other supported features include applying chat templates and structured output (for tool calling)


Table of contents



[8]ページ先頭

©2009-2025 Movatter.jp