generated from aarnphm/bazix

Pybind11 bindings for Whisper.cpp


aarnphm/whispercpp


Pybind11 bindings for whisper.cpp

Quickstart

Install with pip:

pip install whispercpp

NOTE: On platforms that don't have a prebuilt wheel, the build sets up a hermetic toolchain. This means you don't have to set up anything yourself to install the Python package, but the installation takes a bit longer. Pass -vv to pip to see the progress.

To use the latest version, install from source:

pip install git+https://github.com/aarnphm/whispercpp.git -vv

For local setup, initialize all submodules:

git submodule update --init --recursive

Build the wheel:

# Option 1: using pypa/build
python3 -m build -w
# Option 2: using bazel
./tools/bazel build //:whispercpp_wheel

Install the wheel:

# Option 1: via pypa/build
pip install dist/*.whl
# Option 2: using bazel
pip install $(./tools/bazel info bazel-bin)/*.whl

The binding provides a Whisper class:

from whispercpp import Whisper

w = Whisper.from_pretrained("tiny.en")

Currently, the inference API is provided via transcribe:

w.transcribe(np.ones((1,16000)))

You can use any of your favorite audio libraries (ffmpeg or librosa, or whispercpp.api.load_wav_file) to load audio files into a Numpy array, then pass it to transcribe:

import ffmpeg
import numpy as np

# whisper.cpp works on 16 kHz mono audio
sample_rate = 16000

try:
    # Decode to raw 16-bit little-endian PCM on stdout
    y, _ = (
        ffmpeg.input("/path/to/audio.wav", threads=0)
        .output("-", format="s16le", acodec="pcm_s16le", ac=1, ar=sample_rate)
        .run(cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True)
    )
except ffmpeg.Error as e:
    raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e

# Convert int16 samples to float32 in the range [-1.0, 1.0)
arr = np.frombuffer(y, np.int16).flatten().astype(np.float32) / 32768.0

w.transcribe(arr)
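If ffmpeg is not available, the same conversion can be sketched with the standard-library wave module. This is a minimal illustration and not part of whispercpp: the helper name load_wav_as_float32 is hypothetical, and it only handles 16-bit PCM WAV files.

```python
import wave
from typing import Tuple

import numpy as np


def load_wav_as_float32(path: str) -> Tuple[np.ndarray, int]:
    """Read a 16-bit PCM WAV file and return (mono float32 samples, sample rate).

    Hypothetical helper for illustration; not part of the whispercpp API.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        channels = wf.getnchannels()
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())

    # WAV stores little-endian int16; scale to float32 in [-1.0, 1.0),
    # the same scaling used in the ffmpeg example.
    samples = np.frombuffer(raw, dtype="<i2").astype(np.float32) / 32768.0
    if channels > 1:
        # Samples are interleaved per frame; average the channels to get mono.
        samples = samples.reshape(-1, channels).mean(axis=1).astype(np.float32)
    return samples, rate
```

The returned array can then be passed to w.transcribe(samples). This sketch does not resample, so check that the returned rate is 16000 first.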

You can also use the transcribe_from_file method for convenience:

w.transcribe_from_file("/path/to/audio.wav")
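As a rough sketch of how transcribe_from_file might be used on more than one file, the helper below walks a directory of WAV files. The function name transcribe_directory and the directory path are hypothetical, and the sketch assumes transcribe_from_file returns the transcript as a string.

```python
from pathlib import Path
from typing import Dict


def transcribe_directory(model, directory: str) -> Dict[str, str]:
    """Call model.transcribe_from_file on every .wav file in a directory.

    `model` is expected to be a whispercpp.Whisper instance. The return type
    of transcribe_from_file is assumed to be str here.
    """
    results = {}
    for wav_path in sorted(Path(directory).glob("*.wav")):
        results[wav_path.name] = model.transcribe_from_file(str(wav_path))
    return results


def main() -> None:
    # Imported lazily so the helper above can be imported without whispercpp.
    from whispercpp import Whisper

    w = Whisper.from_pretrained("tiny.en")
    for name, text in transcribe_directory(w, "/path/to/recordings").items():
        print(f"{name}: {text}")
```

Call main() to run it against a real model. The helper takes the model as an argument so that it is loaded only once.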

The Pybind11 bindings support all of the features of whisper.cpp, with an API that takes inspiration from whisper-rs.

The binding can also be used via api:

from whispercpp import api  # Direct binding from whisper.cpp

Development

See DEVELOPMENT.md

APIs

Whisper

  1. Whisper.from_pretrained(model_name: str) -> Whisper

    Load a pre-trained model from the local cache, or download and cache it if needed. Supports loading a custom ggml model from a local path passed as model_name.

    w = Whisper.from_pretrained("tiny.en")
    w = Whisper.from_pretrained("/path/to/model.bin")

    The model will be saved to $XDG_DATA_HOME/whispercpp, or to ~/.local/share/whispercpp if the environment variable is not set.

  2. Whisper.transcribe(arr: NDArray[np.float32], num_proc: int = 1)

    Runs transcription on a given Numpy array. This calls full from whisper.cpp. If num_proc is greater than 1, it will use full_parallel instead.

    w.transcribe(np.ones((1,16000)))

    To transcribe from a WAV file, use transcribe_from_file:

    w.transcribe_from_file("/path/to/audio.wav")

  3. Whisper.stream_transcribe(*, length_ms: int=..., device_id: int=..., num_proc: int=...) -> Iterator[str]

    [EXPERIMENTAL] Streaming transcription. This calls stream_ from whisper.cpp. The transcription will be yielded as soon as it's available. See stream.py for an example.

    Note: The device_id is the index of the audio device. You can use whispercpp.api.available_audio_devices to get the list of available audio devices.
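The cache location rule described under from_pretrained can be sketched as follows. This illustrates the documented rule and is not the library's actual implementation: the function name whispercpp_data_dir is hypothetical, and an empty XDG_DATA_HOME is treated as unset, as the XDG Base Directory Specification prescribes.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def whispercpp_data_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the directory where saved weights would be expected.

    Hypothetical helper: uses $XDG_DATA_HOME/whispercpp when the variable is
    set and non-empty, and ~/.local/share/whispercpp otherwise.
    """
    env = os.environ if env is None else env
    xdg_data_home = env.get("XDG_DATA_HOME")
    if xdg_data_home:
        return Path(xdg_data_home) / "whispercpp"
    return Path.home() / ".local" / "share" / "whispercpp"
```

Calling whispercpp_data_dir() with no arguments reads the current environment. Passing a mapping makes the rule easy to check without touching os.environ.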

api

api is a direct binding to whisper.cpp, with an API similar to that of whisper-rs.

  1. api.Context

    This class is a wrapper around whisper_context

    from whispercpp import api

    ctx = api.Context.from_file("/path/to/saved_weight.bin")

    Note: The context can also be accessed from the Whisper class via w.context

  2. api.Params

    This class is a wrapper around whisper_params

    from whispercpp import api

    params = api.Params()

    Note: The params can also be accessed from the Whisper class via w.params

Why not?

  • whispercpp.py. There are a few key differences here:

    • It provides Cython bindings. From the UX standpoint, this achieves the same goal as whispercpp. The difference is that whispercpp uses Pybind11 instead. Feel free to use it if you prefer Cython over Pybind11. Note that whispercpp.py and whispercpp are mutually exclusive, as both use the whispercpp namespace.
    • whispercpp provides APIs similar to those of whisper-rs, which makes for a nicer UX. Only two APIs (from_pretrained and transcribe) are needed to quickly use whisper.cpp in Python.
    • whispercpp doesn't pollute your $HOME directory; rather, it follows the XDG Base Directory Specification for saved weights.
  • Using cdll and ctypes and be done with it?

    • This is also valid, but it requires a lot of hacking and is pretty slow compared to Cython and Pybind11.

Examples

See examples for more information.

