Pybind11 bindings for Whisper.cpp
aarnphm/whispercpp
Install with pip:
pip install whispercpp
NOTE: On platforms without a prebuilt wheel, the install sets up a hermetic toolchain, so you don't have to set up anything yourself to install the Python package, but installation will take a bit longer. Pass `-vv` to `pip` to see the progress.
To use the latest version, install from source:
pip install git+https://github.com/aarnphm/whispercpp.git -vv
For local setup, initialize all submodules:
git submodule update --init --recursive
Build the wheel:

    # Option 1: using pypa/build
    python3 -m build -w

    # Option 2: using bazel
    ./tools/bazel build //:whispercpp_wheel

Install the wheel:

    # Option 1: via pypa/build
    pip install dist/*.whl

    # Option 2: using bazel
    pip install $(./tools/bazel info bazel-bin)/*.whl

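After installing the wheel, a quick way to confirm that the package is importable is to look it up on the import path. This is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name, not part of whispercpp.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "whispercpp" is the import name used throughout this README.
    status = "found" if is_installed("whispercpp") else "not found"
    print(f"whispercpp: {status}")
```

The check only locates the module and does not import it, so it does not load the native extension.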
The binding provides a `Whisper` class:

    from whispercpp import Whisper

    w = Whisper.from_pretrained("tiny.en")

Currently, the inference API is provided via `transcribe`:

    w.transcribe(np.ones((1, 16000)))

You can use any of your favorite audio libraries (ffmpeg or librosa, or `whispercpp.api.load_wav_file`) to load audio files into a Numpy array, then pass it to `transcribe`:

    import ffmpeg
    import numpy as np

    sample_rate = 16000  # whisper.cpp operates on 16 kHz audio

    try:
        # Decode to mono 16-bit PCM at sample_rate and capture the raw bytes
        y, _ = (
            ffmpeg.input("/path/to/audio.wav", threads=0)
            .output("-", format="s16le", acodec="pcm_s16le", ac=1, ar=sample_rate)
            .run(cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True)
        )
    except ffmpeg.Error as e:
        raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e

    # Scale int16 samples to float32 in [-1.0, 1.0)
    arr = np.frombuffer(y, np.int16).flatten().astype(np.float32) / 32768.0

    w.transcribe(arr)

You can also use the model's `transcribe_from_file` for convenience:

    w.transcribe_from_file("/path/to/audio.wav")

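If ffmpeg is not available, the loading step can be approximated with the standard library for the narrow case of mono 16-bit PCM WAV files. The sketch below is an illustration only: `read_wav_pcm16` is a hypothetical helper, not part of whispercpp, and it does not resample, so the file is assumed to already be at the sample rate the model expects.

```python
import array
import sys
import wave


def read_wav_pcm16(path):
    """Read a mono 16-bit PCM WAV file.

    Returns (samples, sample_rate), with samples as floats in [-1.0, 1.0).
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("expected mono 16-bit PCM audio")
        sample_rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())

    pcm = array.array("h")  # signed 16-bit integers
    pcm.frombytes(raw)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV sample data is little-endian
    # Same scaling as the ffmpeg example: int16 -> float in [-1.0, 1.0)
    return [s / 32768.0 for s in pcm], sample_rate
```

The returned list can then be converted with `np.asarray(samples, dtype=np.float32)` before being passed to `transcribe`.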
The Pybind11 bindings support all of the features of whisper.cpp and take inspiration from whisper-rs.
The binding can also be used via `api`:

    from whispercpp import api

    # Binding directly from whisper.cpp

`Whisper.from_pretrained(model_name: str) -> Whisper`

Load a pre-trained model from the local cache, or download and cache it if needed. Supports loading a custom ggml model from a local path passed as `model_name`.

    w = Whisper.from_pretrained("tiny.en")
    w = Whisper.from_pretrained("/path/to/model.bin")

The model will be saved to `$XDG_DATA_HOME/whispercpp`, or to `~/.local/share/whispercpp` if the environment variable is not set.

`Whisper.transcribe(arr: NDArray[np.float32], num_proc: int = 1)`

Runs transcription on a given Numpy array. This calls `full` from whisper.cpp. If `num_proc` is greater than 1, it will use `full_parallel` instead.

    w.transcribe(np.ones((1, 16000)))

To transcribe from a WAV file, use `transcribe_from_file`:

    w.transcribe_from_file("/path/to/audio.wav")

`Whisper.stream_transcribe(*, length_ms: int=..., device_id: int=..., num_proc: int=...) -> Iterator[str]`

[EXPERIMENTAL] Streaming transcription. This calls `stream_` from whisper.cpp. The transcription is yielded as soon as it is available. See stream.py for an example.

Note: `device_id` is the index of the audio device. You can use `whispercpp.api.available_audio_devices` to get the list of available audio devices.

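Because `stream_transcribe` is documented as returning an `Iterator[str]`, its output can be consumed like any other iterator of strings. The following is a minimal sketch of one way to take a bounded number of segments; `collect_segments` is a hypothetical helper, not part of whispercpp, and it works on any iterable of strings.

```python
from itertools import islice
from typing import Iterable, List


def collect_segments(segments: Iterable[str], limit: int) -> List[str]:
    """Take at most `limit` non-empty text segments from an iterable of strings."""
    cleaned = (s.strip() for s in segments)
    return list(islice((s for s in cleaned if s), limit))


# Hypothetical usage with a loaded model `w`; the argument values are placeholders:
# collect_segments(w.stream_transcribe(length_ms=5000, device_id=0), limit=10)
```

Bounding the number of segments matters here because a live audio stream may never end on its own; `islice` stops pulling from the iterator once the limit is reached.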
`api` is a direct binding from whisper.cpp, with an API similar to that of whisper-rs.
`api.Context`

This class is a wrapper around `whisper_context`:

    from whispercpp import api

    ctx = api.Context.from_file("/path/to/saved_weight.bin")

Note: The context can also be accessed from the `Whisper` class via `w.context`.

`api.Params`

This class is a wrapper around `whisper_params`:

    from whispercpp import api

    params = api.Params()

Note: The params can also be accessed from the `Whisper` class via `w.params`.

- whispercpp.py. There are a few key differences here:
  - They provide the Cython bindings. From the UX standpoint, this achieves the same goal as whispercpp. The difference is that whispercpp uses Pybind11 instead. Feel free to use it if you prefer Cython over Pybind11. Note that whispercpp.py and whispercpp are mutually exclusive, as whispercpp.py also uses the `whispercpp` namespace.
  - whispercpp provides APIs similar to those of whisper-rs, which provides a nicer UX to work with. There are literally two APIs (`from_pretrained` and `transcribe`) to quickly use whisper.cpp in Python.
  - whispercpp doesn't pollute your `$HOME` directory; rather, it follows the XDG Base Directory Specification for saved weights.
- Using `cdll` and `ctypes` and be done with it?
  - This is also valid, but it requires a lot of hacking and is pretty slow compared to Cython and Pybind11.
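The XDG-based location for saved weights mentioned above follows the rule stated earlier in this README: `$XDG_DATA_HOME/whispercpp`, falling back to `~/.local/share/whispercpp`. The sketch below restates that rule in code. It is not the library's actual implementation; `whispercpp_data_dir` is a hypothetical name, and treating an empty `XDG_DATA_HOME` as unset is an assumption taken from the XDG specification.

```python
import os
from pathlib import Path


def whispercpp_data_dir(env=None) -> Path:
    """Resolve the weights directory using the rule stated in this README."""
    env = os.environ if env is None else env
    xdg_data_home = env.get("XDG_DATA_HOME")
    if xdg_data_home:  # assumption: an empty value counts as unset
        return Path(xdg_data_home) / "whispercpp"
    return Path.home() / ".local" / "share" / "whispercpp"
```

Passing a plain dictionary as `env` makes the lookup easy to check without touching the real environment.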
See examples for more information.