Real-time transcription on Raspberry Pi 4 #166
-
It is possible, to some extent, to run Whisper in real-time mode on an embedded device such as the Raspberry Pi.
Real-time with 4 seconds step: whisper-raspberry-2.mp4
Real-time with 7.5 seconds step: whisper-raspberry-3.mp4
Build instructions
More information
In order to speed up the processing, the Encoder's context is reduced from the original 1500 down to 512 (using the
More detailed discussion can be found in this issue: #7
Explanation of what the |
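To put the 1500 to 512 reduction into seconds, here is a rough arithmetic sketch. It assumes that a full Whisper window is 30 seconds of audio spread over the 1500 encoder positions, so each position covers about 20 ms. The function names are made up for illustration and are not part of whisper.cpp.

```python
import math

FULL_CTX = 1500        # encoder positions for a full window (figure from the post)
FULL_WINDOW_S = 30.0   # assumption: a full Whisper window is 30 seconds of audio


def audio_ctx_seconds(audio_ctx: int) -> float:
    """Approximate seconds of audio covered by a given encoder context."""
    return audio_ctx * FULL_WINDOW_S / FULL_CTX


def min_audio_ctx(length_ms: int) -> int:
    """Smallest encoder context that still spans length_ms of audio."""
    return math.ceil(length_ms * FULL_CTX / (FULL_WINDOW_S * 1000))


if __name__ == "__main__":
    print(audio_ctx_seconds(512))   # seconds covered by a context of 512
    print(min_audio_ctx(8000))      # context needed for an 8000 ms window
```

Under that assumption, a context of 512 spans a little over 10 seconds, which leaves some headroom over the 8000 ms length used in the example commands in this thread.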
Replies: 21 comments 28 replies
-
Fantastic work!! On my RPi 4 with Raspberry Pi OS Lite (bullseye) installed, I had to run
or the compiler would complain about a missing "SDL.h" header file. Now it works like a charm. |
-
I tried many approaches but still could not install libsdl2-dev on Ubuntu 22.04. Finally, I tried building SDL2 from source, and that succeeded! |
-
Thanks, worked pretty well on the Pi for me! (Twitter clip) Yeah, I had to install libsdl2-dev as @eternitybt mentioned as well. (For the mic, I used a ReSpeaker Mic Array v2.) |
-
Hi @SethRobinson, what changes do we need to make to use the ReSpeaker Mic Array? Thanks. |
-
Made a video of the install here: https://youtu.be/caaKhWcfcCY |
-
Hi, which version of the Pi 4 are you using? Is there a minimum memory requirement? I'm getting an 'Illegal instruction (core dumped)' error when I try this on a 1GB Pi 4. |
-
I now think this might be because I'm on the 64-bit version of the OS. It might be worth confirming whether these instructions are compiled for armv8 and not aarch64, which doesn't seem to be working right now on the Raspberry Pi 4, I believe? |
-
I'm running this on a 2GB Pi 4 with 64-bit Raspberry Pi OS Lite. Definitely make sure you are compiling this for the ARM architecture. |
-
Also, one should use the raspberry branch (after cloning the repository, type |
-
Can this be done on a Raspberry Pi 3B+? I want to use it for speech recognition. |
-
Give it a shot and let us know your results. As long as you have a USB mic and the Pi, it takes like 10 minutes to test. Probably would be best to start with the tiny model. |
-
It works on my Pi 3B+ with Ubuntu 22.04. |
-
Hey, would you share the steps to make it work on the Pi 3B+? |
-
Hello, I followed the build instructions on a Pi4 model B and am receiving this error: "fatal error: immintrin.h: No such file or directory" when attempting the make/build. |
-
Hi, I'm running Raspbian Bullseye 11 (64-bit). I have a 4GB RAM Pi. Thanks for reaching out! |
-
Nick, are you getting the error when compiling (make) stream.cpp, or bench.cpp, or main.cpp, quantize.cpp or all of them? When did you clone the github whisper repository? I wonder if I can zip my version which works and send it to you without the models. |
-
After git cloning the repo and changing directories, I receive the error when attempting the initial "make -j stream". Here is the verbose error:
I whisper.cpp build info:
cc -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -mcpu=native -c ggml.c -o ggml.o |
-
Here's what I'm seeing. When did you download and install? Today? Are you downloading to your desktop? The only difference I see is the +rpi1 here --> Raspbian 10.2.1-6+rpi1)
When I tried to find immintrin.h, this is what it gave me. What happens if you try this?
I retried the install today and got it to work. I tried a different method this time and downloaded using the green Code button --> zip folder at the top of the main page: https://github.com/ggerganov/whisper.cpp. I unzipped it to the desktop, then compiled using the make command, and it worked.
i@raspberrypi:~/Desktop/whisper.cpp-master $ make -j stream
cc -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -mcpu=native -c ggml.c -o ggml.o |
-
Hi there, I originally git cloned just before my last update (around 24 hrs ago) and have been putting it in a Desktop folder. I tried a sudo find and received the same "Permission denied" message that you did. I just tried the green button zip/unpack method and still received the immintrin.h error when running make, unfortunately.
I'm not tied to Raspbian, so maybe I'll try your Debian distro to see if the issue persists! The new Raspberry Pi Imager tool removes the default root account, but I did make sure my account was root and still received the issue.
Update: I tried fresh installs of Debian 'Buster' and 'Bullseye', receiving the same error each time. I did some searching, and the +rpi1 notation could be cross-compiler vs native compiler on my end. What image/distro did you flash your Pi with? |
-
I typed the following in the terminal:
pi@raspberrypi:~ $ uname -a
Linux raspberrypi 6.1.29-v8+ #1652 SMP PREEMPT Wed May 24 14:46:55 BST 2023 aarch64
Also typed:
pi@raspberrypi:~ $ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
Used the Raspberry Pi Imager v1.7.4 with Pi OS 64-bit, Debian Bullseye Desktop. |
-
Sorry for the delay, this notification didn't pop up for me! Thanks so much for the output. I went back and found a possible issue with the version of Pi Imager that I was using. It seems like it flashed a 32-bit Debian despite listing a 64-bit version, and I missed it. Trying the reflash now!
Edit: That solved it! I initially received an SDL error, but that was resolved by installing libsdl2-dev! |
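For anyone hitting the same immintrin.h error (that header belongs to x86 intrinsics, so seeing it requested on a Pi is a hint that the build did not detect the expected ARM target), a quick sanity check of the installed system can save a reflash. This is only a sketch: it uses the pointer width of the running Python interpreter as a stand-in for the userland width, which is an assumption, and `classify_target` is a hypothetical helper name.

```python
import platform
import struct


def classify_target(machine: str, pointer_bits: int) -> str:
    """Label a system from a `uname -m` style machine name and a pointer width."""
    m = machine.lower()
    if not m.startswith(("arm", "aarch64")):
        return "not ARM"
    if pointer_bits == 64:
        return "64-bit ARM"
    # A kernel can report aarch64 while the installed userland is 32-bit,
    # so the pointer width decides here, not the machine name.
    return "32-bit ARM userland"


if __name__ == "__main__":
    bits = struct.calcsize("P") * 8  # pointer size of this Python build
    machine = platform.machine()
    print(machine, bits, classify_target(machine, bits))
```

If this reports a 32-bit userland on an image that was supposed to be 64-bit, the image itself is the first thing to double-check, as happened above.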
-
I am running on a Raspberry Pi 4B and I can record through ffmpeg, but stream has no output:
ffmpeg -f pulse -i alsa_input.usb-C-Media_Electronics_Inc._USB_PnP_Sound_Device-00.analog-mono -ar 16000 -ac 1 recording.wav
root@a0f34bc2c254:/whisper-cpp/whisper.cpp# ./stream -m ./models/ggml-tiny.bin -t 6 --step 0 --length 30000 -vth 0.6
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model size = 73.54 MB
whisper_init_state: kv cross size = 8.79 MB
main: processing 0 samples (step = 0.0 sec / len = 30.0 sec / keep = 0.0 sec), 6 threads, lang = en, task = transcribe, timestamps = 1 ...
[Start speaking] |
-
Did you try the default example from above?
./stream -m models/ggml-tiny.en.bin --step 4000 --length 8000 -c 0 -t 4 -ac 512
I wasn't able to get your command to work either:
./stream -m ./models/ggml-tiny.bin -t 6 --step 0 --length 30000 -vth 0.6
Try taking the default example and adding -vth 0.6 to the end for the voice activity detection (VAD), like below. It worked well for me.
./stream -m models/ggml-tiny.en.bin --step 4000 --length 8000 -c 0 -t 4 -ac 512 -vth 0.6
Also, the line below works with 6 threads, which surprised me, because I thought the Raspberry Pi 4 could only go up to 4 threads since it has 4 cores. In the task manager, the CPU usage would sometimes climb to near 100% when using either -t 4 or -t 6.
./stream -m models/ggml-tiny.en.bin --step 4000 --length 8000 -c 0 -t 6 -ac 512 -vth 0.6
I turned --step down to 0 like below and it worked once, then stopped working.
./stream -m models/ggml-tiny.en.bin --step 0 --length 8000 -c 0 -t 6 -ac 512 -vth 0.6
From what I've seen, upping --step to 2000 works better, and 4000 even better. |
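When experimenting with these options from a script, it can help to assemble the argument list in one place. The sketch below only uses flags that appear in the commands above. Capping the thread count at the number of cores is my own assumption (the thread shows -t 6 runs on a 4-core Pi, but not that it helps), and `stream_command` is a hypothetical helper, not something shipped with whisper.cpp.

```python
import os


def stream_command(model, step_ms=4000, length_ms=8000, threads=4,
                   audio_ctx=512, capture_id=0, vad_thold=None,
                   binary="./stream", cores=None):
    """Build an argv list for the stream example, capping threads at the core count."""
    cores = cores or os.cpu_count() or 1
    threads = max(1, min(threads, cores))  # assumption: extra threads do not help
    cmd = [binary, "-m", model,
           "--step", str(step_ms), "--length", str(length_ms),
           "-c", str(capture_id), "-t", str(threads), "-ac", str(audio_ctx)]
    if vad_thold is not None:
        cmd += ["-vth", str(vad_thold)]
    return cmd


if __name__ == "__main__":
    print(" ".join(stream_command("models/ggml-tiny.en.bin", threads=6, vad_thold=0.6)))
```

The returned list can be passed to `subprocess.run` as-is, which avoids shell quoting issues with model paths.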
-
The Raspberry Pi 4 is a bit slow, but some development boards equipped with the RK3588 chip have a 6 TOPS NPU. We should consider supporting these chips, as they could potentially enable "real" real-time transcription. @ggerganov |
-
Works well on Orange Pi 5 with the RK3588S chip. Video here: https://www.youtube.com/watch?v=qgF4_moXcYQ |
-
The most recent update can be found in #1557 |
-
Well, it takes several tens of seconds for a 3 second long wav file... |
-
See the top of this page. It is about real-time transcription on the Raspberry Pi 4. It has nothing to do with WAV files. This is real-time and usable. |
-
There are several types of "real time" but this is none of those. With or without files, it takes ages... |
-
Whisper is working on the Raspberry Pi 5, up to the small model. Video here: https://youtu.be/W39teHesXus |
-
Very nice demonstration! The Pi 5 looks very powerful. The quantized models speed up only the Decoder, while the Encoder actually becomes a bit slower. So overall, we don't expect the quantized models to be faster; they have limited applications. |
-
Thank you @ggerganov. Just a heads up, I'm working on a Pi 5 voice assistant project, linked below. It will turn GPIO outputs on and off via special phrases. I should have a video to share in the next few weeks.
https://github.com/solarsamuel/pi5_whisper_voice_assistant
I did a fresh whisper.cpp install today on my Pi 4 (testing for backward compatibility) and all the CPUs maxed out at 100% in GNOME System Monitor when I streamed with the tiny.en.bin model. This might be why there are some comments of frustration above from September. I noticed a few changes to stream.cpp, like the wave file handling, but I'm not sure if this is the issue. Can you give the Pi 4 a shot with the latest whisper.cpp install and see how it runs? Does it max out for you? |
-
The new versions by default use 5 beams and 5 "best of" to match the reference Whisper implementation. This makes the decoding slower but more accurate. When you run on the RPi, you might want to reduce these numbers and/or disable fallbacks altogether. Looking forward to the voice assistant project! |
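As a sketch of what "reduce these numbers and/or disable fallbacks" could look like in a launcher script: the help output posted further down in this thread lists `-bs N, --beam-size N` and `-nf, --no-fallback` for the stream example, so those are the only two flags used here. Whether they exist in a given build, and what its defaults are, should be confirmed with `--help`. `lighten_decoding` is a hypothetical helper name.

```python
def lighten_decoding(cmd, beam_size=None, no_fallback=True):
    """Return a copy of an argv list with cheaper decoding options appended.

    beam_size is only passed when given explicitly, since the default
    differs between builds (check the --help output of your binary).
    """
    out = list(cmd)  # leave the caller's list untouched
    if beam_size is not None:
        out += ["-bs", str(beam_size)]
    if no_fallback:
        out.append("-nf")
    return out


if __name__ == "__main__":
    base = ["./stream", "-m", "models/ggml-tiny.en.bin", "-t", "4"]
    print(" ".join(lighten_decoding(base, beam_size=1)))
```

Keeping this separate from the base command makes it easy to compare CPU load with and without the lighter settings.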
-
I figured out why I had issues with my Pi 4 a few weeks ago. I had 2 instances of Whisper running at the same time; one started automatically at bootup. Once I fixed this, it worked fine.
Here's a video of the voice assistant project running on a Raspberry Pi 5: https://youtu.be/jpW9foRIwv0
Use your voice and special phrases to turn outputs on and off. Turn on relays, buzzers, motors, lights, etc. Make your own special phrases. All speech-to-text is done with the Whisper C++ models on-device. IO is triggered with the GPIOD library. This is backwards compatible with the Raspberry Pi 4. |
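A duplicate instance like this is easy to miss, so here is a small sketch for spotting it. It expects the text printed by `ps -eo pid,comm` on Linux. The process names `stream` and `whisper-stream` are taken from the binaries mentioned in this thread and may differ on your install, and `count_instances` is a hypothetical helper.

```python
def count_instances(ps_output, names=("stream", "whisper-stream")):
    """Count processes whose command name matches one of `names`.

    Expects the text produced by `ps -eo pid,comm`, header line included.
    """
    count = 0
    for line in ps_output.splitlines()[1:]:  # skip the "PID COMMAND" header
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[1].strip() in names:
            count += 1
    return count
```

Feeding it the output of `subprocess.run(["ps", "-eo", "pid,comm"], capture_output=True, text=True).stdout` and refusing to start when the count is already 1 or more would guard against the boot-time copy competing for the CPU.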
-
I managed to get Output from
Output from
For example, if I use I tried
But then no transcription, even using the sample jfk.wav. |
-
Hi! Great job, really! One question: is there any example, tutorial, or guide on how to implement the same thing using whisper.cpp inside a Python script? Thank you. |
-
I'm getting an error when compiling using make -j stream on a Raspberry Pi 5 running Pi OS Bookworm 12.2.0 (uname -a
Following the build instructions, I get an error from make which results in no ./stream folder
Any ideas? Thanks for any help! |
-
Will try to fix this today, but unfortunately my RPi4 stopped working, so I don't have hardware to test on |
-
Yay! now working. 🥇 |
-
Can you show how to fix this build error? Thanks! |
-
For anyone stumbling on this error, |
-
And for people using older compilers: I worked around it by using the 32-bit instructions defined above, except for those two. I'm not sure whether it will give the expected behavior, but it compiles. |
-
Hi @ggerganov, I was trying to run the |
-
Hi, I'm trying this on a Raspberry Pi 3 (tried on both Buster and Bullseye). It compiles perfectly, and the main example with the WAV files works too. The problem is with stream: it seems to lose pieces or can't hear well. I tried both a cheap USB microphone and a Zoom H1 microphone at 44100 Hz, 16-bit; if you record with arecord, the audio is perfect. I then tried to directly record what the stream program listens to by adding the
Some extra info: this is what happens when I run the program:
related: |
-
Here's a video on how to run Whisper.cpp with Adafruit I2S microphones on the Raspberry Pi 5 and CM5: https://www.youtube.com/watch?v=V6yoFzcKVJ0
Here are the terminal commands:
TO USE THE I2S MICROPHONES:
Navigate to the directory that contains config.txt:
cd /boot/firmware
OPEN CONFIG.TXT:
sudo nano config.txt
Add the following line to the bottom of the config.txt file, then hit CTRL+X to save and exit:
dtoverlay=googlevoicehat-soundcard
REBOOT THE RASPBERRY PI:
sudo reboot now
SHOW THE MIC CARD NUMBER (MOST LIKELY 0, 1, or 2):
arecord -l
TEST MIC VOLUME BY TYPING THIS LINE AND TALKING INTO THE MIC (NOTE: put the CARD NUMBER after plughw:, for example plughw:0):
arecord -D plughw:1 -c1 -r 48000 -f S32_LE -t wav -V mono -v file.wav
TO USE WHISPER.CPP WITH ADAFRUIT I2S microphones AFTER DECEMBER 15th, 2024 in Raspberry Pi 5 Bookworm 64-bit Desktop OS:
INSTALL CMAKE:
sudo apt install -y cmake
INSTALL THE SDL2 library:
sudo apt install libsdl2-dev
CHECK CMAKE VERSION (OPTIONAL):
cmake --version
GO TO YOUR DESIRED INSTALL DIRECTORY, FOR EXAMPLE Desktop:
cd Desktop
CLONE THE WHISPER DIRECTORY:
git clone https://github.com/ggerganov/whisper.cpp.git
GO INTO THE NEW whisper.cpp DIRECTORY:
cd whisper.cpp
DOWNLOAD YOUR DESIRED MODEL (for example tiny.en, base.en; small.en will saturate the Pi 5 CPUs):
sh ./models/download-ggml-model.sh base.en
BUILD WITH CMAKE (configure, then compile):
cmake -B build -DWHISPER_SDL2=ON
cmake --build build
RUN WHISPER.CPP:
./build/bin/stream |
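The "show the mic card number" step can be scripted as well. The sketch below pulls card and device numbers out of `arecord -l` text and builds the matching `plughw:` string. It assumes the usual `card N: name [description], device M: ...` line layout. The sample text is illustrative and was not captured from a real device, and both function names are hypothetical.

```python
import re

# Assumed layout of an `arecord -l` device line:
#   card 1: shortname [Long description], device 0: ...
CARD_LINE = re.compile(r"^card (\d+): (\S+) \[(.*?)\], device (\d+):")

# Illustrative text only, not captured output.
SAMPLE = (
    "**** List of CAPTURE Hardware Devices ****\n"
    "card 1: sndrpigooglevoi [snd_rpi_googlevoicehat_soundcar], device 0: "
    "Google voiceHAT SoundCard HiFi voicehat-hifi-0 []\n"
    "  Subdevices: 1/1\n"
    "  Subdevice #0: subdevice #0\n"
)


def list_capture_cards(arecord_output):
    """Return (card, device, short name, description) tuples from `arecord -l` text."""
    cards = []
    for line in arecord_output.splitlines():
        m = CARD_LINE.match(line)
        if m:
            cards.append((int(m.group(1)), int(m.group(4)), m.group(2), m.group(3)))
    return cards


def plughw(card, device=0):
    """Build the ALSA device string to pass to `arecord -D`."""
    return f"plughw:{card},{device}"


if __name__ == "__main__":
    for card, device, name, _ in list_capture_cards(SAMPLE):
        print(name, plughw(card, device))
```

With real output, the resulting string replaces `plughw:1` in the arecord test command above.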
-
Hey, I was trying to run it on a Raspberry Pi B+, but whenever I run the make -j stream command, an error message keeps popping up (make: *** No rule to make target 'stream'. Stop) |
-
CMake is now being used to build whisper.cpp, and the stream example can be built using something like this:
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DWHISPER_SDL2=ON -DWHISPER_BUILD_EXAMPLES=ON
cmake --build build -j 8
And the executable will be in
(venv) $ build/bin/whisper-stream --help
usage: build/bin/whisper-stream [options]
options:
  -h, --help [default] show this help message and exit
  -t N, --threads N [4] number of threads to use during computation
  --step N [3000] audio step size in milliseconds
  --length N [10000] audio length in milliseconds
  --keep N [200] audio to keep from previous step in ms
  -c ID, --capture ID [-1] capture device ID
  -mt N, --max-tokens N [32] maximum number of tokens per audio chunk
  -ac N, --audio-ctx N [0] audio context size (0 - all)
  -bs N, --beam-size N [-1] beam size for beam search
  -vth N, --vad-thold N [0.60] voice activity detection threshold
  -fth N, --freq-thold N [100.00] high-pass frequency cutoff
  -tr, --translate [false] translate from source language to english
  -nf, --no-fallback [false] do not use temperature fallback while decoding
  -ps, --print-special [false] print special tokens
  -kc, --keep-context [false] keep context between audio chunks
  -l LANG, --language LANG [en] spoken language
  -m FNAME, --model FNAME [models/ggml-base.en.bin] model path
  -f FNAME, --file FNAME [ ] text output file name
  -tdrz, --tinydiarize [false] enable tinydiarize (requires a tdrz model)
  -sa, --save-audio [false] save the recorded audio to a file
  -ng, --no-gpu [false] disable GPU inference
  -fa, --flash-attn [false] flash attention during inference |
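Since this thread mentions three different locations for the executable (`./stream` from the old make build, `./build/bin/stream`, and `build/bin/whisper-stream`), a script that launches it may want to probe for whichever one exists. A minimal sketch, assuming those three paths relative to the repository root; `find_stream_binary` is a hypothetical helper.

```python
from pathlib import Path

# Candidate locations mentioned in this thread, newest naming first.
CANDIDATES = ("build/bin/whisper-stream", "build/bin/stream", "stream")


def find_stream_binary(repo_root="."):
    """Return the first existing stream executable under repo_root, or None."""
    root = Path(repo_root)
    for rel in CANDIDATES:
        path = root / rel
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    found = find_stream_binary(".")
    print(found if found else "no stream binary found; has the example been built?")
```

A `None` result usually means the example has not been built yet, which is also what the "No rule to make target 'stream'" message points to on newer checkouts.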
-
After some errors with make -j stream, I managed to get it working on a Raspberry Pi 5 with 8GB of RAM. |