Nikro/MMM-WhisperGPT (Public)

A Whisper + ChatGPT MagicMirror Module.

nikro.me/articles/professional/crafting-our-ai-assistant/

License

MIT license

MMM-WhisperGPT

This is a module for the MagicMirror².

How it works 👉 https://nikro.me/articles/professional/crafting-our-ai-assistant/

The goal of the module is to create a custom interactive widget built on OpenAI and related tools:

Whisper - self-hosted model for voice-to-text transcription.
LangChain - used with the ChatGPT API to process the requests.
Picovoice -> Porcupine - used for the offline (self-hosted) trigger word, with an emphasis on privacy.
also... mimic3 :)

The idea is the following:

Wake word (Porcupine).
...record query (show a sexy animation, will be done later)
...pass to self-hosted Whisper
...transcribe voice-to-text
Show the question as transcribed rendered-text (in the module render)
...pass through LangChain to ChatGPT
...pass the textual reply back to the module and render on-screen
...use TTS (mimic3) - self-hosted on the network, to throw back a wav file to play.
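
The flow above can be sketched as a chain of asynchronous stages. This is a minimal illustration, not the module's actual code: `runAssistantTurn` and the stage names (`recordQuery`, `transcribe`, `askLlm`, `synthesize`, `render`, `play`) are hypothetical, and each stage is injected as a plain function so it could be swapped for a real Porcupine, Whisper, LangChain, or mimic3 call.

```javascript
// Hypothetical sketch of one assistant turn; every stage is an injected function.
async function runAssistantTurn(stages) {
  const { recordQuery, transcribe, askLlm, synthesize, render, play } = stages;

  const audio = await recordQuery();         // runs after the wake word has fired
  const question = await transcribe(audio);  // self-hosted Whisper
  render({ question });                      // show the transcribed question

  const reply = await askLlm(question);      // LangChain -> ChatGPT
  render({ question, reply });               // show the textual reply

  const wav = await synthesize(reply);       // mimic3 returns a wav file
  await play(wav);
  return { question, reply };
}
```

Keeping the stages injectable means the ordering can be exercised with stub functions before any of the network services are running.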

Using the module

To use this module, add the following configuration block to the modules array in the config/config.js file:

    var config = {
      modules: [
        {
          module: 'MMM-WhisperGPT',
          config: {
            // See below for configurable options
            picovoiceKey: 'xxx',
            picovoiceWord: 'JARVIS',
            picovoiceSilenceTime: 3,
            picovoiceSilenceThreshold: 600,
            audioDeviceIndex: 3,
            openAiKey: 'xxx',
            openAiSystemMsg: 'xxx',
            whisperUrl: '192.168.1.5:9000/asr',
            whisperMethod: 'openai-whisper',
            mimic3Url: '192.168.1.6:59125'
          }
        }
      ]
    }

Configuration options

Option	Required?	Description
`picovoiceKey`	Required	Picovoice access key, used for the trigger word. You have to register with Picovoice to obtain it.
`picovoiceWord`	Optional	Picovoice trigger word, e.g. BUMBLEBEE, JARVIS, etc. Defaults to JARVIS.
`picovoiceSilenceTime`	Optional	Silence period in seconds. Defaults to 3.
`picovoiceSilenceThreshold`	Optional	Multiplier applied to the background noise level to set the silence threshold. Defaults to 1.1 (i.e. 10% above background noise).
`audioDeviceIndex`	Optional	Index of the audio device, e.g. 3. Available devices are printed out in debug mode. Defaults to 0.
`whisperUrl`	Required	URL of the self-hosted Whisper instance, e.g. 192.168.1.5:9000/asr.
`whisperMethod`	Optional	Whisper method: openai-whisper or faster-whisper. Defaults to faster-whisper.
`whisperLanguage`	Optional	Transcription language. Defaults to en.
`openAiKey`	Required	OpenAI API key.
`openAiSystemMsg`	Optional	System message that describes how the AI should behave.
`mimic3Url`	Required	Mimic3 server URL, including protocol and port but without the /api/tts path.
`mimic3Voice`	Optional	Mimic3 voice. Defaults to en_US/cmu-arctic_low%23gka.
`debug`	Optional	Enables debug output. Defaults to false.
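
The defaults and required options in the table can be summarised in code. The sketch below is only an illustration of how a user config might be merged over those defaults and checked; `DEFAULTS`, `REQUIRED`, and `resolveConfig` are hypothetical names, not part of the module.

```javascript
// Defaults copied from the options table; required options have no default.
const DEFAULTS = {
  picovoiceWord: 'JARVIS',
  picovoiceSilenceTime: 3,
  picovoiceSilenceThreshold: 1.1,
  audioDeviceIndex: 0,
  whisperMethod: 'faster-whisper',
  whisperLanguage: 'en',
  mimic3Voice: 'en_US/cmu-arctic_low%23gka',
  debug: false,
};

const REQUIRED = ['picovoiceKey', 'whisperUrl', 'openAiKey', 'mimic3Url'];

// Hypothetical helper: merge the user config over the defaults
// and report which required options are still missing.
function resolveConfig(userConfig) {
  const config = { ...DEFAULTS, ...userConfig };
  const missing = REQUIRED.filter((key) => !config[key]);
  return { config, missing };
}
```

For example, a config that sets only picovoiceKey, whisperUrl, and openAiKey would come back with mimic3Url listed as missing and every optional value filled in from the table.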

What is Picovoice / Porcupine

Picovoice / Porcupine is used for the "trigger" word. It's a small, self-hosted AI / Neural Network (NN). Picovoice offers a range of services, including a license for this offline AI. It only sends usage statistics, not the actual audio conversations.
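
Once the trigger word fires, the picovoiceSilenceTime and picovoiceSilenceThreshold options suggest that recording stops after the input has stayed quiet for a while. The sketch below shows one way this could work, assuming 16-bit PCM frames, an RMS loudness measure, and a background level measured beforehand. `rms` and `createSilenceDetector` are hypothetical helpers, and the module's real logic may differ.

```javascript
// Root-mean-square level of one frame of 16-bit PCM samples.
function rms(frame) {
  let sum = 0;
  for (const sample of frame) sum += sample * sample;
  return Math.sqrt(sum / frame.length);
}

// Hypothetical detector: returns a function that consumes frames and reports
// true once the input has stayed quiet for `silenceTime` seconds in a row.
function createSilenceDetector({ backgroundLevel, threshold = 1.1, silenceTime = 3, frameSeconds }) {
  let quietSeconds = 0;
  return function isDone(frame) {
    const quiet = rms(frame) <= backgroundLevel * threshold;
    quietSeconds = quiet ? quietSeconds + frameSeconds : 0; // any loud frame resets the timer
    return quietSeconds >= silenceTime;
  };
}
```

With the default threshold of 1.1, a frame counts as quiet when its level is at most 10% above the background level.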

What is Whisper

Whisper is an open-source speech recognition model from OpenAI. It handles speech-to-text (transcription). In my case, I have it self-hosted on my local network.

I used this: https://github.com/ahmetoner/whisper-asr-webservice
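
A request to such a Whisper service might look like the sketch below. It assumes the webservice's /asr endpoint with task, language, and output query parameters, a multipart field named audio_file, and a JSON response with a text field; check these against the version you run. `buildWhisperUrl` and `transcribe` are hypothetical helper names, and the upload relies on the fetch, FormData, and Blob globals available in Node 18 and later.

```javascript
// Hypothetical helper: build the request URL for a whisper-asr-webservice instance.
function buildWhisperUrl(whisperUrl, language = 'en') {
  // The example config omits the protocol, so fall back to http:// when none is given.
  const withProtocol = /^https?:\/\//.test(whisperUrl) ? whisperUrl : `http://${whisperUrl}`;
  const url = new URL(withProtocol);
  url.searchParams.set('task', 'transcribe');
  url.searchParams.set('language', language);
  url.searchParams.set('output', 'json');
  return url.toString();
}

// Sketch of the upload itself: send the recorded wav and return the transcribed text.
async function transcribe(whisperUrl, wavBuffer, language = 'en') {
  const form = new FormData();
  form.append('audio_file', new Blob([wavBuffer], { type: 'audio/wav' }), 'query.wav');
  const response = await fetch(buildWhisperUrl(whisperUrl, language), { method: 'POST', body: form });
  if (!response.ok) throw new Error(`Whisper request failed: ${response.status}`);
  const result = await response.json();
  return result.text; // assumed shape of the JSON output
}
```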

What is ChatGPT

ChatGPT is another product from OpenAI. It's a Large Language Model (LLM) AI. You will need to register and get an API Key to use it.

What is LangChain

LangChain is a library built around LLMs that allows for extra functionality, such as long-term memory.
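
To illustrate what memory means here, the sketch below replays earlier exchanges in front of each new question. It is plain JavaScript and deliberately not LangChain's API: `ConversationMemory` is a hypothetical class, and it only shows the simplest buffer-style memory, producing messages in the role/content shape that chat-style LLM APIs commonly accept.

```javascript
// Plain-JavaScript illustration of conversation memory; this is not LangChain's API.
class ConversationMemory {
  constructor(systemMsg, maxTurns = 5) {
    this.systemMsg = systemMsg;
    this.maxTurns = maxTurns;
    this.turns = []; // each turn: { question, reply }
  }

  // Store a finished exchange and drop the oldest one beyond maxTurns.
  remember(question, reply) {
    this.turns.push({ question, reply });
    if (this.turns.length > this.maxTurns) this.turns.shift();
  }

  // Build the message list: system message, past turns, then the new question.
  buildMessages(question) {
    const history = this.turns.flatMap((turn) => [
      { role: 'user', content: turn.question },
      { role: 'assistant', content: turn.reply },
    ]);
    return [
      { role: 'system', content: this.systemMsg },
      ...history,
      { role: 'user', content: question },
    ];
  }
}
```

The system message plays the role of the openAiSystemMsg option: it is sent first on every request, followed by whatever history is still in the buffer.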

What is Mimic3 (Mycroft)

Mycroft's Mimic3 is a neural Text-to-Speech (TTS) system. It offers realistic TTS that can run on somewhat resource-restricted systems. I initially tried to set it up on my OrangePi, but in the end I installed it on the same machine as Whisper and use it over the network.

I used this docker-compose.yml 😉

    version: '3.7'
    services:
      mimic3:
        image: mycroftai/mimic3
        ports:
          - 59125:59125
        volumes:
          - .:/home/mimic3/.local/share/mycroft/mimic3
        stdin_open: true
        tty: true
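
With the container listening on port 59125, a text-to-speech request URL could be built as in the sketch below. It assumes the server's /api/tts endpoint with voice and text query parameters, and that the voice value is already URL-encoded, as in the documented default. `buildMimic3Url` is a hypothetical helper, not the module's own function.

```javascript
// Hypothetical helper: build a mimic3 text-to-speech request URL.
// `voice` is expected to be URL-encoded already (%23 stands for #).
function buildMimic3Url(mimic3Url, text, voice = 'en_US/cmu-arctic_low%23gka') {
  const base = mimic3Url.replace(/\/+$/, ''); // tolerate a trailing slash
  return `${base}/api/tts?voice=${voice}&text=${encodeURIComponent(text)}`;
}
```

The response to such a request is expected to be the wav audio, which the module can then play back.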

Troubleshooting

If your audio doesn't work, check whether you're using alsa or pulseaudio. You might need to install mpg123. You can install it using the command sudo apt-get install mpg123.
You might also need to install lame for audio encoding. You can install it using the command sudo apt-get install lame.
