Commit 54d27c6

Add chat-llama-nemotron example as a regular directory (NVIDIA#307)

* Add chat-llama-nemotron example as a regular directory
* added spdx and referenced the demo in the community readme
* minor formatting fix
* made the title more descriptive
* minor improvements

1 parent 6e96746 · commit 54d27c6

32 files changed: +4839 −1 lines changed

‎community/README.md‎

Lines changed: 5 additions & 1 deletion

```diff
@@ -82,4 +82,8 @@ Community examples are sample code and deployments for RAG pipelines that are no
 
 * [AI Podcast Assistant](./ai-podcast-assistant/)
 
-  This example demonstrates a comprehensive workflow for processing podcast audio using the Phi-4-Multimodal LLM through NVIDIA NIM Microservices. It includes functionality for generating detailed notes from audio content, creating concise summaries, and translating both transcriptions and summaries into different languages. The implementation handles long audio files by automatically chunking them for efficient processing and preserves formatting during translation.
+  This example demonstrates a comprehensive workflow for processing podcast audio using the Phi-4-Multimodal LLM through NVIDIA NIM Microservices. It includes functionality for generating detailed notes from audio content, creating concise summaries, and translating both transcriptions and summaries into different languages. The implementation handles long audio files by automatically chunking them for efficient processing and preserves formatting during translation.
+
+* [Chat with LLM Llama 3.1 Nemotron Nano 4B](./chat-llama-nemotron/)
+
+  This is a React-based conversational UI designed for interacting with a powerful local LLM. It incorporates RAG to enhance contextual understanding and is backed by an NVIDIA Dynamo inference server running the NVIDIA Llama-3.1-Nemotron-Nano-4B-v1.1 model. The setup enables low-latency, cloud-free AI assistant capabilities, with live document search and reasoning, all deployable on local or edge infrastructure.
```
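Both entries above lean on chunking: the podcast example splits long audio, and the chat example splits documents for RAG retrieval. As an illustration of the general idea, here is a minimal sliding-window text chunker; the function name and the `chunk_size`/`overlap` parameters are illustrative, not taken from either repository.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks of at most chunk_size characters.

    The overlap keeps context that would otherwise be cut at a chunk
    boundary, which tends to help retrieval quality in RAG pipelines.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    doc = "x" * 1200
    parts = chunk_text(doc, chunk_size=500, overlap=100)
    print(len(parts), [len(p) for p in parts])  # → 3 [500, 500, 400]
```

Real pipelines usually chunk on token or sentence boundaries rather than raw characters, but the windowing logic is the same.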
Lines changed: 156 additions & 0 deletions

```gitignore
# Dependencies
node_modules/
.pnp/
.pnp.js
package-lock.json
yarn.lock

# Testing
coverage/
.nyc_output/
test-results/
junit.xml

# Production
build/
dist/
out/
.next/
.nuxt/
.cache/
.output/

# Environment files
.env
.env.*
!.env.example
.env.local
.env.development.local
.env.test.local
.env.production.local
.env*.local
*.env

# Logs
npm-debug.log*
yarn-debug.log*
yarn-error.log*
logs/
*.log
debug.log
error.log

# IDE
.idea/
.vscode/
*.swp
*.swo
*.sublime-workspace
*.sublime-project
.project
.classpath
.settings/
*.code-workspace

# OS
.DS_Store
Thumbs.db
desktop.ini
$RECYCLE.BIN/
*.lnk

# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
venv/
ENV/
.env/
.venv/
pip-log.txt
pip-delete-this-directory.txt
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
.python-version
*.egg-info/
.installed.cfg
*.egg
MANIFEST
dist/
build/
eggs/
parts/
bin/
var/
sdist/
develop-eggs/
.installed.cfg
lib/
lib64/

# RAG specific
data/
embeddings/
*.faiss
*.pkl
*.bin
*.vec
*.model
*.index
chunks/
documents/
vectors/
corpus/
indexes/

# Temporary files
*.tmp
*.temp
*.bak
*.swp
*~
*.swx
*.swo
*.swn
*.bak
*.orig
*.rej
*.patch
*.diff

# Build artifacts
*.min.js
*.min.css
*.map
*.gz
*.br
*.zip
*.tar
*.tar.gz
*.tgz
*.rar
*.7z

# Debug
.debug/
debug/
debug.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# Local development
.local/
local/
local.*
```
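The entries above are glob-style patterns. As a rough sanity check, Python's standard-library `fnmatch` implements very similar glob syntax for single path components (it does not model git's full semantics, such as directory-only `foo/` rules or negation with `!`); the helper below is illustrative only.

```python
import fnmatch

# A few patterns taken from the .gitignore above.
PATTERNS = ["*.py[cod]", "*.min.js", "npm-debug.log*"]


def ignored(name: str) -> bool:
    """Return True if a bare filename matches any ignore pattern."""
    return any(fnmatch.fnmatch(name, p) for p in PATTERNS)


if __name__ == "__main__":
    print(ignored("module.pyc"))       # matches *.py[cod]
    print(ignored("app.min.js"))       # matches *.min.js
    print(ignored("npm-debug.log.1"))  # matches npm-debug.log*
    print(ignored("app.py"))           # no pattern matches
```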
Lines changed: 135 additions & 0 deletions

# Chat with Llama-3.1-Nemotron-Nano-4B-v1.1

A React-based chat interface for interacting with an LLM, featuring RAG (Retrieval-Augmented Generation) capabilities and an NVIDIA Dynamo backend serving NVIDIA Llama-3.1-Nemotron-Nano-4B-v1.1.

## Project Structure

```
.
├── frontend/          # React frontend application
├── backend-rag/       # RAG service backend
└── backend-dynamo/    # NVIDIA Dynamo backend service
    └── llm-proxy/     # Proxy server for NVIDIA Dynamo
```

## Prerequisites

- Node.js 18 or higher
- Python 3.8 or higher
- NVIDIA GPU with CUDA support (for LLM serving with NVIDIA Dynamo)
- Docker (optional, for containerized deployment)
- Git

## Configuration

### Frontend

The frontend configuration is managed through YAML files in `frontend/public/config/`:

- `app_config.yaml`: Main application configuration:
  - API endpoints
  - UI settings
  - File upload settings

See [frontend/README.md](frontend/README.md).
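The diff does not show the contents of `app_config.yaml`; a hypothetical sketch of what such a file might contain is below. Every key, value, and port here is an assumption for illustration, not taken from the repository.

```yaml
# Hypothetical sketch only — the real app_config.yaml schema is defined
# by the frontend code, not by this commit page.
api:
  rag_endpoint: "http://localhost:8000"
  llm_proxy_endpoint: "http://localhost:8001"
ui:
  title: "Chat with Llama-3.1-Nemotron-Nano-4B"
  streaming: true
upload:
  max_file_size_mb: 10
  allowed_extensions: [".txt", ".md", ".pdf"]
```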
### Backend

Each service has its own configuration files:

- RAG backend: see [backend-rag/README.md](backend-rag/README.md)
- LLM proxy: see [backend-dynamo/llm-proxy/README.md](backend-dynamo/llm-proxy/README.md)
- Dynamo backend: see [backend-dynamo/README.md](backend-dynamo/README.md)
## Setup

### Llama-3.1-Nemotron-Nano-4B-v1.1 running on a GPU server

This step should be performed on a machine with a GPU.

Set up the NVIDIA Dynamo backend running Llama-3.1-Nemotron-Nano-4B-v1.1 by following the instructions in [backend-dynamo/README.md](backend-dynamo/README.md).

### Local client with a local RAG database

These steps can be performed locally and don't require a GPU.

1. Clone the repository:

   ```bash
   git clone <this-repository-url>
   cd react-llama-client
   ```

2. Install frontend dependencies:

   ```bash
   cd frontend
   npm install
   ```

3. Set up backend services:

   For Unix/macOS:

   ```bash
   # RAG backend
   cd backend-rag
   python -m venv venv
   source venv/bin/activate
   pip install -r requirements.txt

   # LLM proxy
   cd backend-dynamo/llm-proxy
   python -m venv venv
   source venv/bin/activate
   pip install -r requirements.txt
   ```

   For Windows:

   ```bash
   # RAG backend
   cd backend-rag
   python -m venv venv
   .\venv\Scripts\activate
   pip install -r requirements.txt

   # LLM proxy
   cd backend-dynamo\llm-proxy
   python -m venv venv
   .\venv\Scripts\activate
   pip install -r requirements.txt
   ```

4. Start the services (each in a new terminal):

   For Unix/macOS:

   ```bash
   # Start frontend (in frontend directory)
   cd frontend
   npm start

   # Start RAG backend (in backend-rag directory)
   cd backend-rag
   source venv/bin/activate
   python src/app.py

   # Start LLM proxy (in backend-dynamo/llm-proxy directory)
   cd backend-dynamo/llm-proxy
   source venv/bin/activate
   python proxy.py
   ```

   For Windows:

   ```bash
   # Start frontend (in frontend directory)
   cd frontend
   npm start

   # Start RAG backend (in backend-rag directory)
   cd backend-rag
   .\venv\Scripts\activate
   python src\app.py

   # Start LLM proxy (in backend-dynamo\llm-proxy directory)
   cd backend-dynamo\llm-proxy
   .\venv\Scripts\activate
   python proxy.py
   ```
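Once the three services are running, a quick way to confirm each one is reachable is a small health-check script. The endpoints and ports below are assumptions (the actual values come from each service's configuration), and the script only checks that something answers HTTP at all.

```python
import urllib.error
import urllib.request

# Assumed local endpoints — adjust to match your configured ports.
SERVICES = {
    "frontend": "http://localhost:3000/",
    "rag-backend": "http://localhost:8000/",
    "llm-proxy": "http://localhost:8001/",
}


def check(name: str, url: str) -> bool:
    """Return True if the service answers any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=3) as resp:
            print(f"{name}: HTTP {resp.status}")
            return True
    except urllib.error.HTTPError as exc:
        # An HTTP error status still means the server is up and answering.
        print(f"{name}: HTTP {exc.code}")
        return True
    except (urllib.error.URLError, OSError) as exc:
        print(f"{name}: unreachable ({exc})")
        return False


if __name__ == "__main__":
    for name, url in SERVICES.items():
        check(name, url)
```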
Lines changed: 43 additions & 0 deletions

```gitignore
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
venv/
ENV/
.env/
.venv/
pip-log.txt
pip-delete-this-directory.txt
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Logs
logs/
*.log

# IDE
.idea/
.vscode/
*.swp
*.swo

# Environment variables
.env
.env.local
.env.*.local

# AWS
.aws/
aws.json
credentials.json
```
