Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed.
Early-access notice
Memvid v1 is still experimental. The file format and API may change until we lock in a stable release.

Memvid v2 – what's next
- Living-Memory Engine – keep adding new data and let LLMs remember it across sessions.
- Capsule Context – shareable .mv2 capsules, each with its own rules and expiry.
- Time-Travel Debugging – rewind or branch any chat to review or test.
- Smart Recall – local cache guesses what you’ll need and loads it in under 5 ms.
- Codec Intelligence – auto-tunes AV1 now and future codecs later, so files keep shrinking.
- CLI & Dashboard – simple tools for branching, analytics, and one-command cloud publish.
Sneak peek of Memvid v2 - a living memory engine that lets you chat with your knowledge base.
Memvid compresses an entire knowledge base into MP4 files while keeping millisecond-level semantic search. Think of it as SQLite for AI memory: portable, efficient, and self-contained. By encoding text as QR codes in video frames, we deliver 50-100× smaller storage than vector databases with zero infrastructure.
| What it enables | How video codecs make it possible |
|---|---|
| 50-100× smaller storage | Modern video codecs compress repetitive visual patterns (QR codes) far better than raw embeddings |
| Sub-100ms retrieval | Direct frame seek via index → QR decode → your text. No server round-trips |
| Zero infrastructure | Just Python and MP4 files - no DB clusters, no Docker, no ops |
| True portability | Copy or stream memory.mp4 - it works anywhere video plays |
| Offline-first design | After encoding, everything runs without internet |
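The storage row above hinges on one idea: codecs thrive on repetition. As a rough stdlib illustration (zlib rather than a video codec, purely to show the principle), a highly repetitive payload shrinks by orders of magnitude while random data barely compresses at all:

```python
import os
import zlib

# Repetitive payload, loosely analogous to the regular black/white modules of a QR code
repetitive = b"\x00\xff" * 50_000   # 100 KB of a two-byte pattern
random_ish = os.urandom(100_000)    # 100 KB of incompressible noise

rep_ratio = len(repetitive) / len(zlib.compress(repetitive, 9))
rnd_ratio = len(random_ish) / len(zlib.compress(random_ish, 9))

print(f"repetitive: {rep_ratio:.0f}x smaller")  # hundreds of times smaller
print(f"random:     {rnd_ratio:.2f}x smaller")  # essentially no gain
```

Video codecs go further than zlib by also exploiting similarity *between* frames, which is why the structured QR frames compress so well.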
Text → QR → Frame
Each text chunk becomes a QR code, packed into video frames. Modern codecs excel at compressing these repetitive patterns.

Smart indexing
Embeddings map queries → frame numbers. One seek, one decode, millisecond results.

Codec leverage
30 years of video R&D means your text gets compressed better than any custom algorithm could achieve.

Future-proof
Next-gen codecs (AV1, H.266) automatically make your memories smaller and faster - no code changes needed.
```
pip install memvid

# For PDF support
pip install memvid PyPDF2
```

```python
from memvid import MemvidEncoder, MemvidChat

# Create video memory from text chunks
chunks = ["NASA founded 1958", "Apollo 11 landed 1969", "ISS launched 1998"]
encoder = MemvidEncoder()
encoder.add_chunks(chunks)
encoder.build_video("space.mp4", "space_index.json")

# Chat with your memory
chat = MemvidChat("space.mp4", "space_index.json")
response = chat.chat("When did humans land on the moon?")
print(response)  # References Apollo 11 in 1969
```
```python
from memvid import MemvidEncoder
import os

encoder = MemvidEncoder(chunk_size=512)

# Index all markdown files
for file in os.listdir("docs"):
    if file.endswith(".md"):
        with open(f"docs/{file}") as f:
            encoder.add_text(f.read(), metadata={"file": file})

encoder.build_video("docs.mp4", "docs_index.json")
```
```python
from memvid import MemvidEncoder, MemvidRetriever

# Index multiple PDFs
encoder = MemvidEncoder()
encoder.add_pdf("deep_learning.pdf")
encoder.add_pdf("machine_learning.pdf")
encoder.build_video("ml_library.mp4", "ml_index.json")

# Semantic search across all books
retriever = MemvidRetriever("ml_library.mp4", "ml_index.json")
results = retriever.search("backpropagation", top_k=5)
```
```python
from memvid import MemvidInteractive

# Launch at http://localhost:7860
interactive = MemvidInteractive("knowledge.mp4", "index.json")
interactive.run()
```
```python
# Maximum compression for huge datasets
encoder.build_video(
    "compressed.mp4",
    "index.json",
    fps=60,              # More frames/second
    frame_size=256,      # Smaller QR codes
    video_codec='h265',  # Better compression
    crf=28,              # Quality tradeoff
)
```
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-mpnet-base-v2')
encoder = MemvidEncoder(embedding_model=model)
```
```python
encoder = MemvidEncoder(n_workers=8)
encoder.add_chunks_parallel(million_chunks)
```
```
# Process documents
python examples/file_chat.py --input-dir /docs --provider openai

# Advanced codecs
python examples/file_chat.py --files doc.pdf --codec h265

# Load existing
python examples/file_chat.py --load-existing output/memory
```
- Indexing: ~10K chunks/second on modern CPUs
- Search: <100ms for 1M chunks (includes decode)
- Storage: 100MB text → 1-2MB video
- Memory: constant ~500MB RAM regardless of dataset size
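The storage figure is consistent with the 50-100× claim quoted earlier; a quick sanity check:

```python
text_mb = 100
for video_mb in (1, 2):
    ratio = text_mb / video_mb
    print(f"{text_mb} MB text -> {video_mb} MB video = {ratio:.0f}x smaller")
```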
- Delta encoding: Time-travel through knowledge versions
- Streaming ingest: Add to videos in real-time
- Cloud dashboard: Web UI with API management
- Smart codecs: Auto-select AV1/HEVC per content
- GPU boost: 100× faster bulk encoding
Memvid is redefining AI memory. Join us:
- ⭐ Star on GitHub
- 🐛 Report issues or request features
- 🔧 Submit PRs (we review quickly!)
- 💬 Discuss video-based AI memory