Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed.
Early-access notice
Memvid v1 is still experimental. The file format and API may change until we lock in a stable release.

Memvid v2 – what's next
- Living-Memory Engine – keep adding new data and let LLMs remember it across sessions.
- Capsule Context – shareable .mv2 capsules, each with its own rules and expiry.
- Time-Travel Debugging – rewind or branch any chat to review or test.
- Smart Recall – local cache guesses what you’ll need and loads it in under 5 ms.
- Codec Intelligence – auto-tunes AV1 now and future codecs later, so files keep shrinking.
- CLI & Dashboard – simple tools for branching, analytics, and one-command cloud publish.
Sneak peek of Memvid v2 - a living memory engine that lets you chat with your knowledge base.
Memvid compresses an entire knowledge base into MP4 files while keeping millisecond-level semantic search. Think of it as SQLite for AI memory: portable, efficient, and self-contained. By encoding text as QR codes in video frames, we deliver 50-100× smaller storage than vector databases with zero infrastructure.
| What it enables | How video codecs make it possible |
|---|---|
| 50-100× smaller storage | Modern video codecs compress repetitive visual patterns (QR codes) far better than raw embeddings |
| Sub-100ms retrieval | Direct frame seek via index → QR decode → your text. No server round-trips |
| Zero infrastructure | Just Python and MP4 files - no DB clusters, no Docker, no ops |
| True portability | Copy or stream memory.mp4 - it works anywhere video plays |
| Offline-first design | After encoding, everything runs without internet |
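The storage row above hinges on one idea: codecs thrive on repetition. As a rough stdlib illustration (zlib rather than a video codec, purely to show the principle), a highly repetitive payload shrinks by orders of magnitude while random data barely compresses at all:

```python
import os
import zlib

# Repetitive payload, loosely analogous to the regular black/white modules of a QR code
repetitive = b"\x00\xff" * 50_000   # 100 KB of a two-byte pattern
random_ish = os.urandom(100_000)    # 100 KB of incompressible noise

rep_ratio = len(repetitive) / len(zlib.compress(repetitive, 9))
rnd_ratio = len(random_ish) / len(zlib.compress(random_ish, 9))

print(f"repetitive: {rep_ratio:.0f}x smaller")  # hundreds of times smaller
print(f"random:     {rnd_ratio:.2f}x smaller")  # essentially no gain
```

Video codecs go further than zlib by also exploiting similarity *between* frames, which is why the structured QR frames compress so well.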
Text → QR → Frame
Each text chunk becomes a QR code, packed into video frames. Modern codecs excel at compressing these repetitive patterns.

Smart indexing
Embeddings map queries → frame numbers. One seek, one decode, millisecond results.

Codec leverage
30 years of video R&D means your text gets compressed better than any custom algorithm could achieve.

Future-proof
Next-gen codecs (AV1, H.266) automatically make your memories smaller and faster - no code changes needed.
```
pip install memvid

# For PDF support
pip install memvid PyPDF2
```

```python
from memvid import MemvidEncoder, MemvidChat

# Create video memory from text chunks
chunks = ["NASA founded 1958", "Apollo 11 landed 1969", "ISS launched 1998"]
encoder = MemvidEncoder()
encoder.add_chunks(chunks)
encoder.build_video("space.mp4", "space_index.json")

# Chat with your memory
chat = MemvidChat("space.mp4", "space_index.json")
response = chat.chat("When did humans land on the moon?")
print(response)  # References Apollo 11 in 1969
```
```python
from memvid import MemvidEncoder
import os

encoder = MemvidEncoder(chunk_size=512)

# Index all markdown files
for file in os.listdir("docs"):
    if file.endswith(".md"):
        with open(f"docs/{file}") as f:
            encoder.add_text(f.read(), metadata={"file": file})

encoder.build_video("docs.mp4", "docs_index.json")
```
```python
from memvid import MemvidEncoder, MemvidRetriever

# Index multiple PDFs
encoder = MemvidEncoder()
encoder.add_pdf("deep_learning.pdf")
encoder.add_pdf("machine_learning.pdf")
encoder.build_video("ml_library.mp4", "ml_index.json")

# Semantic search across all books
retriever = MemvidRetriever("ml_library.mp4", "ml_index.json")
results = retriever.search("backpropagation", top_k=5)
```
```python
from memvid import MemvidInteractive

# Launch at http://localhost:7860
interactive = MemvidInteractive("knowledge.mp4", "index.json")
interactive.run()
```
```python
# Maximum compression for huge datasets
encoder.build_video(
    "compressed.mp4",
    "index.json",
    fps=60,              # More frames/second
    frame_size=256,      # Smaller QR codes
    video_codec='h265',  # Better compression
    crf=28,              # Quality tradeoff
)
```
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-mpnet-base-v2')
encoder = MemvidEncoder(embedding_model=model)
```
```python
encoder = MemvidEncoder(n_workers=8)
encoder.add_chunks_parallel(million_chunks)
```
```
# Process documents
python examples/file_chat.py --input-dir /docs --provider openai

# Advanced codecs
python examples/file_chat.py --files doc.pdf --codec h265

# Load existing
python examples/file_chat.py --load-existing output/memory
```
- Indexing: ~10K chunks/second on modern CPUs
- Search: <100ms for 1M chunks (includes decode)
- Storage: 100MB text → 1-2MB video
- Memory: constant ~500MB RAM regardless of dataset size
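The storage figure is consistent with the 50-100× claim quoted earlier; a quick sanity check:

```python
text_mb = 100
for video_mb in (1, 2):
    ratio = text_mb / video_mb
    print(f"{text_mb} MB text -> {video_mb} MB video = {ratio:.0f}x smaller")
```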
- Delta encoding: Time-travel through knowledge versions
- Streaming ingest: Add to videos in real-time
- Cloud dashboard: Web UI with API management
- Smart codecs: Auto-select AV1/HEVC per content
- GPU boost: 100× faster bulk encoding
Memvid is redefining AI memory. Join us:
- ⭐ Star on GitHub
- 🐛 Report issues or request features
- 🔧 Submit PRs (we review quickly!)
- 💬 Discuss video-based AI memory