Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed.
Olow304/memvid

Early-access notice
Memvid v1 is still experimental. The file format and API may change until we lock in a stable release.

Memvid v2 – what's next

  • Living-Memory Engine – keep adding new data and let LLMs remember it across sessions.
  • Capsule Context – shareable .mv2 capsules, each with its own rules and expiry.
  • Time-Travel Debugging – rewind or branch any chat to review or test.
  • Smart Recall – local cache guesses what you’ll need and loads it in under 5 ms.
  • Codec Intelligence – auto-tunes AV1 now and future codecs later, so files keep shrinking.
  • CLI & Dashboard – simple tools for branching, analytics, and one-command cloud publish.

Sneak peek of Memvid v2, a living memory engine you can use to chat with your knowledge base.


Memvid v1

PyPI · License: MIT · GitHub Stars · Python 3.8+ · Code style: black

Memvid - Turn millions of text chunks into a single, searchable video file

Memvid compresses an entire knowledge base into MP4 files while keeping millisecond-level semantic search. Think of it as SQLite for AI memory: portable, efficient, and self-contained. By encoding text as QR codes in video frames, we deliver 50-100× smaller storage than vector databases with zero infrastructure.


Why Video Compression Changes Everything 🚀

| What it enables | How video codecs make it possible |
| --- | --- |
| 50-100× smaller storage | Modern video codecs compress repetitive visual patterns (QR codes) far better than raw embeddings |
| Sub-100ms retrieval | Direct frame seek via index → QR decode → your text. No server round-trips |
| Zero infrastructure | Just Python and MP4 files: no DB clusters, no Docker, no ops |
| True portability | Copy or stream memory.mp4; it works anywhere video plays |
| Offline-first design | After encoding, everything runs without internet |

Under the Hood - Memvid v1 🔍

  1. Text → QR → Frame
    Each text chunk becomes a QR code, packed into video frames. Modern codecs excel at compressing these repetitive patterns.

  2. Smart indexing
    Embeddings map queries → frame numbers. One seek, one decode, millisecond results.

  3. Codec leverage
    30 years of video R&D means your text gets compressed better than any custom algorithm could achieve.

  4. Future-proof
    Next-gen codecs (AV1, H.266) automatically make your memories smaller and faster, with no code changes needed.
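Step 2 ("smart indexing") boils down to mapping a query embedding to the frame number with the most similar stored embedding. A minimal sketch with stdlib-only cosine similarity, assuming embeddings are plain float vectors (the toy two-dimensional vectors here are hypothetical; Memvid's real index uses sentence-transformer embeddings stored alongside the video):

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def nearest_frame(query_vec, index):
    """index maps frame_number -> embedding; return the best-matching frame."""
    return max(index, key=lambda f: cosine(query_vec, index[f]))


# Three frames, each tagged with a (toy) embedding of its text chunk
index = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.7, 0.7]}
print(nearest_frame([0.9, 0.1], index))  # → 0
```

One similarity scan over the index yields one frame number, which is why a lookup costs a single seek and decode.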


Installation

pip install memvid

# For PDF support
pip install memvid PyPDF2

Quick Start

from memvid import MemvidEncoder, MemvidChat

# Create video memory from text chunks
chunks = ["NASA founded 1958", "Apollo 11 landed 1969", "ISS launched 1998"]
encoder = MemvidEncoder()
encoder.add_chunks(chunks)
encoder.build_video("space.mp4", "space_index.json")

# Chat with your memory
chat = MemvidChat("space.mp4", "space_index.json")
response = chat.chat("When did humans land on the moon?")
print(response)  # References Apollo 11 in 1969

Real-World Examples

Documentation Assistant

from memvid import MemvidEncoder
import os

encoder = MemvidEncoder(chunk_size=512)

# Index all markdown files
for file in os.listdir("docs"):
    if file.endswith(".md"):
        with open(f"docs/{file}") as f:
            encoder.add_text(f.read(), metadata={"file": file})

encoder.build_video("docs.mp4", "docs_index.json")

PDF Library Search

# Index multiple PDFs
encoder = MemvidEncoder()
encoder.add_pdf("deep_learning.pdf")
encoder.add_pdf("machine_learning.pdf")
encoder.build_video("ml_library.mp4", "ml_index.json")

# Semantic search across all books
from memvid import MemvidRetriever

retriever = MemvidRetriever("ml_library.mp4", "ml_index.json")
results = retriever.search("backpropagation", top_k=5)

Interactive Web UI

from memvid import MemvidInteractive

# Launch at http://localhost:7860
interactive = MemvidInteractive("knowledge.mp4", "index.json")
interactive.run()

Advanced Features

Scale Optimization

# Maximum compression for huge datasets
encoder.build_video(
    "compressed.mp4",
    "index.json",
    fps=60,              # More frames/second
    frame_size=256,      # Smaller QR codes
    video_codec='h265',  # Better compression
    crf=28,              # Quality tradeoff
)

Custom Embeddings

from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-mpnet-base-v2')
encoder = MemvidEncoder(embedding_model=model)

Parallel Processing

encoder = MemvidEncoder(n_workers=8)
encoder.add_chunks_parallel(million_chunks)

CLI Usage

# Process documents
python examples/file_chat.py --input-dir /docs --provider openai

# Advanced codecs
python examples/file_chat.py --files doc.pdf --codec h265

# Load existing
python examples/file_chat.py --load-existing output/memory

Performance

  • Indexing: ~10K chunks/second on modern CPUs
  • Search: <100ms for 1M chunks (includes decode)
  • Storage: 100MB text → 1-2MB video
  • Memory: Constant 500MB RAM regardless of size
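The storage figure above is where the headline compression ratio comes from; a quick back-of-envelope check:

```python
# 100 MB of text stored as 1-2 MB of video (figures from the list above)
text_mb = 100
best_case = text_mb / 1   # smallest video output
worst_case = text_mb / 2  # largest video output
print(f"{worst_case:.0f}x to {best_case:.0f}x")  # → 50x to 100x
```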

What's Coming in v2

  • Delta encoding: Time-travel through knowledge versions
  • Streaming ingest: Add to videos in real-time
  • Cloud dashboard: Web UI with API management
  • Smart codecs: Auto-select AV1/HEVC per content
  • GPU boost: 100× faster bulk encoding

Get Involved

Memvid is redefining AI memory. Join us:

  • ⭐ Star on GitHub
  • 🐛 Report issues or request features
  • 🔧 Submit PRs (we review quickly!)
  • 💬 Discuss video-based AI memory
