Open
Description
🎯 Problem
Indexing currently reads only .md and .rst files, for simplicity. This discards valuable information in .pdf, .csv, and other file types.
💡 Proposed Solution
To keep a single internal interface, convert every file to .md with docling-project/docling before indexing, retaining its metadata. This makes it straightforward to support further file types later, including code files.
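
A minimal sketch of what this conversion step might look like, not a final design. `IndexableDoc`, `load_for_indexing`, `PASSTHROUGH_SUFFIXES`, and the metadata keys are hypothetical names chosen for illustration. The sketch also assumes that .md and .rst files keep being read as-is, which is one possible reading of the proposal. The only docling calls used are `DocumentConverter().convert(...)` and `result.document.export_to_markdown()`.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable

# Assumption: formats the indexer already handles are passed through unchanged.
PASSTHROUGH_SUFFIXES = {".md", ".rst"}


@dataclass
class IndexableDoc:
    """Hypothetical unified record handed to the indexer."""
    markdown: str
    metadata: dict = field(default_factory=dict)


def docling_to_markdown(path: Path) -> str:
    """Convert a file to Markdown with docling.

    docling is imported lazily so the rest of the module works without it.
    """
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(str(path))
    return result.document.export_to_markdown()


def load_for_indexing(
    path,
    convert: Callable[[Path], str] = docling_to_markdown,
) -> IndexableDoc:
    """Return Markdown text plus source metadata for any supported file."""
    path = Path(path)
    suffix = path.suffix.lower()

    if suffix in PASSTHROUGH_SUFFIXES:
        text = path.read_text(encoding="utf-8")
    else:
        text = convert(path)

    # Metadata is kept next to the text so it survives the conversion.
    metadata = {
        "source_path": str(path),
        "source_type": suffix.lstrip("."),
        "size_bytes": path.stat().st_size,
    }
    return IndexableDoc(markdown=text, metadata=metadata)
```

The `convert` parameter is injectable so the indexer can be tested without docling installed, and so a different converter can be swapped in later without touching the indexing code.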
🤔 Alternatives Considered
Considered lower-level approaches such as pdfplumber, but judged them too complex given that integrated solutions such as docling already exist.