cocoindex-io/cocoindex-etl-with-document-ai
CocoIndex is an ETL framework that transforms data for AI, with real-time incremental processing that keeps the index up to date with low latency when the source changes. It supports custom logic that snaps together like LEGO, and makes it easy to plug in the modules that best suit your project.
In this example, we will walk you through how to build an embedding index based on local files, using Google Document AI as the parser.
🥥 🌴 We are constantly improving - more blogs and examples coming soon. Stay tuned 👀 and drop a star at CocoIndex on GitHub for the latest updates!
- Install Postgres if you don't have one.
- Configure the project and processor ID for the Document AI API:
  - Official Google Document AI API
  - Sign in to Google Cloud Console, create or open a project, and enable the Document AI API.
  - Create a processor in Document AI.
  - Update '.env' with GOOGLE_CLOUD_PROJECT_ID and GOOGLE_CLOUD_PROCESSOR_ID.
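To catch a missing setting before running the pipeline, the two values can be checked up front. The following is a minimal sketch, not code from this repository: the helper name `processor_resource_name` and the default location `"us"` are assumptions for illustration, and it assumes the values from '.env' have already been loaded into the process environment. The `projects/.../locations/.../processors/...` form is the resource name format that Document AI uses to address a processor.

```python
import os
from typing import Mapping, Optional


def processor_resource_name(
    env: Optional[Mapping[str, str]] = None,
    location: str = "us",  # assumption: the processor region is not stated in this README
) -> str:
    """Build the Document AI processor resource name from the two settings.

    Hypothetical helper for illustration; raises if either setting is missing.
    """
    env = os.environ if env is None else env
    required = ("GOOGLE_CLOUD_PROJECT_ID", "GOOGLE_CLOUD_PROCESSOR_ID")
    # Treat unset and empty values the same way.
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    return (
        f"projects/{env['GOOGLE_CLOUD_PROJECT_ID']}"
        f"/locations/{location}"
        f"/processors/{env['GOOGLE_CLOUD_PROCESSOR_ID']}"
    )
```

Calling `processor_resource_name()` with no arguments reads from `os.environ`; passing a plain dictionary instead makes the check easy to try without touching the real environment.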
Install dependencies:
pip install -e .
Setup:
cocoindex setup main.py
Update index:
cocoindex update main.py
Run:
cocoindex server -ci main.py
CocoInsight is in Early Access now (Free) 😊 You found us! A quick 3-minute video tutorial about CocoInsight: Watch on YouTube.
Run CocoInsight to understand your RAG data pipeline:
python main.py cocoindex server -c https://cocoindex.io
Then open the CocoInsight UI at https://cocoindex.io/cocoinsight.