Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/
pvn-ptl/data_helper
Code for the blog post: Democratize Data Access with RAGS
We will use LlamaIndex to build our RAG pipeline. The concepts used here apply to RAG pipelines in general.
GitHub Repo: Data Helper
We will clone the repo and set up the Poetry shell as shown below:
```shell
git clone https://github.com/josephmachado/data_helper.git
cd data_helper
poetry install
poetry shell # activate the virtual env

# To run the code, please set your OPEN AI API key as shown below
export OPENAI_API_KEY=your-key-here

python run_code.py INDEX # Create an index with data from ./data folder

python run_code.py QUERY --query "show me for each buyers what date they made their first purchase"
# The above command uses the already existing index to make a request to LLM API to get results
# The code will return a SQL query with DuckDB format

python run_code.py QUERY --query "for every seller, show me a monthly report of the number of unique products that they sold, avg cost per product, max/min value of product purchased that month"
# The code will return a SQL query with DuckDB format
```
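To make the INDEX and QUERY steps more concrete, here is a minimal, library-free sketch of the general RAG idea: index some documents once, then retrieve the most relevant ones for a question and place them in the prompt. It is not the code in this repo and does not use LlamaIndex; retrieval here is plain keyword overlap rather than embedding search, and the table descriptions, function names, and prompt wording are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical table descriptions standing in for the files in ./data.
DOCS = {
    "orders": "orders table: one row per purchase. Columns: order_id, "
              "buyer_id, seller_id, product_id, order_date, amount",
    "buyers": "buyers table: one row per buyer. Columns: buyer_id, name, "
              "signup_date",
    "sellers": "sellers table: one row per seller. Columns: seller_id, name, "
               "region",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs):
    """INDEX step (toy version): store a bag of words per document."""
    return {name: Counter(tokenize(body)) for name, body in docs.items()}


def retrieve(index, question, top_k=2):
    """Return up to top_k document names sharing the most words with the question."""
    question_words = set(tokenize(question))
    scores = {
        name: sum(count for word, count in bag.items() if word in question_words)
        for name, bag in index.items()
    }
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return [name for name in ranked if scores[name] > 0][:top_k]


def build_prompt(docs, names, question):
    """QUERY step (toy version): put the retrieved context in front of the question."""
    context = "\n".join(docs[name] for name in names)
    return f"Schema:\n{context}\n\nWrite a DuckDB SQL query for: {question}"


index = build_index(DOCS)
question = "show me for each buyers what date they made their first purchase"
relevant = retrieve(index, question)
prompt = build_prompt(DOCS, relevant, question)
```

In a real pipeline the prompt would then be sent to the LLM API, and a vector index would usually replace the word-overlap scoring.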
- Evaluate results and tune the pipeline
- Add observation system
- Monitor API costs
- Add additional documentation as input
- Explore other use cases such as RAGs for onboarding, DE training tool, etc
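As one possible starting point for the "Evaluate results" item, a generated query can be compared against a hand-written reference query on a small fixture dataset. The sketch below is an assumption about how that might look, not part of this repo: it uses Python's built-in `sqlite3` as a stand-in for DuckDB, and the `orders` table, its columns, and both queries are hypothetical.

```python
import sqlite3


def same_result(conn, generated_sql, reference_sql):
    """Return True if both queries yield the same rows, ignoring row order."""
    got = conn.execute(generated_sql).fetchall()
    want = conn.execute(reference_sql).fetchall()
    return sorted(got) == sorted(want)


# Small in-memory fixture (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (buyer_id INTEGER, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2024-01-05"), (1, "2024-02-01"), (2, "2024-03-10")],
)

# Hand-written reference answer for "first purchase date per buyer".
reference = "SELECT buyer_id, MIN(order_date) FROM orders GROUP BY buyer_id"

# Stand-in for a query returned by the LLM; written differently but equivalent.
generated = (
    "SELECT o.buyer_id, o.order_date FROM orders o "
    "WHERE o.order_date = "
    "(SELECT MIN(order_date) FROM orders WHERE buyer_id = o.buyer_id)"
)

# Stand-in for an incorrect generated query (last purchase instead of first).
wrong = "SELECT buyer_id, MAX(order_date) FROM orders GROUP BY buyer_id"

is_match = same_result(conn, generated, reference)
is_wrong_match = same_result(conn, wrong, reference)
```

Comparing result sets rather than SQL text lets differently written but equivalent queries pass, at the cost of depending on how well the fixture data covers edge cases.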