Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/


pvn-ptl/data_helper

 
 


Code for blog at: Democratize Data Access with RAGS

Set up

We will use LlamaIndex to build our RAG pipeline. The concepts used here apply to RAG pipelines in general.
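To make the retrieval step concrete, here is a minimal, library-free sketch of the idea behind a RAG pipeline: score stored documents against the question, keep the best matches, and place them in the prompt sent to the LLM. It deliberately does not use the LlamaIndex API. The word-overlap scoring, the sample schema notes, and the function names `retrieve` and `build_prompt` are all assumptions made for illustration; real pipelines typically rank by embedding similarity instead.

```python
import re

# Hypothetical schema notes standing in for the documents that get indexed.
DOCS = [
    "Table orders: order_id, buyer_id, seller_id, product_id, purchase_date, cost",
    "Table buyers: buyer_id, name, signup_date",
    "Table sellers: seller_id, name, region",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into words (underscores act as separators)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, docs: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k docs that share the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        docs,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, context: list[str]) -> str:
    """Combine the retrieved context and the question into one prompt string."""
    context_block = "\n".join(context)
    return (
        "Use the schema below to write a DuckDB SQL query.\n"
        f"Schema:\n{context_block}\n"
        f"Question: {question}\n"
    )


if __name__ == "__main__":
    question = "what date did each buyer make a purchase"
    print(build_prompt(question, retrieve(question, DOCS)))
```

The prompt returned by `build_prompt` is what would be sent to the LLM API; swapping the overlap score for vector similarity changes only `retrieve`.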

GitHub Repo: Data Helper

Pre-requisite

  1. Python 3.10+
  2. git
  3. OpenAI API key
  4. Poetry

Demo

We will clone the repo and set up the Poetry shell as shown below:

    git clone https://github.com/josephmachado/data_helper.git
    cd data_helper
    poetry install
    poetry shell # activate the virtual env

    # To run the code, please set your OPEN AI API key as shown below
    export OPENAI_API_KEY=your-key-here

    python run_code.py INDEX # Create an index with data from ./data folder

    python run_code.py QUERY --query "show me for each buyers what date they made their first purchase"
    # The above command uses the already existing index to make a request to LLM API to get results
    # The code will return a SQL query with DuckDB format

    python run_code.py QUERY --query "for every seller, show me a monthly report of the number of unique products that they sold, avg cost per product, max/min value of product purchased that month"
    # The code will return a SQL query with DuckDB format
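The commands above pass a mode (`INDEX` or `QUERY`) and, for queries, a `--query` string. A minimal sketch of how such an entry point could be parsed with `argparse` follows. It is an assumption about the shape of the CLI, not the repository's actual `run_code.py`, and the names `build_parser` and `parse_cli` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Create a parser with a positional mode and an optional --query string."""
    parser = argparse.ArgumentParser(description="Build an index or ask for SQL.")
    parser.add_argument(
        "mode",
        choices=["INDEX", "QUERY"],
        help="INDEX builds the index; QUERY asks a question against it",
    )
    parser.add_argument(
        "--query",
        default=None,
        help="natural-language question (needed when mode is QUERY)",
    )
    return parser


def parse_cli(argv: list[str]) -> argparse.Namespace:
    """Parse argv and reject a QUERY call that has no question attached."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.mode == "QUERY" and not args.query:
        # parser.error prints a usage message and exits with status 2.
        parser.error("--query is required when mode is QUERY")
    return args


if __name__ == "__main__":
    import sys

    print(parse_cli(sys.argv[1:]))
```

Validating the mode/query combination up front keeps a missing question from turning into a wasted LLM API call.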

Next Steps

  1. Evaluate results and tune the pipeline
  2. Add an observability system
  3. Monitor API costs
  4. Add additional documentation as input
  5. Explore other use cases such as RAGs for onboarding, DE training tool, etc
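As one possible starting point for the first step (evaluating results), the sketch below compares the rows returned by a generated query with those of a hand-written reference query. It uses the standard-library `sqlite3` module as a stand-in so that it runs without extra installs; the project itself targets DuckDB SQL, so dialect differences would need handling in practice. The `orders` table, its sample rows, and the helper names are hypothetical.

```python
import sqlite3


def make_sample_db() -> sqlite3.Connection:
    """Create a small in-memory database with hypothetical purchase data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (buyer_id INTEGER, purchase_date TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, "2024-01-05"), (1, "2024-02-01"), (2, "2024-03-10")],
    )
    return conn


def run_sql(conn: sqlite3.Connection, sql: str) -> list[tuple]:
    """Run a query and return its rows sorted, so row order does not matter."""
    return sorted(conn.execute(sql).fetchall())


def same_result(conn: sqlite3.Connection, generated_sql: str, reference_sql: str) -> bool:
    """Return True when the generated query yields the same rows as the reference."""
    expected = run_sql(conn, reference_sql)
    try:
        return run_sql(conn, generated_sql) == expected
    except sqlite3.Error:
        # A generated query that fails to run counts as a mismatch.
        return False


if __name__ == "__main__":
    conn = make_sample_db()
    reference = "SELECT buyer_id, MIN(purchase_date) FROM orders GROUP BY buyer_id"
    candidate = "SELECT buyer_id, MIN(purchase_date) AS first_purchase FROM orders GROUP BY buyer_id"
    print(same_result(conn, candidate, reference))
```

Comparing result sets rather than SQL text tolerates harmless differences such as column aliases or ordering, at the cost of needing representative sample data.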

Further reading

  1. Production RAG tips
  2. Advanced RAG tuning
  3. What is a datawarehouse
  4. Conceptual data model

References

  1. LlamaIndex docs
