Mercury - Train your own custom GPT. Chat with any file, or website.
randomchristiancoder/ai-template
Train your own custom GPT
- Train on specific websites that you define
- Train on documents you upload
- Builds on dialog with Chat History
- Cites sources
- Perplexity style UI
- .docx
- .md
- .txt
- .png
- .jpg
- .html
- .json
- .csv
- .pptx
- notion
- Next.js 13 app directory
- Vercel AI SDK
- A file is uploaded, cleaned to plain text, and split into 1000-character documents.
- OpenAI's embedding API is used to generate embeddings for each document using the "text-embedding-ada-002" model.
- The embeddings are stored in a Pinecone namespace.
- Web pages are scraped using cheerio, cleaned to plain text, and split into 1000-character documents.
- OpenAI's embedding API is used to generate embeddings for each document using the "text-embedding-ada-002" model.
- The embeddings are stored in a Pinecone namespace.
- A single embedding is generated from the user prompt.
- The embedding is used to perform a similarity search against the vector database.
- The results of the similarity search are used to construct a prompt for GPT-3.
- The GPT-3 response is then streamed back to the user.
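The splitting step that starts both ingestion paths above can be sketched as a plain fixed-size splitter. This is a minimal illustration, not the template's actual code: the function name `splitIntoDocuments` and its signature are hypothetical, and the repository may well use a library splitter instead. Only the 1000-character size comes from the description above.

```typescript
// Hypothetical helper (not taken from the repository): split cleaned plain
// text into fixed-size documents. The default of 1000 characters mirrors the
// size quoted in the README.
function splitIntoDocuments(text: string, chunkSize: number = 1000): string[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new Error("chunkSize must be a positive integer");
  }
  const documents: string[] = [];
  // Walk the text in steps of chunkSize; the last document may be shorter.
  for (let start = 0; start < text.length; start += chunkSize) {
    documents.push(text.slice(start, start + chunkSize));
  }
  return documents;
}

// Usage: 2500 characters become two full documents and one remainder.
const sampleText = "a".repeat(2500);
const sampleDocs = splitIntoDocuments(sampleText);
console.log(sampleDocs.map((d) => d.length)); // [ 1000, 1000, 500 ]
```

A splitter like this cuts on UTF-16 code units and ignores sentence boundaries, so a real pipeline would probably prefer a splitter that respects whitespace or adds overlap between documents.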
To create a new project based on this template using degit:
npx degit https://github.com/Jordan-Gilliam/ai-template ai-template
cd ai-template
code .
- install dependencies
npm i
- Visit Pinecone to create a free-tier account, then complete the following steps from the dashboard.
- Create a new Pinecone index with 1536 dimensions (the output size of the "text-embedding-ada-002" model).
- Copy your API key
- Record your environment name, e.g. us-central1-gcp
- Record your index name, e.g. mercury
- Visit OpenAI to create and copy your API key. You can find this in the OpenAI web portal under API Keys.
cp .env.example .env.local
# OpenAI
OPENAI_API_KEY="sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
# Pinecone
PINECONE_API_KEY="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx"
PINECONE_ENVIRONMENT="us-central1-gcp"
PINECONE_INDEX_NAME="mercury"
npm run dev
Open http://localhost:3000 in your browser to view the app.
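If the app fails at startup, a missing value in .env.local is a likely cause. The sketch below shows one way to check the four variables up front. The helper `readConfig` and the `AppConfig` type are hypothetical and not part of the template; only the variable names come from the .env example.

```typescript
// Variable names as they appear in .env.local.
const REQUIRED_KEYS = [
  "OPENAI_API_KEY",
  "PINECONE_API_KEY",
  "PINECONE_ENVIRONMENT",
  "PINECONE_INDEX_NAME",
] as const;

type RequiredKey = (typeof REQUIRED_KEYS)[number];
type AppConfig = Record<RequiredKey, string>;

// Hypothetical helper (an assumption, not the template's code): returns the
// four values, or throws an error naming every variable that is missing or empty.
function readConfig(env: Record<string, string | undefined>): AppConfig {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const config = {} as AppConfig;
  for (const key of REQUIRED_KEYS) {
    config[key] = env[key] as string;
  }
  return config;
}

// Usage with an inline object; in the app you would likely pass process.env.
const exampleConfig = readConfig({
  OPENAI_API_KEY: "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
  PINECONE_API_KEY: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx",
  PINECONE_ENVIRONMENT: "us-central1-gcp",
  PINECONE_INDEX_NAME: "mercury",
});
console.log(exampleConfig.PINECONE_INDEX_NAME); // mercury
```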
- OpenAI API (for generating embeddings and GPT-3 responses)
- Pinecone
- Next.js API Routes (Edge runtime) for streaming
- Tailwind CSS
- Fonts with @next/font
- Icons from Lucide
- Dark mode with next-themes
- Radix UI Primitives
- Automatic import sorting with @ianvs/prettier-plugin-sort-imports
🍴 Huge thanks to @gannonh and @mayooear for their fantastic work that helped inspire this template.
- https://www.perplexity.ai/
- https://builtbyjesse.com/
- https://ui.shadcn.com/docs
- https://meodai.github.io/poline/
- https://github.com/gannonh/gpt3.5-turbo-pgvector
- https://github.com/vercel/examples/tree/main/solutions/ai-chatgpt
ChatGPT is a great tool for answering general questions, but it falls short on domain-specific questions: it often makes up answers to fill gaps in its knowledge, and it doesn't cite sources. To address this, this starter app couples embeddings with vector search, and shows how OpenAI's GPT-3 API can be used to build conversational interfaces over domain-specific knowledge.
Embeddings are vectors of floating-point numbers that represent the "relatedness" of text strings. They are very useful for tasks like ranking search results, clustering, and classification. In text embeddings, a high cosine similarity between two embedding vectors indicates that the corresponding text strings are highly related.
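As a concrete illustration of the cosine similarity mentioned above, here is a minimal sketch. The function name is illustrative, and the two-dimensional vectors stand in for real embeddings, which have 1536 dimensions with "text-embedding-ada-002".

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
// Values near 1 mean the vectors point the same way (highly related text);
// values near 0 mean they are unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("vectors must be non-empty and have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA * normB);
  // A zero vector has no direction; treat it as unrelated to everything.
  return denominator === 0 ? 0 : dot / denominator;
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```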
This app uses embeddings to generate a vector representation of a document and then uses vector search to find the most similar documents to the query. The results of the vector search are then used to construct a prompt for GPT-3, which generates a response. The response is then streamed back to the user.
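The retrieve-then-prompt flow described above can be sketched in memory. This is an illustration under stated assumptions, not the app's implementation: in the app the similarity search is performed by Pinecone, whereas here a small array stands in for the vector database. The names `StoredDocument`, `topMatches`, and `buildPrompt`, the prompt wording, and the toy two-dimensional embeddings are all hypothetical. The sketch also assumes unit-length vectors, for which the dot product equals cosine similarity.

```typescript
// Hypothetical shape of a stored document; not taken from the repository.
interface StoredDocument {
  text: string;
  source: string;
  embedding: number[];
}

// Dot product; equal to cosine similarity when both vectors have unit length.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Stand-in for the vector database query: score every document, keep the best k.
function topMatches(
  queryEmbedding: number[],
  store: StoredDocument[],
  k: number
): StoredDocument[] {
  return store
    .map((doc) => ({ doc, score: dot(queryEmbedding, doc.embedding) }))
    .sort((left, right) => right.score - left.score)
    .slice(0, k)
    .map((scored) => scored.doc);
}

// Assemble a prompt that numbers each match so the model can cite its sources.
function buildPrompt(question: string, matches: StoredDocument[]): string {
  const context = matches
    .map((doc, i) => `[${i + 1}] (${doc.source}) ${doc.text}`)
    .join("\n");
  return [
    "Answer the question using only the context below and cite sources by number.",
    "",
    "Context:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}

// Toy data: 2-D unit vectors instead of real 1536-dimensional embeddings.
const demoStore: StoredDocument[] = [
  { text: "Refunds are processed within 5 days.", source: "faq.md", embedding: [1, 0] },
  { text: "Our team went hiking last week.", source: "blog.html", embedding: [0, 1] },
  { text: "The Pro plan includes refunds.", source: "pricing.md", embedding: [0.6, 0.8] },
];
const demoMatches = topMatches([0.8, 0.6], demoStore, 2);
console.log(demoMatches.map((d) => d.source)); // [ 'pricing.md', 'faq.md' ]
console.log(buildPrompt("How do refunds work?", demoMatches));
```

The resulting string is what would be sent to the completion endpoint, and the numbered context lines are what make source citation possible.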