# get-convex/rag

Document search component to aid RAG.
A component for semantic search, usually used to look up context for LLMs. Use with an Agent for Retrieval-Augmented Generation (RAG).
- Add Content: Add or replace content with text chunks and embeddings.
- Semantic Search: Vector-based search using configurable embedding models.
- Namespaces: Organize content into namespaces for per-user search.
- Custom Filtering: Filter content with custom indexed fields.
- Importance Weighting: Weight content by providing a 0 to 1 "importance".
- Chunk Context: Get surrounding chunks for better context.
- Graceful Migrations: Migrate content or whole namespaces without disruption.
Found a bug? Feature request? File it here.
## Installation

Create a `convex.config.ts` file in your app's `convex/` folder and install the component by calling `use`:
```ts
// convex/convex.config.ts
import { defineApp } from "convex/server";
import rag from "@convex-dev/rag/convex.config.js";

const app = defineApp();
app.use(rag);
export default app;
```
Run `npx convex codegen` if `npx convex dev` isn't already running.
```ts
// convex/example.ts
import { components } from "./_generated/api";
import { RAG } from "@convex-dev/rag";
// Any AI SDK model that supports embeddings will work.
import { openai } from "@ai-sdk/openai";

const rag = new RAG(components.rag, {
  textEmbeddingModel: openai.embedding("text-embedding-3-small"),
  embeddingDimension: 1536, // Needs to match your embedding model
});
```
## Add content

Add content with text chunks. Each call to `add` will create a new entry. It will embed the chunks automatically if you don't provide them.
```ts
export const add = action({
  args: { text: v.string() },
  handler: async (ctx, { text }) => {
    // Add the text to a namespace shared by all users.
    await rag.add(ctx, { namespace: "all-users", text });
  },
});
```
See below for how to chunk the text yourself or add content asynchronously, e.g. to handle large files.
## Semantic search

Search across content with vector similarity:
- `text` is a string with the full content of the results, for convenience. It is in order of the entries, with titles at each entry boundary, and separators between non-sequential chunks. See below for more details.
- `results` is an array of matching chunks with scores and more metadata.
- `entries` is an array of the entries that matched the query. Each result has an `entryId` referencing one of these source entries.
- `usage` contains embedding token usage information. It will be `{ tokens: 0 }` if no embedding was performed (e.g. when passing pre-computed embeddings).
```ts
export const search = action({
  args: { query: v.string() },
  handler: async (ctx, args) => {
    const { results, text, entries, usage } = await rag.search(ctx, {
      namespace: "global",
      query: args.query,
      limit: 10,
      vectorScoreThreshold: 0.5, // Only return results with a score >= 0.5
    });
    return { results, text, entries, usage };
  },
});
```
Once you have searched for the context, you can use it with an LLM.
Generally you'll already be using something to make LLM requests, e.g. the Agent Component, which tracks the message history for you. See the Agent Component docs for more details on doing RAG with the Agent Component.
However, if you just want a one-off response, you can use the `generateText` function as a convenience.
This will automatically search for relevant entries and use them as context forthe LLM, using default formatting.
The arguments to `generateText` are compatible with all arguments to `generateText` from the AI SDK.
```ts
export const askQuestion = action({
  args: { prompt: v.string() },
  handler: async (ctx, args) => {
    const userId = await getAuthUserId(ctx);
    const { text, context } = await rag.generateText(ctx, {
      search: { namespace: userId, limit: 10 },
      prompt: args.prompt,
      model: openai.chat("gpt-4o-mini"),
    });
    return { answer: text, context };
  },
});
```
Note: You can specify any of the search options available on `rag.search`.
You can provide filters when adding content and use them to search. To do this, you'll need to give the RAG component a list of the filter names. You can optionally provide a type parameter for type safety (no runtime validation).
Note: these filters can be OR'd together when searching. To get an AND, you provide a filter with a more complex value, such as `categoryAndType` below.
```ts
// convex/example.ts
import { components } from "./_generated/api";
import { RAG } from "@convex-dev/rag";
// Any AI SDK model that supports embeddings will work.
import { openai } from "@ai-sdk/openai";

// Optional: Add type safety to your filters.
type FilterTypes = {
  category: string;
  contentType: string;
  categoryAndType: { category: string; contentType: string };
};

const rag = new RAG<FilterTypes>(components.rag, {
  textEmbeddingModel: openai.embedding("text-embedding-3-small"),
  embeddingDimension: 1536, // Needs to match your embedding model
  filterNames: ["category", "contentType", "categoryAndType"],
});
```
Adding content with filters:
```ts
await rag.add(ctx, {
  namespace: "global",
  text,
  filterValues: [
    { name: "category", value: "news" },
    { name: "contentType", value: "article" },
    {
      name: "categoryAndType",
      value: { category: "news", contentType: "article" },
    },
  ],
});
```
Search with metadata filters:
```ts
export const searchForNewsOrSports = action({
  args: { query: v.string() },
  handler: async (ctx, args) => {
    const userId = await getUserId(ctx);
    if (!userId) throw new Error("Unauthorized");
    const results = await rag.search(ctx, {
      namespace: userId,
      query: args.query,
      filters: [
        { name: "category", value: "news" },
        { name: "category", value: "sports" },
      ],
      limit: 10,
    });
    return results;
  },
});
```
Instead of getting just the single matching chunk, you can request surrounding chunks so there's more context to the result.
Note: If there are results that have overlapping ranges, it will not return duplicate chunks, but instead give priority to adding the "before" context to each chunk. For example, if you requested 2 before and 1 after, and your results were for the same entryId at indexes 1, 4, and 7, the results would be:
```ts
[
  // Only one "before" chunk available; leaves chunk2 for the next result.
  { order: 1, content: [chunk0, chunk1], startOrder: 0, ... },
  // Two "before" chunks available, but leaves chunk5 for the next result.
  { order: 4, content: [chunk2, chunk3, chunk4], startOrder: 2, ... },
  // Two "before" chunks available, and includes one "after" chunk.
  { order: 7, content: [chunk5, chunk6, chunk7, chunk8], startOrder: 5, ... },
]
```
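The allocation above can be modeled with a small pure function. This is an illustrative sketch of the behavior described here, not the component's actual implementation, and `allocateRanges` is a hypothetical name:

```ts
// Sketch: allocate [startOrder, endOrder) ranges for matches within one entry,
// giving a later result's "before" context priority over the previous
// result's "after" context when the ranges would overlap.
export function allocateRanges(
  orders: number[], // ascending chunk orders of the matches in one entry
  before: number,
  after: number,
  totalChunks: number,
): { order: number; startOrder: number; endOrder: number }[] {
  // Each start claims up to `before` preceding chunks, but never reaches
  // back past the previous match itself.
  const starts = orders.map((order, i) =>
    Math.max(order - before, i > 0 ? orders[i - 1] + 1 : 0),
  );
  return orders.map((order, i) => ({
    order,
    startOrder: starts[i],
    // End is exclusive, clipped by the next result's start (before wins).
    endOrder: Math.min(
      order + after + 1,
      i + 1 < orders.length ? starts[i + 1] : totalChunks,
    ),
  }));
}
```

Running it on the example above (matches at orders 1, 4, 7 with `before: 2, after: 1` over 9 chunks) reproduces the ranges shown.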
```ts
export const searchWithContext = action({
  args: { query: v.string(), userId: v.string() },
  handler: async (ctx, args) => {
    const { results, text, entries, usage } = await rag.search(ctx, {
      namespace: args.userId,
      query: args.query,
      chunkContext: { before: 2, after: 1 }, // Include 2 chunks before, 1 after
      limit: 5,
    });
    return { results, text, entries, usage };
  },
});
```
Formatting the results for use in a prompt depends a bit on the use case. By default, the results will be sorted by score, not necessarily in the order they appear in the original text. You may want to sort them by the order they appear in the original text so they follow the flow of the original document.
For convenience, the `text` field of the search results is a string formatted with `...` separating non-sequential chunks, `---` separating entries, and `# Title:` at each entry boundary (if titles are available).
```ts
const { text } = await rag.search(ctx, { ... });
console.log(text);
```
```
## Title 1:
Chunk 1 contents
Chunk 2 contents
...
Chunk 8 contents
Chunk 9 contents
---
## Title 2:
Chunk 4 contents
Chunk 5 contents
```
There is also a `text` field on each entry that is the full text of the entry, similarly formatted with `...` separating non-sequential chunks, if you want to format each entry differently.
For a fully custom format, you can use the `results` field and `entries` directly:
```ts
const { results, text, entries } = await rag.search(ctx, {
  namespace: args.userId,
  query: args.query,
  chunkContext: { before: 2, after: 1 }, // Include 2 chunks before, 1 after
  limit: 5,
  vectorScoreThreshold: 0.5, // Only return results with a score >= 0.5
});

// Get results in the order of the entries (highest score first)
const contexts = entries
  .map((e) => {
    const ranges = results
      .filter((r) => r.entryId === e.entryId)
      .sort((a, b) => a.startOrder - b.startOrder);
    let text = (e.title ?? "") + ":\n\n";
    let previousEnd = 0;
    for (const range of ranges) {
      if (range.startOrder !== previousEnd) {
        text += "\n...\n";
      }
      text += range.content.map((c) => c.text).join("\n");
      previousEnd = range.startOrder + range.content.length;
    }
    return {
      ...e,
      entryId: e.entryId as EntryId,
      filterValues: e.filterValues as EntryFilterValues<FilterTypes>[],
      text,
    };
  })
  .map((e) => (e.title ? `#${e.title}:\n${e.text}` : e.text));

await generateText({
  model: openai.chat("gpt-4o-mini"),
  prompt:
    "Use the following context:\n\n" +
    contexts.join("\n---\n") +
    "\n\n---\n\n Based on the context, answer the question:\n\n" +
    args.query,
});
```
When you add content to a namespace, you can provide a `key` to uniquely identify the content. If you add content with the same key, it will create a new entry that replaces the old one.
```ts
await rag.add(ctx, { namespace: userId, key: "my-file.txt", text });
```
When a new document is added, it will start with a status of "pending" while it chunks, embeds, and inserts the data into the database. Once all data is inserted, it will iterate over the chunks and swap the old content embeddings with the new ones, then update the status to "ready", marking the previous version as "replaced".
The old content is kept around by default, so in-flight searches will still get results from the old vectors. Searches that happen while a document is being replaced will see the old content's results until the new version is ready. This is useful if you want to add content to a namespace and then immediately search for it, or immediately add more content to the same namespace. See below for more details on deleting replaced content.
By default, the component uses the `defaultChunker` to split the content into chunks. You can pass in your own content chunks to the `add` or `addAsync` functions.
```ts
const chunks = await textSplitter.split(content);
await rag.add(ctx, { namespace: "global", chunks });
```
Note: The `textSplitter` here could be LangChain, Mastra, or something custom. The simplest version makes an array of strings like `content.split("\n")`.
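As a concrete example, a minimal custom splitter is just a pure function over the text. The `splitParagraphs` name is hypothetical; any function that returns an array of strings works:

```ts
// A minimal custom splitter: break on blank lines and drop empty chunks.
export function splitParagraphs(content: string): string[] {
  return content
    .split(/\n{2,}/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
}
```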
Note: you can pass in an async iterator instead of an array to handle large content. Or use the `addAsync` function (see below).
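For the async-iterator route, here is a sketch using an async generator so a large document never has to be fully materialized in memory. `chunkStream` and `pages` are hypothetical names, and this assumes `rag.add` accepts an async iterable of chunk strings, as stated above:

```ts
// Stream chunks from a paged text source. Each yielded string becomes
// one chunk; empty paragraphs are skipped.
export async function* chunkStream(pages: AsyncIterable<string>) {
  for await (const page of pages) {
    for (const para of page.split("\n\n")) {
      if (para.trim().length > 0) yield para;
    }
  }
}

// Usage inside an action (sketch):
// await rag.add(ctx, { namespace: "global", chunks: chunkStream(pages) });
```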
In addition to the text, you can provide your own embeddings for each chunk.
This can be beneficial if you want to embed something other than the chunk contents, e.g. a summary of each chunk.
```ts
const chunks = await textSplitter.split(content);
const chunksWithEmbeddings = await Promise.all(
  chunks.map(async (chunk) => {
    return {
      ...chunk,
      embedding: await embedSummary(chunk),
    };
  }),
);
await rag.add(ctx, { namespace: "global", chunks: chunksWithEmbeddings });
```
For large files, you can upload them to file storage, then provide a chunker action to split them into chunks.
In `convex/http.ts`:
```ts
import { corsRouter } from "convex-helpers/server/cors";
import { httpRouter } from "convex/server";
import { internal } from "./_generated/api.js";
import { DataModel } from "./_generated/dataModel.js";
import { httpAction } from "./_generated/server.js";
import { rag } from "./example.js";

const cors = corsRouter(httpRouter());

cors.route({
  path: "/upload",
  method: "POST",
  handler: httpAction(async (ctx, request) => {
    const storageId = await ctx.storage.store(await request.blob());
    await rag.addAsync(ctx, {
      namespace: "all-files",
      chunkerAction: internal.http.chunkerAction,
      onComplete: internal.foo.docComplete, // See next section
      metadata: { storageId },
    });
    return new Response();
  }),
});

export const chunkerAction = rag.defineChunkerAction(async (ctx, args) => {
  const storageId = args.entry.metadata!.storageId;
  const file = await ctx.storage.get(storageId);
  const text = new TextDecoder().decode(await file!.arrayBuffer());
  return { chunks: text.split("\n\n") };
});

export default cors.http;
```
You can upload files directly to a Convex action, httpAction, or upload URL. See the docs for details.
You can register an `onComplete` handler when adding content. It will be called when the entry has been created and is ready, or if there was an error, or if it was replaced before it finished.
```ts
// in an action
await rag.add(ctx, { namespace, text, onComplete: internal.foo.docComplete });

// in convex/foo.ts
export const docComplete = rag.defineOnComplete<DataModel>(
  async (ctx, { replacedEntry, entry, namespace, error }) => {
    if (error) {
      await rag.delete(ctx, { entryId: entry.entryId });
      return;
    }
    if (replacedEntry) {
      await rag.delete(ctx, { entryId: replacedEntry.entryId });
    }
    // You can associate the entry with your own data here. This will commit
    // in the same transaction as the entry becoming ready.
  },
);
```
Note: The `onComplete` callback is only triggered when new content is processed. If you add content that already exists (the `contentHash` did not change for the same `key`), `onComplete` will not be called. To handle this case, you can check the return value of `rag.add()`:
```ts
const { status, created } = await rag.add(ctx, {
  namespace,
  text,
  key: "my-key",
  contentHash: "...",
  onComplete: internal.foo.docComplete,
});
if (status === "ready" && !created) {
  // Entry already existed - onComplete will not be called.
  // Handle this case if needed.
}
```
Here's a simple example fetching content from a URL to add.
It also adds filters to the entry, so you can search for it later by category, contentType, or both.
```ts
export const add = action({
  args: { url: v.string(), category: v.string() },
  handler: async (ctx, { url, category }) => {
    const response = await fetch(url);
    const content = await response.text();
    const contentType = response.headers.get("content-type");
    const { entryId } = await rag.add(ctx, {
      namespace: "global", // namespace can be any string
      key: url,
      chunks: content.split("\n\n"),
      filterValues: [
        { name: "category", value: category },
        { name: "contentType", value: contentType },
        // To get an AND filter, use a filter with a more complex value.
        { name: "categoryAndType", value: { category, contentType } },
      ],
    });
    return { entryId };
  },
});
```
You can delete the old content by calling `rag.delete` with the `entryId` of the old version.
Generally you'd do this:

- When `rag.add` with a key returns a `replacedEntry`.
- When your `onComplete` handler provides a non-null `replacedEntry` argument.
- Periodically, by querying for replaced entries:
```ts
// in convex/crons.ts
import { cronJobs } from "convex/server";
import { internal } from "./_generated/api.js";
import { internalMutation } from "./_generated/server.js";
import { v } from "convex/values";
import { rag } from "./example.js";
import { assert } from "convex-helpers";

const WEEK = 7 * 24 * 60 * 60 * 1000;

export const deleteOldContent = internalMutation({
  args: { cursor: v.optional(v.string()) },
  handler: async (ctx, args) => {
    const toDelete = await rag.list(ctx, {
      status: "replaced",
      paginationOpts: { cursor: args.cursor ?? null, numItems: 100 },
    });
    for (const entry of toDelete.page) {
      assert(entry.status === "replaced");
      if (entry.replacedAt >= Date.now() - WEEK) {
        return; // we're done when we catch up to a week ago
      }
      await rag.delete(ctx, { entryId: entry.entryId });
    }
    if (!toDelete.isDone) {
      await ctx.scheduler.runAfter(0, internal.example.deleteOldContent, {
        cursor: toDelete.continueCursor,
      });
    }
  },
});

// See example/convex/crons.ts for a complete example.
const crons = cronJobs();
crons.interval(
  "deleteOldContent",
  { hours: 1 },
  internal.crons.deleteOldContent,
  {},
);
export default crons;
```
You can use the provided types to validate and store data: `import { ... } from "@convex-dev/rag";`
Types for the various elements:

- `Entry`, `EntryFilter`, `SearchEntry`, `SearchResult`: a `SearchEntry` is an `Entry` with a `text` field including the combined search results for that entry, whereas a `SearchResult` is a specific chunk result, along with surrounding chunks.
- `EntryId`, `NamespaceId`: while these are strings under the hood, they are given more specific types to make it easier to use them correctly.

Validators can be used in `args` and schema table definitions: `vEntry`, `vEntryId`, `vNamespaceId`, `vSearchEntry`, `vSearchResult`, e.g. `defineTable({ myDocTitle: v.string(), entryId: vEntryId })`.

The validators for the branded IDs will only validate that they are strings, but will have the more specific types, to provide type safety.
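The branding technique itself is a general TypeScript pattern. Here is an illustrative sketch of the idea, not the component's actual type definitions:

```ts
// Branded IDs: plain strings at runtime, but the compiler refuses to mix
// the two ID kinds or pass an unbranded string where an ID is expected.
type EntryId = string & { readonly __brand: "EntryId" };
type NamespaceId = string & { readonly __brand: "NamespaceId" };

// A cast helper, mirroring how a validator would attach the brand.
const asEntryId = (s: string) => s as EntryId;

function deleteEntry(id: EntryId): string {
  return `deleting ${id}`; // still just a string at runtime
}

const id = asEntryId("abc123");
deleteEntry(id); // ok
// deleteEntry("abc123"); // type error: a plain string is not an EntryId
```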
In addition to the functions on the `rag` instance, there are other utilities provided:
### `defaultChunker`

This is the default chunker used by the `add` and `addAsync` functions.
It is customizable, but by default:
- It tries to break up the text into paragraphs between 100-1k characters.
- It will combine paragraphs to meet the minimum character count (100).
- It will break up paragraphs into separate lines to keep it under 1k.
- It will not split up a single line unless it's longer than 10k characters.
```ts
import { defaultChunker } from "@convex-dev/rag";

const chunks = defaultChunker(text, {
  // these are the defaults
  minLines: 1,
  minCharsSoftLimit: 100,
  maxCharsSoftLimit: 1000,
  maxCharsHardLimit: 10000,
  delimiter: "\n\n",
});
```
### `hybridRank`

This is an implementation of "Reciprocal Rank Fusion" for ranking search results based on multiple scoring arrays. The premise is that if both arrays of results are sorted by score, the best results show up near the top of both arrays and should be preferred over results ranked high in one but much lower in the other.
```ts
import { hybridRank } from "@convex-dev/rag";

const textSearchResults = [id1, id2, id3];
const vectorSearchResults = [id2, id3, id1];

const results = hybridRank([textSearchResults, vectorSearchResults]);
// results = [id2, id1, id3]
```
It can take more than two arrays, and you can provide weights for each array.
```ts
const recentSearchResults = [id5, id4, id3];
const results = hybridRank(
  [textSearchResults, vectorSearchResults, recentSearchResults],
  {
    weights: [2, 1, 3], // prefer recent results more than text or vector
  },
);
// results = [ id3, id5, id1, id2, id4 ]
```
To bias it more towards the top few results, you can set the `k` value to a lower number (10 by default).
```ts
const results = hybridRank(
  [textSearchResults, vectorSearchResults, recentSearchResults],
  { k: 1 },
);
// results = [ id5, id1, id3, id2, id4 ]
```
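The scoring behind Reciprocal Rank Fusion can be sketched in a few lines. This is a simplified model of the technique, not the library's exact implementation (its tie-breaking and constants may differ): each id accumulates `weight / (k + rank)` from every array it appears in, so ids near the top of several arrays dominate.

```ts
// Simplified Reciprocal Rank Fusion: score ids across multiple rankings
// and return them sorted by total score, highest first.
export function reciprocalRankFusion(
  rankings: string[][],
  opts: { k?: number; weights?: number[] } = {},
): string[] {
  const k = opts.k ?? 10;
  const scores = new Map<string, number>();
  rankings.forEach((ranking, i) => {
    const weight = opts.weights?.[i] ?? 1;
    ranking.forEach((id, rank) => {
      // rank is 0-based here, so the denominator is k + rank + 1.
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank + 1));
    });
  });
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```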
### `contentHashFromArrayBuffer`

This generates the hash of a file's contents, which can be used to avoid adding the same file twice.
Note: doing `blob.arrayBuffer()` will consume the blob's data, so you'll need to make a new blob to use it after calling this function.
```ts
import { contentHashFromArrayBuffer } from "@convex-dev/rag";

export const addFile = action({
  args: { bytes: v.bytes() },
  handler: async (ctx, { bytes }) => {
    const hash = await contentHashFromArrayBuffer(bytes);
    const existing = await rag.findEntryByContentHash(ctx, {
      namespace: "global",
      key: "my-file.txt",
      contentHash: hash,
    });
    if (existing) {
      console.log("File contents are the same, skipping");
      return;
    }
    const blob = new Blob([bytes], { type: "text/plain" });
    // ...
  },
});
```
### `guessMimeTypeFromExtension`

This guesses the MIME type of a file from its extension.
```ts
import { guessMimeTypeFromExtension } from "@convex-dev/rag";

const mimeType = guessMimeTypeFromExtension("my-file.mjs");
console.log(mimeType); // "text/javascript"
```
### `guessMimeTypeFromContents`

This guesses the MIME type of a file from the first few bytes of its contents.
```ts
import { guessMimeTypeFromContents } from "@convex-dev/rag";

const mimeType = guessMimeTypeFromContents(await file.arrayBuffer());
```
See more example usage in `example.ts`.

Run the example with `npm run setup && npm run dev`.