get-convex/rag

Document search component to aid RAG

A component for semantic search, usually used to look up context for LLMs. Use it with an Agent for Retrieval-Augmented Generation (RAG).

Use AI to search HUGE amounts of text with the RAG Component

✨ Key Features

  • Add Content: Add or replace content with text chunks and embeddings.
  • Semantic Search: Vector-based search using configurable embedding models.
  • Namespaces: Organize content into namespaces for per-user search.
  • Custom Filtering: Filter content with custom indexed fields.
  • Importance Weighting: Weight content by providing a 0 to 1 "importance".
  • Chunk Context: Get surrounding chunks for better context.
  • Graceful Migrations: Migrate content or whole namespaces without disruption.

Found a bug? Feature request? File it here.

Installation

Create a `convex.config.ts` file in your app's `convex/` folder and install the component by calling `use`:

```ts
// convex/convex.config.ts
import { defineApp } from "convex/server";
import rag from "@convex-dev/rag/convex.config.js";

const app = defineApp();
app.use(rag);

export default app;
```

Run `npx convex codegen` if `npx convex dev` isn't already running.

Basic Setup

```ts
// convex/example.ts
import { components } from "./_generated/api";
import { RAG } from "@convex-dev/rag";
// Any AI SDK model that supports embeddings will work.
import { openai } from "@ai-sdk/openai";

const rag = new RAG(components.rag, {
  textEmbeddingModel: openai.embedding("text-embedding-3-small"),
  embeddingDimension: 1536, // Needs to match your embedding model
});
```

Add context to RAG

Add content with text chunks. Each call to `add` will create a new entry. It will embed the chunks automatically if you don't provide them.

```ts
export const add = action({
  args: { text: v.string() },
  handler: async (ctx, { text }) => {
    // Add the text to a namespace shared by all users.
    await rag.add(ctx, { namespace: "all-users", text });
  },
});
```

See below for how to chunk the text yourself or add content asynchronously, e.g. to handle large files.

Semantic Search

Search across content with vector similarity.

  • `text` is a string with the full content of the results, for convenience. It is in order of the entries, with titles at each entry boundary, and separators between non-sequential chunks. See below for more details.
  • `results` is an array of matching chunks with scores and more metadata.
  • `entries` is an array of the entries that matched the query. Each result has an `entryId` referencing one of these source entries.
  • `usage` contains embedding token usage information. It will be `{ tokens: 0 }` if no embedding was performed (e.g. when passing pre-computed embeddings).

```ts
export const search = action({
  args: {
    query: v.string(),
  },
  handler: async (ctx, args) => {
    const { results, text, entries, usage } = await rag.search(ctx, {
      namespace: "global",
      query: args.query,
      limit: 10,
      vectorScoreThreshold: 0.5, // Only return results with a score >= 0.5
    });
    return { results, text, entries, usage };
  },
});
```

Generate a response based on RAG context

Once you have searched for the context, you can use it with an LLM.

Generally you'll already be using something to make LLM requests, e.g. the Agent Component, which tracks the message history for you. See the Agent Component docs for more details on doing RAG with the Agent Component.

However, if you just want a one-off response, you can use the `generateText` function as a convenience.

This will automatically search for relevant entries and use them as context for the LLM, using default formatting.

The arguments to `generateText` are compatible with all arguments to `generateText` from the AI SDK.

```ts
export const askQuestion = action({
  args: {
    prompt: v.string(),
  },
  handler: async (ctx, args) => {
    const userId = await getAuthUserId(ctx);
    const { text, context } = await rag.generateText(ctx, {
      search: { namespace: userId, limit: 10 },
      prompt: args.prompt,
      model: openai.chat("gpt-4o-mini"),
    });
    return { answer: text, context };
  },
});
```

Note: You can specify any of the search options available on `rag.search`.

Filtered Search

You can provide filters when adding content and use them to search. To do this, you'll need to give the RAG component a list of the filter names. You can optionally provide a type parameter for type safety (no runtime validation).

Note: these filters can be OR'd together when searching. In order to get an AND, you provide a filter with a more complex value, such as `categoryAndType` below.

```ts
// convex/example.ts
import { components } from "./_generated/api";
import { RAG } from "@convex-dev/rag";
// Any AI SDK model that supports embeddings will work.
import { openai } from "@ai-sdk/openai";

// Optional: Add type safety to your filters.
type FilterTypes = {
  category: string;
  contentType: string;
  categoryAndType: { category: string; contentType: string };
};

const rag = new RAG<FilterTypes>(components.rag, {
  textEmbeddingModel: openai.embedding("text-embedding-3-small"),
  embeddingDimension: 1536, // Needs to match your embedding model
  filterNames: ["category", "contentType", "categoryAndType"],
});
```

Adding content with filters:

```ts
await rag.add(ctx, {
  namespace: "global",
  text,
  filterValues: [
    { name: "category", value: "news" },
    { name: "contentType", value: "article" },
    {
      name: "categoryAndType",
      value: { category: "news", contentType: "article" },
    },
  ],
});
```

Search with metadata filters:

```ts
export const searchForNewsOrSports = action({
  args: {
    query: v.string(),
  },
  handler: async (ctx, args) => {
    const userId = await getUserId(ctx);
    if (!userId) throw new Error("Unauthorized");
    const results = await rag.search(ctx, {
      namespace: userId,
      query: args.query,
      filters: [
        { name: "category", value: "news" },
        { name: "category", value: "sports" },
      ],
      limit: 10,
    });
    return results;
  },
});
```

Add surrounding chunks to results for context

Instead of getting just the single matching chunk, you can request surrounding chunks so there's more context to the result.

Note: If there are results that have overlapping ranges, it will not return duplicate chunks, but instead give priority to adding the "before" context to each chunk. For example, if you requested 2 before and 1 after, and your results were for the same entryId at indexes 1, 4, and 7, the results would be:

```ts
[
  // Only one before chunk available, and leaves chunk2 for the next result.
  { order: 1, content: [chunk0, chunk1], startOrder: 0, ... },
  // 2 before chunks available, but leaves chunk5 for the next result.
  { order: 4, content: [chunk2, chunk3, chunk4], startOrder: 2, ... },
  // 2 before chunks available, and includes one after chunk.
  { order: 7, content: [chunk5, chunk6, chunk7, chunk8], startOrder: 5, ... },
]
```
```ts
export const searchWithContext = action({
  args: {
    query: v.string(),
    userId: v.string(),
  },
  handler: async (ctx, args) => {
    const { results, text, entries, usage } = await rag.search(ctx, {
      namespace: args.userId,
      query: args.query,
      chunkContext: { before: 2, after: 1 }, // Include 2 chunks before, 1 after
      limit: 5,
    });
    return { results, text, entries, usage };
  },
});
```

Formatting results

Formatting the results for use in a prompt depends a bit on the use case. By default, the results are sorted by score, not necessarily the order in which they appear in the original text. You may want to re-sort them by position so they follow the flow of the original document.

For convenience, the `text` field of the search results is a string formatted with `...` separating non-sequential chunks, `---` separating entries, and `# Title:` at each entry boundary (if titles are available).

```ts
const { text } = await rag.search(ctx, { ... });
console.log(text);
```

```
## Title 1:
Chunk 1 contents
Chunk 2 contents
...
Chunk 8 contents
Chunk 9 contents
---
## Title 2:
Chunk 4 contents
Chunk 5 contents
```

There is also a `text` field on each entry that is the full text of the entry, similarly formatted with `...` separating non-sequential chunks, if you want to format each entry differently.

For a fully custom format, you can use the `results` and `entries` fields directly:

```ts
const { results, text, entries } = await rag.search(ctx, {
  namespace: args.userId,
  query: args.query,
  chunkContext: { before: 2, after: 1 }, // Include 2 chunks before, 1 after
  limit: 5,
  vectorScoreThreshold: 0.5, // Only return results with a score >= 0.5
});

// Get results in the order of the entries (highest score first)
const contexts = entries
  .map((e) => {
    const ranges = results
      .filter((r) => r.entryId === e.entryId)
      .sort((a, b) => a.startOrder - b.startOrder);
    let text = (e.title ?? "") + ":\n\n";
    let previousEnd = 0;
    for (const range of ranges) {
      if (range.startOrder !== previousEnd) {
        text += "\n...\n";
      }
      text += range.content.map((c) => c.text).join("\n");
      previousEnd = range.startOrder + range.content.length;
    }
    return {
      ...e,
      entryId: e.entryId as EntryId,
      filterValues: e.filterValues as EntryFilterValues<FilterTypes>[],
      text,
    };
  })
  .map((e) => (e.title ? `#${e.title}:\n${e.text}` : e.text));

await generateText({
  model: openai.chat("gpt-4o-mini"),
  prompt:
    "Use the following context:\n\n" +
    contexts.join("\n---\n") +
    "\n\n---\n\n Based on the context, answer the question:\n\n" +
    args.query,
});
```

Using keys to gracefully replace content

When you add content to a namespace, you can provide a `key` to uniquely identify the content. If you add content with the same key, it will make a new entry to replace the old one.

```ts
await rag.add(ctx, { namespace: userId, key: "my-file.txt", text });
```

When a new document is added, it will start with a status of "pending" while it chunks, embeds, and inserts the data into the database. Once all data is inserted, it will iterate over the chunks and swap the old content embeddings with the new ones, then update the status to "ready", marking the previous version as "replaced".

The old content is kept around by default, so in-flight searches will still get results for vector matches against the old content. See below for more details on deleting.

This means that searches happening while the document is being added will see the old content's results. This is useful if you want to add content to a namespace and then immediately search for it, or immediately add more content to the same namespace.

Using your own content splitter

By default, the component uses the `defaultChunker` to split the content into chunks. You can pass in your own content chunks to the `add` or `addAsync` functions.

```ts
const chunks = await textSplitter.split(content);
await rag.add(ctx, { namespace: "global", chunks });
```

Note: The `textSplitter` here could be LangChain, Mastra, or something custom. The simplest version makes an array of strings like `content.split("\n")`.

Note: you can pass in an async iterator instead of an array to handle large content. Or use the `addAsync` function (see below).
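As a concrete sketch of the simplest custom splitter, here is a tiny paragraph-based version. `simpleSplit` and `minChars` are illustrative names, not part of the component's API:

```typescript
// A minimal custom splitter (illustrative only, not part of @convex-dev/rag):
// split on blank lines, then merge paragraphs until a minimum size is reached.
function simpleSplit(content: string, minChars = 100): string[] {
  const paragraphs = content
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter(Boolean);
  const chunks: string[] = [];
  let current = "";
  for (const paragraph of paragraphs) {
    current = current ? current + "\n\n" + paragraph : paragraph;
    if (current.length >= minChars) {
      chunks.push(current);
      current = "";
    }
  }
  if (current) chunks.push(current); // flush any trailing short chunk
  return chunks;
}
```

The resulting `string[]` can then be passed as `chunks` to `rag.add`, as in the snippet above.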

Providing custom embeddings per-chunk

In addition to the text, you can provide your own embeddings for each chunk.

This can be beneficial if you want to embed something other than the chunk contents, e.g. a summary of each chunk.

```ts
const chunks = await textSplitter.split(content);
const chunksWithEmbeddings = await Promise.all(
  chunks.map(async (chunk) => {
    return {
      ...chunk,
      embedding: await embedSummary(chunk),
    };
  }),
);
await rag.add(ctx, { namespace: "global", chunks: chunksWithEmbeddings });
```

Add Entries Asynchronously using File Storage

For large files, you can upload them to file storage, then provide a chunker action to split them into chunks.

In `convex/http.ts`:

```ts
import { corsRouter } from "convex-helpers/server/cors";
import { httpRouter } from "convex/server";
import { internal } from "./_generated/api.js";
import { DataModel } from "./_generated/dataModel.js";
import { httpAction } from "./_generated/server.js";
import { rag } from "./example.js";

const cors = corsRouter(httpRouter());

cors.route({
  path: "/upload",
  method: "POST",
  handler: httpAction(async (ctx, request) => {
    const storageId = await ctx.storage.store(await request.blob());
    await rag.addAsync(ctx, {
      namespace: "all-files",
      chunkerAction: internal.http.chunkerAction,
      onComplete: internal.foo.docComplete, // See next section
      metadata: { storageId },
    });
    return new Response();
  }),
});

export const chunkerAction = rag.defineChunkerAction(async (ctx, args) => {
  const storageId = args.entry.metadata!.storageId;
  const file = await ctx.storage.get(storageId);
  const text = new TextDecoder().decode(await file!.arrayBuffer());
  return { chunks: text.split("\n\n") };
});

export default cors.http;
```

You can upload files directly to a Convex action, httpAction, or upload url. See the docs for details.

OnComplete Handling

You can register an `onComplete` handler when adding content. It will be called when the entry has been created and is ready, or if there was an error, or if it was replaced before it finished.

```ts
// in an action
await rag.add(ctx, { namespace, text, onComplete: internal.foo.docComplete });

// in convex/foo.ts
export const docComplete = rag.defineOnComplete<DataModel>(
  async (ctx, { replacedEntry, entry, namespace, error }) => {
    if (error) {
      await rag.delete(ctx, { entryId: entry.entryId });
      return;
    }
    if (replacedEntry) {
      await rag.delete(ctx, { entryId: replacedEntry.entryId });
    }
    // You can associate the entry with your own data here. This will commit
    // in the same transaction as the entry becoming ready.
  },
);
```

Note: The `onComplete` callback is only triggered when new content is processed. If you add content that already exists (the contentHash did not change for the same `key`), `onComplete` will not be called. To handle this case, you can check the return value of `rag.add()`:

```ts
const { status, created } = await rag.add(ctx, {
  namespace,
  text,
  key: "my-key",
  contentHash: "...",
  onComplete: internal.foo.docComplete,
});
if (status === "ready" && !created) {
  // Entry already existed - onComplete will not be called.
  // Handle this case if needed.
}
```

Add Entries with filters from a URL

Here's a simple example fetching content from a URL to add.

It also adds filters to the entry, so you can search for it later by category, contentType, or both.

```ts
export const add = action({
  args: { url: v.string(), category: v.string() },
  handler: async (ctx, { url, category }) => {
    const response = await fetch(url);
    const content = await response.text();
    const contentType = response.headers.get("content-type");
    const { entryId } = await rag.add(ctx, {
      namespace: "global", // namespace can be any string
      key: url,
      chunks: content.split("\n\n"),
      filterValues: [
        { name: "category", value: category },
        { name: "contentType", value: contentType },
        // To get an AND filter, use a filter with a more complex value.
        { name: "categoryAndType", value: { category, contentType } },
      ],
    });
    return { entryId };
  },
});
```

Lifecycle Management

You can delete the old content by calling `rag.delete` with the entryId of the old version.

Generally you'd do this:

  1. When a call to `rag.add` with a `key` returns a `replacedEntry`.
  2. When your `onComplete` handler provides a non-null `replacedEntry` argument.
  3. Periodically, by querying for replaced entries:
```ts
// in convex/crons.ts
import { cronJobs } from "convex/server";
import { internal } from "./_generated/api.js";
import { internalMutation } from "./_generated/server.js";
import { v } from "convex/values";
import { rag } from "./example.js";
import { assert } from "convex-helpers";

const WEEK = 7 * 24 * 60 * 60 * 1000;

export const deleteOldContent = internalMutation({
  args: { cursor: v.optional(v.string()) },
  handler: async (ctx, args) => {
    const toDelete = await rag.list(ctx, {
      status: "replaced",
      paginationOpts: { cursor: args.cursor ?? null, numItems: 100 },
    });
    for (const entry of toDelete.page) {
      assert(entry.status === "replaced");
      if (entry.replacedAt >= Date.now() - WEEK) {
        return; // we're done when we catch up to a week ago
      }
      await rag.delete(ctx, { entryId: entry.entryId });
    }
    if (!toDelete.isDone) {
      await ctx.scheduler.runAfter(0, internal.example.deleteOldContent, {
        cursor: toDelete.continueCursor,
      });
    }
  },
});

// See example/convex/crons.ts for a complete example.
const crons = cronJobs();
crons.interval(
  "deleteOldContent",
  { hours: 1 },
  internal.crons.deleteOldContent,
  {},
);
export default crons;
```

Working with types

You can use the provided types to validate and store data: `import { ... } from "@convex-dev/rag";`

Types for the various elements:

`Entry`, `EntryFilter`, `SearchEntry`, `SearchResult`

  • `SearchEntry` is an `Entry` with a `text` field including the combined search results for that entry, whereas a `SearchResult` is a specific chunk result, along with surrounding chunks.

`EntryId`, `NamespaceId`

  • While `EntryId` and `NamespaceId` are strings under the hood, they are given more specific types to make it easier to use them correctly.

Validators can be used in `args` and schema table definitions: `vEntry`, `vEntryId`, `vNamespaceId`, `vSearchEntry`, `vSearchResult`

e.g. `defineTable({ myDocTitle: v.string(), entryId: vEntryId })`

The validators for the branded IDs only validate that the value is a string, but they carry the more specific types to provide type safety.
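The "branded ID" pattern these types use can be sketched in plain TypeScript. This is an illustration of the idea, not the library's actual type definitions:

```typescript
// Illustrative sketch of branded string IDs (not @convex-dev/rag's source):
// plain strings at runtime, but distinct types to the type checker.
type Brand<T, B extends string> = T & { readonly __brand: B };
type MyEntryId = Brand<string, "EntryId">;
type MyNamespaceId = Brand<string, "NamespaceId">;

function describeEntry(id: MyEntryId): string {
  return `entry ${id}`;
}

const entryId = "abc123" as MyEntryId;
describeEntry(entryId); // ok
// describeEntry("abc123"); // type error: a plain string lacks the brand
```

Because the brand exists only at the type level, validating a branded ID at runtime amounts to checking it is a string, which is exactly what the `vEntryId`-style validators do.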

Utility Functions

In addition to the functions on the `rag` instance, there are other utilities provided:

defaultChunker

This is the default chunker used by the `add` and `addAsync` functions.

It is customizable, but by default:

  • It tries to break up the text into paragraphs between 100-1k characters.
  • It will combine paragraphs to meet the minimum character count (100).
  • It will break up paragraphs into separate lines to keep it under 1k.
  • It will not split up a single line unless it's longer than 10k characters.
```ts
import { defaultChunker } from "@convex-dev/rag";

const chunks = defaultChunker(text, {
  // these are the defaults
  minLines: 1,
  minCharsSoftLimit: 100,
  maxCharsSoftLimit: 1000,
  maxCharsHardLimit: 10000,
  delimiter: "\n\n",
});
```

hybridRank

This is an implementation of "Reciprocal Rank Fusion" for ranking search results based on multiple scoring arrays. The premise is that if both arrays of results are sorted by score, the best results show up near the top of both arrays and should be preferred over results ranked high in one but much lower in the other.

```ts
import { hybridRank } from "@convex-dev/rag";

const textSearchResults = [id1, id2, id3];
const vectorSearchResults = [id2, id3, id1];

const results = hybridRank([textSearchResults, vectorSearchResults]);
// results = [id2, id1, id3]
```

It can take more than two arrays, and you can provide weights for each array.

```ts
const recentSearchResults = [id5, id4, id3];
const results = hybridRank(
  [textSearchResults, vectorSearchResults, recentSearchResults],
  {
    weights: [2, 1, 3], // prefer recent results more than text or vector
  },
);
// results = [ id3, id5, id1, id2, id4 ]
```

To have it more biased towards the top few results, you can set the `k` value to a lower number (10 by default).

```ts
const results = hybridRank(
  [textSearchResults, vectorSearchResults, recentSearchResults],
  { k: 1 },
);
// results = [ id5, id1, id3, id2, id4 ]
```
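The scoring behind Reciprocal Rank Fusion can be sketched as follows. This is a simplified illustration of the technique, not `hybridRank`'s actual implementation, so tie-breaking and rank offsets may differ:

```typescript
// Simplified Reciprocal Rank Fusion (illustrative, not hybridRank's source):
// each id's score is the sum over arrays of weight / (k + rank), so ids near
// the top of several arrays accumulate the highest combined score.
function rrf<T>(
  rankings: T[][],
  opts: { weights?: number[]; k?: number } = {},
): T[] {
  const { weights = rankings.map(() => 1), k = 10 } = opts;
  const scores = new Map<T, number>();
  rankings.forEach((ranking, i) => {
    ranking.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + weights[i] / (k + rank + 1));
    });
  });
  return [...scores.keys()].sort((a, b) => scores.get(b)! - scores.get(a)!);
}
```

With the two example arrays from above, this simplified version also ranks `id2` first: it is near the top of both rankings, so its summed score beats `id1`, which is first in one ranking but last in the other.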

contentHashFromArrayBuffer

This generates the hash of a file's contents, which can be used to avoid adding the same file twice.

Note: doing `blob.arrayBuffer()` will consume the blob's data, so you'll need to make a new blob to use it after calling this function.

```ts
import { contentHashFromArrayBuffer } from "@convex-dev/rag";

export const addFile = action({
  args: { bytes: v.bytes() },
  handler: async (ctx, { bytes }) => {
    const hash = await contentHashFromArrayBuffer(bytes);
    const existing = await rag.findEntryByContentHash(ctx, {
      namespace: "global",
      key: "my-file.txt",
      contentHash: hash,
    });
    if (existing) {
      console.log("File contents are the same, skipping");
      return;
    }
    const blob = new Blob([bytes], { type: "text/plain" });
    // ...
  },
});
```
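If you need a comparable content hash outside the component, the idea is simply a digest of the raw bytes. A self-contained sketch using Node's crypto, assuming SHA-256 (the component's exact algorithm is not specified here, so don't mix the two):

```typescript
import { createHash } from "node:crypto";

// Illustrative content hash (assumes SHA-256; contentHashFromArrayBuffer's
// actual algorithm may differ). Hashing raw bytes avoids consuming a Blob.
function contentHash(bytes: ArrayBuffer): string {
  return createHash("sha256").update(new Uint8Array(bytes)).digest("hex");
}
```

The same bytes always produce the same hex string, so the hash can be stored and compared to skip re-adding unchanged files.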

guessMimeTypeFromExtension

This guesses the mime type of a file from its extension.

```ts
import { guessMimeTypeFromExtension } from "@convex-dev/rag";

const mimeType = guessMimeTypeFromExtension("my-file.mjs");
console.log(mimeType); // "text/javascript"
```

guessMimeTypeFromContents

This guesses the mime type of a file from the first few bytes of its contents.

```ts
import { guessMimeTypeFromContents } from "@convex-dev/rag";

const mimeType = guessMimeTypeFromContents(await file.arrayBuffer());
```

Example Usage

See more example usage in `example.ts`.

Running the example

Run the example with `npm run setup && npm run dev`.
