Swift package for on-device Retrieval-Augmented Generation (RAG)

rryam/LumoKit

LumoKit is a lightweight Swift library for Retrieval-Augmented Generation (RAG) systems. It integrates with PicoDocs for document parsing and VecturaKit for semantic search and vector storage.

The name LumoKit is derived from the Chinese characters 流 (liú), meaning "flow," and 模 (mó), meaning "model." It symbolizes the idea of flowing information through a model, reflecting data retrieval for a large language model.

Key Features

  • Parse and Chunk Documents: Use PicoDocs to extract content from files and split it into manageable chunks for efficient indexing.
  • Semantic Search: Perform similarity-based searches using VecturaKit's vector database.
  • Configurable Document Indexing: Set custom chunk sizes to control how documents are segmented for retrieval.
  • Reset Database: Quickly reset the vector database to start fresh with new data.

Installation

Add the following dependency to your Package.swift file:

    dependencies: [
        .package(url: "https://github.com/rryam/LumoKit.git", from: "0.1.0"),
    ],
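
For context, a complete manifest might look like the sketch below. The package name `MyApp`, the tools version, and the product name `LumoKit` are assumptions made for illustration; check the library's own Package.swift for the exact product name and any minimum platform versions you need to declare.

```swift
// swift-tools-version: 5.9
// Sketch of a full manifest. "MyApp" and the tools version are placeholders,
// and the product name "LumoKit" is assumed to match the package name.
import PackageDescription

let package = Package(
    name: "MyApp",
    dependencies: [
        // The dependency line from the installation step above.
        .package(url: "https://github.com/rryam/LumoKit.git", from: "0.1.0"),
    ],
    targets: [
        .executableTarget(
            name: "MyApp",
            dependencies: [
                // Makes `import LumoKit` available inside this target.
                .product(name: "LumoKit", package: "LumoKit"),
            ]
        ),
    ]
)
```

If the package fails to resolve, a missing `platforms:` entry is a likely cause.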

Then import the package in your project:

import LumoKit

Usage

  1. Initialize LumoKit

First, set up the configuration for VecturaKit and initialize LumoKit:

    import LumoKit
    import VecturaKit

    let config = VecturaConfig(
        name: "my-vector-db",
        dimension: 384,
        searchOptions: VecturaConfig.SearchOptions(
            defaultNumResults: 10,
            minThreshold: 0.7
        )
    )
    let lumoKit = try LumoKit(config: config)

  2. Parse and Index Documents

Parse a file and index its content into the vector database:

    let fileURL = URL(fileURLWithPath: "/path/to/your/document.pdf")
    try await lumoKit.parseAndIndex(url: fileURL, chunkSize: 500)

  3. Perform Semantic Search

Search for relevant documents by querying the indexed database:

    let results = try await lumoKit.semanticSearch(query: "What is Swift?", numResults: 5, threshold: 0.7)
    for result in results {
        print("Document ID: \(result.id)")
        print("Text: \(result.text)")
        print("Score: \(result.score)")
    }
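
To make the `numResults` and `threshold` parameters more concrete, here is a minimal, self-contained sketch of a similarity search over a handful of toy vectors. It assumes cosine similarity as the scoring function, which this README does not confirm for VecturaKit. `IndexedChunk`, `cosineSimilarity`, and `search` are hypothetical names used only for illustration and are not part of LumoKit or VecturaKit.

```swift
/// Hypothetical stand-in for a stored chunk and its embedding vector.
struct IndexedChunk {
    let id: Int
    let text: String
    let embedding: [Double]
}

/// Cosine similarity between two equal-length vectors; returns 0 for zero vectors.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    precondition(a.count == b.count, "vectors must have the same dimension")
    var dot = 0.0, normA = 0.0, normB = 0.0
    for i in a.indices {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    guard normA > 0, normB > 0 else { return 0 }
    return dot / (normA.squareRoot() * normB.squareRoot())
}

/// Scores every chunk, drops those below `threshold`, and keeps the best `numResults`.
func search(
    query: [Double],
    in chunks: [IndexedChunk],
    numResults: Int,
    threshold: Double
) -> [(chunk: IndexedChunk, score: Double)] {
    let scored = chunks
        .map { (chunk: $0, score: cosineSimilarity(query, $0.embedding)) }
        .filter { $0.score >= threshold }
        .sorted { $0.score > $1.score }
    return Array(scored.prefix(numResults))
}

// Toy three-dimensional embeddings; real embeddings would have `dimension` entries.
let chunks = [
    IndexedChunk(id: 1, text: "Swift is a programming language.", embedding: [1, 0, 0]),
    IndexedChunk(id: 2, text: "Vector databases store embeddings.", embedding: [0, 1, 0]),
    IndexedChunk(id: 3, text: "Swift supports async/await.", embedding: [0.9, 0.1, 0]),
]

// Chunks 1 and 3 pass the 0.7 threshold; chunk 2 scores 0 and is dropped.
let hits = search(query: [1, 0, 0], in: chunks, numResults: 5, threshold: 0.7)
for hit in hits {
    print(hit.chunk.id, hit.score)
}
```

In this sketch the threshold is applied before the result count, so a query can return fewer than `numResults` matches.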

How It Works

  • Document Parsing: Leverages PicoDocs to parse various file formats (e.g., PDF, Markdown).
  • Chunking: Splits the content into smaller chunks for efficient indexing.
  • Vector Storage: Uses VecturaKit to store embeddings and perform similarity searches.
  • Semantic Search: Retrieves the most relevant chunks for a given query.
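
The chunking step can be sketched as follows. This is a simple fixed-size split by character count, shown only to illustrate what a `chunkSize` parameter might control. The strategy LumoKit and PicoDocs actually use is not described here and may differ (for example, by splitting on sentence boundaries). `chunkText` is a hypothetical helper, not a library function.

```swift
/// Splits `text` into consecutive chunks of at most `chunkSize` characters.
/// Hypothetical helper for illustration; not part of LumoKit.
func chunkText(_ text: String, chunkSize: Int) -> [String] {
    precondition(chunkSize > 0, "chunkSize must be positive")
    var chunks: [String] = []
    var start = text.startIndex
    while start < text.endIndex {
        // Advance by chunkSize characters, stopping at the end of the string.
        let end = text.index(start, offsetBy: chunkSize, limitedBy: text.endIndex)
            ?? text.endIndex
        chunks.append(String(text[start..<end]))
        start = end
    }
    return chunks
}

let sample = String(repeating: "a", count: 1200)
let pieces = chunkText(sample, chunkSize: 500)
print(pieces.map(\.count)) // [500, 500, 200]
```

Smaller chunks tend to give more precise matches at the cost of more vectors to store and search, while larger chunks keep more context per match.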

Example Workflow

    let fileURL = URL(fileURLWithPath: "/path/to/document.pdf")

    // Parse and index document
    try await lumoKit.parseAndIndex(url: fileURL, chunkSize: 500)

    // Perform semantic search
    let query = "Explain the importance of vector databases."
    let results = try await lumoKit.semanticSearch(query: query)
    for result in results {
        print("Relevant Text: \(result.text)")
    }

    // Reset the database
    try await lumoKit.resetDB()
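
In a RAG setup, retrieved text is usually handed to a language model together with the user's question. The sketch below shows one possible way to assemble such a prompt from plain strings. `buildPrompt` and the prompt wording are assumptions made for illustration; LumoKit itself does not provide this function.

```swift
/// Builds a prompt that places retrieved passages ahead of the user's question.
/// Hypothetical helper; the prompt wording is an assumption, not a LumoKit API.
func buildPrompt(question: String, passages: [String]) -> String {
    // Number each passage so the model's answer can refer back to it.
    let context = passages.enumerated()
        .map { "[\($0.offset + 1)] \($0.element)" }
        .joined(separator: "\n")
    return """
    Answer the question using only the context below.

    Context:
    \(context)

    Question: \(question)
    """
}

let prompt = buildPrompt(
    question: "Explain the importance of vector databases.",
    passages: [
        "Vector databases store embeddings.",
        "They support similarity search.",
    ]
)
print(prompt)
```

With results from a semantic search, the `passages` argument could be filled with `results.map { $0.text }`.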

Contributing

Contributions are welcome! Please fork the repository and submit a pull request with your improvements or suggestions.

License

LumoKit is licensed under the MIT License. See the LICENSE file for more details.

Acknowledgments

  • PicoDocs: For powerful document parsing.
  • VecturaKit: For robust vector database functionality.
