A lightweight implementation of Kernel Memory as a Service


Kernel Memory provides a Service implementation that can be used to manage memory settings, ingest data and query for answers. While it is a good solution, in some scenarios it can be too complex.

So, the goal of this repository is to provide a lightweight implementation of Kernel Memory as a Service. This project is quite simple, so it can be easily customized according to your needs and even integrated into existing applications with little effort.

How to use

The service can be directly configured in the Program.cs file. The default implementation uses the following settings:

  • Azure OpenAI Service for embeddings and text generation.
  • File system for Content Storage, Vector Storage and Orchestration.

The configuration values are stored in the appsettings.json file.

You can easily change all these options by using any of the supported backends.
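As a rough illustration, the wiring in Program.cs could look like the sketch below. The builder methods come from the Microsoft.KernelMemory package, but the "AzureOpenAI" configuration section name and the storage directory names are assumptions, not necessarily the repository's actual values:

```csharp
using Microsoft.KernelMemory;

var builder = WebApplication.CreateBuilder(args);

// Read the Azure OpenAI settings from appsettings.json
// (the "AzureOpenAI" section name is an assumption).
var azureOpenAIConfig = builder.Configuration
    .GetSection("AzureOpenAI")
    .Get<AzureOpenAIConfig>()!;

var memory = new KernelMemoryBuilder()
    // Azure OpenAI Service for embeddings and text generation.
    .WithAzureOpenAITextEmbeddingGeneration(azureOpenAIConfig)
    .WithAzureOpenAITextGeneration(azureOpenAIConfig)
    // File system for content storage and vector storage.
    .WithSimpleFileStorage("ContentStorage")
    .WithSimpleVectorDb("VectorStorage")
    .Build<MemoryServerless>();

builder.Services.AddSingleton<IKernelMemory>(memory);
```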

Conversational support

Embeddings are generated from a given text. So, in a conversational scenario, it is necessary to keep track of the previous messages in order to generate meaningful embeddings for a particular question.

For example, suppose we have imported a couple of Wikipedia articles, one about Taggia and the other about Sanremo, two cities in Italy. Now, we want to ask questions about them (of course, this information is publicly available and known by GPT models, so using embeddings and RAG isn't really necessary, but this is just an example). So, we start with the following:

  • How many people live in Taggia?

Using embeddings and RAG, Kernel Memory will generate the correct answer. Now, as we are in a chat context, we ask another question:

  • And in Sanremo?

From our point of view, this question is the "continuation" of the chat, so it means "And how many people live in Sanremo?". However, if we directly generate embeddings for "And in Sanremo?", they won't contain anything about the fact that we are interested in the population number, so we won't get any result.

To solve this problem, we need to keep track of the previous messages and, when asking a question, reformulate it taking the whole conversation into account. In this way, we can generate the correct embeddings.
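A minimal sketch of this reformulation step follows. Here AskChatModelAsync is a hypothetical helper standing in for whatever chat client is actually used:

```csharp
// "AskChatModelAsync" is a hypothetical helper that sends a prompt to the
// chat model; it stands in for the actual client used by the Service.
static async Task<string> ReformulateAsync(
    IEnumerable<(string Question, string Answer)> history, string question)
{
    var context = string.Join(Environment.NewLine,
        history.Select(m => $"Q: {m.Question}\nA: {m.Answer}"));

    var prompt = $"""
        Given the following conversation, rewrite the last question so that
        it can be understood without reading the conversation.

        {context}

        Last question: {question}
        Rewritten question:
        """;

    // For example, "And in Sanremo?" becomes
    // "And how many people live in Sanremo?".
    return await AskChatModelAsync(prompt);
}
```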

The Service automatically handles this scenario by using a Memory Cache and a ConversationId associated with each question. Questions and answers are kept in memory, so the Service is able to reformulate questions based on the current chat context before using Kernel Memory.

Note: This isn't the only way to keep track of the conversation context. The Service uses an explicit approach to make it clear how the workflow should work.
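To make that explicit approach concrete, here is a sketch of conversation tracking built on IMemoryCache; the type and member names are illustrative, not necessarily those used by the Service:

```csharp
using Microsoft.Extensions.Caching.Memory;

public record ChatMessage(string Question, string Answer);

public class ConversationService(IMemoryCache cache)
{
    public void AddMessage(Guid conversationId, ChatMessage message,
        int messageLimit, TimeSpan messageExpiration)
    {
        var messages = cache.GetOrCreate(conversationId, entry =>
        {
            // MessageExpiration: entries are evicted after this interval.
            entry.AbsoluteExpirationRelativeToNow = messageExpiration;
            return new List<ChatMessage>();
        })!;

        messages.Add(message);

        // MessageLimit: drop the oldest messages once the limit is exceeded.
        while (messages.Count > messageLimit)
        {
            messages.RemoveAt(0);
        }
    }

    public IReadOnlyList<ChatMessage> GetMessages(Guid conversationId)
        => cache.TryGetValue(conversationId, out List<ChatMessage>? messages)
            ? messages!
            : [];
}
```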

Two settings in the appsettings.json file are used to configure the cache:

  • MessageLimit: specifies how many messages must be kept for each conversation. When this limit is reached, the oldest messages are automatically removed.
  • MessageExpiration: specifies the time interval used to keep messages in the cache, regardless of their count.
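For reference, a hypothetical appsettings.json fragment with these two settings might look like this (the values, and the exact time-span format, are illustrative):

```json
{
  "MessageLimit": 20,
  "MessageExpiration": "00:30:00"
}
```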
