Chatbot Image updates #1570


Merged
SilasMarvin merged 1 commit into master from ryan-chatbot-doc-image-updates
Jul 17, 2024
281 changes: 281 additions & 0 deletions pgml-cms/docs/.gitbook/assets/Chatbots_Flow-Diagram.svg
78 changes: 78 additions & 0 deletions pgml-cms/docs/.gitbook/assets/Chatbots_King-Diagram.svg
275 changes: 275 additions & 0 deletions pgml-cms/docs/.gitbook/assets/Chatbots_Limitations-Diagram.svg
238 changes: 238 additions & 0 deletions pgml-cms/docs/.gitbook/assets/Chatbots_Tokens-Diagram.svg
Binary file removed pgml-cms/docs/.gitbook/assets/chatbot_flow.png
Binary file removed pgml-cms/docs/.gitbook/assets/embedding_king.png
8 changes: 4 additions & 4 deletions pgml-cms/docs/guides/chatbots/README.md
@@ -30,7 +30,7 @@ Here is an example flowing from:

text -> tokens -> LLM -> probability distribution -> predicted token -> text

-<figure><img src="https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FrvfCoPdoQeoovZiqNG90%2Fuploads%2FPzJzmVS3uNhbvseiJbgi%2FScreenshot%20from%202023-12-13%2013-19-33.png?alt=media&#x26;token=11d57b2a-6aa3-4374-b26c-afc6f531d2f3" alt=""><figcaption><p>The flow of inputs through an LLM. In this case the inputs are "What is Baldur's Gate 3?" and the output token "14" maps to the word "I"</p></figcaption></figure>
+<figure><img src="../../.gitbook/assets/Chatbots_Limitations-Diagram.svg" alt=""><figcaption><p>The flow of inputs through an LLM. In this case the inputs are "What is Baldur's Gate 3?" and the output token "14" maps to the word "I"</p></figcaption></figure>
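
The flow in the figure can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the vocabulary, the `tokenize` helper, and the `toy_llm` function are all hypothetical stand-ins, the scores are hard-coded rather than computed by a model, and the token IDs are arbitrary (so "I" is not token 14 here).

```python
import math

# Hypothetical toy vocabulary; real tokenizers have tens of thousands of entries.
VOCAB = ["What", "is", "Baldur's", "Gate", "3", "?", "I"]
TOKEN_ID = {word: i for i, word in enumerate(VOCAB)}


def tokenize(text):
    """text -> tokens: split on whitespace and peel off a trailing '?'.
    Real tokenizers work on subwords; this is a simplification."""
    pieces = []
    for word in text.split():
        if word.endswith("?") and len(word) > 1:
            pieces.extend([word[:-1], "?"])
        else:
            pieces.append(word)
    return [TOKEN_ID[p] for p in pieces]


def toy_llm(token_ids):
    """tokens -> scores: a stand-in for the model. It ignores its input and
    returns hard-coded scores (logits) that favor "I"."""
    logits = [0.0] * len(VOCAB)
    logits[TOKEN_ID["I"]] = 3.0
    return logits


def softmax(logits):
    """scores -> probability distribution over the vocabulary."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_next(text):
    """Run the whole flow and return (token ids, probabilities, predicted word)."""
    token_ids = tokenize(text)
    probs = softmax(toy_llm(token_ids))
    next_id = max(range(len(probs)), key=probs.__getitem__)  # most likely token
    return token_ids, probs, VOCAB[next_id]


token_ids, probs, word = predict_next("What is Baldur's Gate 3?")
print(token_ids, word)
```

A real LLM replaces `toy_llm` with a neural network, and chatbots usually sample from the distribution instead of always taking the most likely token.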

{% hint style="info" %}
We have simplified the tokenization process. Words do not always map directly to tokens. For instance, the word "Baldur's" may actually map to multiple tokens. For more information on tokenization, check out [HuggingFace's summary](https://huggingface.co/docs/transformers/tokenizer\_summary).
@@ -108,11 +108,11 @@ What does an `embedding` look like? `Embeddings` are just vectors (for our use c
embedding_1 = embed("King") # embed returns something like [0.11, -0.32, 0.46, ...]
```

-<figure><img src="../../.gitbook/assets/embedding_king.png" alt=""><figcaption><p>The flow of word -> token -> embedding</p></figcaption></figure>
+<figure><img src="../../.gitbook/assets/Chatbots_King-Diagram.svg" alt=""><figcaption><p>The flow of word -> token -> embedding</p></figcaption></figure>

`Embeddings` aren't limited to words; we have models that can embed entire sentences.

-<figure><img src="../../.gitbook/assets/embeddings_tokens.png" alt=""><figcaption><p>The flow of sentence -> tokens -> embedding</p></figcaption></figure>
+<figure><img src="../../.gitbook/assets/Chatbots_Tokens-Diagram.svg" alt=""><figcaption><p>The flow of sentence -> tokens -> embedding</p></figcaption></figure>

Why do we care about `embeddings`? `Embeddings` have a very interesting property. Words and sentences that have close [semantic similarity](https://en.wikipedia.org/wiki/Semantic\_similarity) sit closer to one another in vector space than words and sentences that do not have close semantic similarity.
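
One common way to measure "closeness" in vector space is cosine similarity. The sketch below assumes that metric and uses made-up 3-dimensional vectors; real embeddings come from a model, have hundreds or thousands of dimensions, and may be compared with other distance measures.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means they point
    the same way, close to 0.0 means they are unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for illustration only.
king = [0.9, 0.8, 0.1]
queen = [0.85, 0.75, 0.2]
banana = [0.1, 0.05, 0.9]

print(cosine_similarity(king, queen))   # high: semantically similar words
print(cosine_similarity(king, banana))  # low: semantically unrelated words
```

With these toy vectors, "King" scores much higher against "Queen" than against "Banana", which is the property retrieval relies on.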

@@ -157,7 +157,7 @@ print(context)

There is a lot going on here, so let's check out this diagram and step through it.

-<figure><img src="../../.gitbook/assets/chatbot_flow.png" alt=""><figcaption><p>The flow of taking a document, splitting it into chunks, embedding those chunks, and then retrieving a chunk based off of a users query</p></figcaption></figure>
+<figure><img src="../../.gitbook/assets/Chatbots_Flow-Diagram.svg" alt=""><figcaption><p>The flow of taking a document, splitting it into chunks, embedding those chunks, and then retrieving a chunk based off of a users query</p></figcaption></figure>

Step 1: We take the document and split it into chunks. Chunks are typically a paragraph or two in size. There are many ways to split documents into chunks; for more information, check out [this guide](https://www.pinecone.io/learn/chunking-strategies/).
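
As a minimal sketch of Step 1, the function below splits on blank lines and groups up to two paragraphs per chunk. The `split_into_chunks` name, the `max_paragraphs` parameter, and the sample document are hypothetical, and this is just one simple strategy among the many the linked guide covers.

```python
def split_into_chunks(document, max_paragraphs=2):
    """Split a document on blank lines, then group consecutive paragraphs
    into chunks of at most `max_paragraphs` paragraphs each."""
    paragraphs = [p.strip() for p in document.split("\n\n") if p.strip()]
    return [
        "\n\n".join(paragraphs[i:i + max_paragraphs])
        for i in range(0, len(paragraphs), max_paragraphs)
    ]


# Hypothetical three-paragraph document.
document = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
chunks = split_into_chunks(document)
for chunk in chunks:
    print(repr(chunk))
```

Splitting on paragraph boundaries keeps related sentences together, which tends to give each chunk a coherent meaning before it is embedded.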
