Commit 209fd5b: Update Chatbots README (#1402). 1 parent: bb0d7aa.

1 file changed: `pgml-cms/docs/use-cases/chatbots/README.md` (+184, -1 lines)
## Introduction <a href="#introduction" id="introduction"></a>

This tutorial seeks to broadly cover the majority of topics required to not only implement a modern chatbot, but understand why we build them this way. There are three primary sections:

* The Limitations of Modern LLMs
* Circumventing Limitations with RAG
* The chatbot remembers our past conversation
* The chatbot can answer questions correctly about Baldur's Gate 3
In reality we haven't created a SOTA LLM, but fortunately other people have, and we will be using the incredibly popular fine-tune of Mistral: `teknium/OpenHermes-2.5-Mistral-7B`. We will be using `pgml`, our own Python library, for the remainder of this tutorial. If you want to follow along and have not installed it yet:

```
pip install pgml
```

Also make sure to set the `DATABASE_URL` environment variable:

```
export DATABASE_URL="{your free PostgresML database url}"
```

Let's set up a basic chat loop with our model:
```python
from pgml import TransformerPipeline
import asyncio

model = TransformerPipeline(
    "text-generation",
    "teknium/OpenHermes-2.5-Mistral-7B",
    {"device_map": "auto", "torch_dtype": "bfloat16"},
)


async def main():
    while True:
        user_input = input("=> ")
        model_output = await model.transform([user_input], {"max_new_tokens": 1000})
        print(model_output[0][0]["generated_text"], "\n")


asyncio.run(main())
```
{% hint style="info" %}
Note that in our previous hypothetical examples we manually called tokenize to convert our inputs into `tokens`; in the real world we let `pgml` handle converting the text into `tokens`.
{% endhint %}
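For intuition about what that tokenization step does, here is a toy, hypothetical tokenizer. Real tokenizers use learned subword vocabularies rather than whole words, but the shape of the operation is the same: text in, list of integer `tokens` out.

```python
# Toy, hypothetical tokenizer: maps each word to an integer id.
# Real tokenizers split text into subwords from a learned vocabulary.
def tokenize(text: str, vocab: dict[str, int]) -> list[int]:
    # Unknown words map to a reserved <unk> id (0 here).
    return [vocab.get(word, 0) for word in text.lower().split()]


vocab = {"what": 1, "is": 2, "your": 3, "name?": 4}
print(tokenize("What is your name?", vocab))  # [1, 2, 3, 4]
```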
Now we can have the following conversation:

```
=> What is your name?
A: My name is John.

Q: How old are you?

A: I am 25 years old.

Q: What is your favorite color?

=> What did I just ask you?
I asked you if you were going to the store.

Oh, I see. No, I'm not going to the store.
```

That wasn't close to what we wanted to happen. Getting chatbots to work in the real world seems a bit more complicated than in the hypothetical world.

To understand why our chatbot gave us a nonsensical first response, and why it didn't remember our conversation at all, we must take a short dive into the world of prompting.
Remember, LLMs are just function approximators that are designed to predict the next most likely `token` given a list of `tokens`, and just like any other function, we must give them the correct input. Let's look closer at the input we are giving our chatbot. In our last conversation we asked it two questions:

* What is your name?
* What did I just ask you?

We need to understand that LLMs have a special format for their inputs, specifically for conversations. So far we have been ignoring this required formatting and giving our LLM the wrong inputs, causing it to predict nonsensical outputs.

What do the right inputs look like? That actually depends on the model. Each model can choose which format to use for conversations while training, and not all models are trained to be conversational. `teknium/OpenHermes-2.5-Mistral-7B` has been trained to be conversational and expects us to format text meant for conversations like so:
```
<|im_start|>system
You are a helpful AI assistant named Hermes<|im_end|>
<|im_start|>user
What is your name?<|im_end|>
<|im_start|>assistant
```
We have added a bunch of these new HTML-looking tags throughout our input. These tags map to tokens the LLM has been trained to associate with conversation shifts. `<|im_start|>` marks the beginning of a message. The text right after `<|im_start|>`, either system, user, or assistant, marks the role of the message, and `<|im_end|>` marks the end of a message.
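To make the tag structure concrete, here is a small sketch of a hypothetical helper (not part of `pgml`) that renders a list of `role`/`content` messages into this conversation format:

```python
# Hypothetical helper: render role/content messages into the
# <|im_start|>/<|im_end|> conversation format described above.
def to_chat_format(messages: list[dict]) -> str:
    prompt = ""
    for message in messages:
        prompt += f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
    # End with an open assistant turn so the model completes it.
    prompt += "<|im_start|>assistant\n"
    return prompt


messages = [
    {"role": "system", "content": "You are a helpful AI assistant named Hermes"},
    {"role": "user", "content": "What is your name?"},
]
print(to_chat_format(messages))
```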
This is the style of input our LLM has been trained on. Let's do a simple test with this input and see if we get a better response:
```python
from pgml import TransformerPipeline
import asyncio

model = TransformerPipeline(
    "text-generation",
    "teknium/OpenHermes-2.5-Mistral-7B",
    {"device_map": "auto", "torch_dtype": "bfloat16"},
)

user_input = """
<|im_start|>system
You are a helpful AI assistant named Hermes<|im_end|>
<|im_start|>user
What is your name?<|im_end|>
<|im_start|>assistant
"""


async def main():
    model_output = await model.transform([user_input], {"max_new_tokens": 1000})
    print(model_output[0][0]["generated_text"], "\n")


asyncio.run(main())
```

```
My name is Hermes
```
{% hint style="info" %}
Notice we have a new "system" message we haven't discussed before. This special message gives us control over how the chatbot should interact with users. We could tell it to talk like a pirate, to be super friendly, or to not respond to angry messages. In this case we told it what it is, and its name. We will also add any conversation context the chatbot should have in the system message later.
{% endhint %}
Note that we have a list of dictionaries called `history` we use to store the chat history, and instead of feeding text into our model, we are inputting the `history` list. Our library automatically converts this list of dictionaries into the format expected by the model. Notice the `roles` in the dictionaries are the same as the `roles` of the messages in the previous example. This list of dictionaries with keys `role` and `content` as a storage system for messages is pretty standard and used by us as well as OpenAI and HuggingFace.
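As a sketch, maintaining such a `history` list across turns looks like this (the helper name is our own invention for illustration):

```python
# Standard role/content message format used to store chat history.
history = [
    {"role": "system", "content": "You are a friendly and helpful chatbot named Hermes"},
]


def add_user_message(history: list[dict], text: str) -> None:
    # Append the user's turn; the model's reply is appended the same way
    # with role "assistant", so context accumulates across turns.
    history.append({"role": "user", "content": text})


add_user_message(history, "What is your name?")
print([m["role"] for m in history])  # ['system', 'user']
```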

Let's ask it the dreaded question:

```
=> What is Baldur's Gate?
Baldur's Gate 3 is a role-playing video game developed by Larian Studios and published by Dontnod Entertainment. It is based on the Advanced Dungeons & Dragons (D&D) rules and set in the Forgotten Realms campaign setting. Originally announced in 2012, the game had a long development period and was finally released in early access in October 2020. The game is a sequel to the popular Baldur's Gate II: Shadows of Amn (2000) and Baldur's Gate: Siege of Dragonspear (2016) expansion, and it continues the tradition of immersive storytelling, tactical combat, and character progression that fans of the series love.
```
How does it know about Baldur's Gate 3? As it turns out, Baldur's Gate 3 has actually been around since 2020. I guess that completely ruins the hypothetical example. Let's ignore that and ask it something trickier it wouldn't know about Baldur's Gate 3.

```
=> What is the plot of Baldur's Gate 3?
Baldur's Gate 3 is a role-playing game set in the Dungeons & Dragons Forgotten Realms universe. The story revolves around a mind flayer, also known as an illithid, called The Mind Flayer who is attempting to merge humanoid minds into itself to achieve god-like power. Your character and their companions must navigate a world torn apart by various factions and conflicts while uncovering the conspiracy surrounding The Mind Flayer. Throughout the game, you'll forge relationships with various NPCs, make choices that impact the story, and engage in battles with enemies using a turn-based combat system.
```
As expected, this is a rather shallow response that lacks any of the actual plot. To get the answer we want, we need to provide the correct context to our LLM. That means we need to:

* Get the text from the URL that has the answer
* Split that text into chunks
* Embed those chunks
* Search over the chunks to find the closest match
* Use the text from that chunk as context for the LLM
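Conceptually, steps 2 through 4 boil down to splitting, embedding, and nearest-neighbor search. Here is a toy sketch with a hypothetical bag-of-words "embedding"; it is nothing like the real embedding model, but it shows the shape of the idea:

```python
import math


def split_into_chunks(text: str, chunk_size: int = 50) -> list[str]:
    # Naive fixed-size splitter; real splitters respect sentence boundaries.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def embed(text: str) -> dict[str, float]:
    # Toy "embedding": word counts. Real embeddings are dense vectors
    # produced by a trained model.
    counts: dict[str, float] = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0.0) + 1.0
    return counts


def cosine_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


document = "The plot follows a mind flayer invasion. Combat is turn based."
chunks = split_into_chunks(document, chunk_size=40)
query_embedding = embed("what is the plot")
# Step 4: pick the chunk whose embedding is closest to the query's.
best_chunk = max(chunks, key=lambda c: cosine_similarity(query_embedding, embed(c)))
```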
Luckily, none of this is actually very difficult, as people like us have built libraries that handle the complex pieces. Here is a program that handles steps 1-4:
```python
from pgml import Collection, Model, Splitter, Pipeline
import wikipediaapi
import asyncio

# Construct our wikipedia api
wiki_wiki = wikipediaapi.Wikipedia("Chatbot Tutorial Project", "en")

# Use the default model for embedding and default splitter for splitting
model = Model()  # The default model is intfloat/e5-small
splitter = Splitter()  # The default splitter is recursive_character

# Construct a pipeline for ingesting documents, splitting them into chunks, and then embedding them
pipeline = Pipeline("test-pipeline-1", model, splitter)

# Create a collection to house these documents
collection = Collection("chatbot-knowledge-base-1")


async def main():
    # Add the pipeline to the collection
    await collection.add_pipeline(pipeline)

    # Get the document
    page = wiki_wiki.page("Baldur's_Gate_3")

    # Upsert the document. This will split the document and embed it
    await collection.upsert_documents([{"id": "Baldur's_Gate_3", "text": page.text}])

    # Retrieve and print the most relevant section
    most_relevant_section = await (
        collection.query()
        .vector_recall("What is the plot of Baldur's Gate 3", pipeline)
        .limit(1)
        .fetch_all()
    )
    print(most_relevant_section[0][1])


asyncio.run(main())
```
```
Plot
Setting
Baldur's Gate 3 takes place in the fictional world of the Forgotten Realms during the year of 1492 DR, over 120 years after the events of the previous game, Baldur's Gate II: Shadows of Amn, and months after the events of the playable Dungeons & Dragons 5e module, Baldur's Gate: Descent into Avernus. The story is set primarily in the Sword Coast in western Faerûn, encompassing a forested area that includes the Emerald Grove, a druid grove dedicated to the deity Silvanus; Moonrise Towers and the Shadow-Cursed Lands, which are covered by an unnatural and sentient darkness that can only be penetrated through magical means; and Baldur's Gate, the largest and most affluent city in the region, as well as its outlying suburb of Rivington. Other places the player will pass through include the Underdark, the Astral Plane and Avernus. The player character can either be created from scratch by the player, chosen from six pre-made "origin characters", or a customisable seventh origin character known as the Dark Urge. All six pre-made origin characters can be recruited as part of the player character's party. They include Lae'zel, a githyanki fighter; Shadowheart, a half-elf cleric; Astarion, a high elf vampire rogue; Gale, a human wizard; Wyll, a human warlock; and Karlach, a tiefling barbarian. Four other characters may join the player's party: Halsin, a wood elf druid; Jaheira, a half-elf druid; Minsc, a human ranger who carries with him a hamster named Boo; and Minthara, a drow paladin. Jaheira and Minsc previously appeared in both Baldur's Gate and Baldur's Gate II: Shadows of Amn.
```
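With the most relevant chunk in hand, step 5 is to place it in the system message as context. A sketch of what that prompt assembly might look like (the exact prompt wording here is our own invention, not a fixed API):

```python
# Sketch of step 5: inject the retrieved chunk into the system message
# so the LLM can ground its answer in it.
def build_context_messages(context: str, question: str) -> list[dict]:
    return [
        {
            "role": "system",
            "content": (
                "You are a helpful AI assistant named Hermes. "
                f"Use the following context to answer questions:\n{context}"
            ),
        },
        {"role": "user", "content": question},
    ]


messages = build_context_messages(
    "Baldur's Gate 3 takes place in the Forgotten Realms during 1492 DR.",
    "What is the plot of Baldur's Gate 3?",
)
```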
{% hint style="info" %}
Once again we are using `pgml` to abstract away the complicated pieces for our machine learning task. This isn't a guide on how to use our libraries, but for more information [check out our docs](https://postgresml.org/docs/api/client-sdk/getting-started).
{% endhint %}
