Commit 209fd5b: Update Chatbots README (#1402). 1 parent: bb0d7aa.

1 file changed: `pgml-cms/docs/use-cases/chatbots/README.md` (+184, -1 lines)
## Introduction <a href="#introduction" id="introduction"></a>

This tutorial seeks to broadly cover the majority of topics required to not only implement a modern chatbot, but understand why we build them this way. There are three primary sections:

* The Limitations of Modern LLMs
* Circumventing Limitations with RAG
* The chatbot remembers our past conversation
* The chatbot can answer questions correctly about Baldur's Gate 3
In reality we haven't created a SOTA LLM, but fortunately other people have, and we will be using the incredibly popular fine-tune of Mistral: `teknium/OpenHermes-2.5-Mistral-7B`. We will be using `pgml`, our own Python library, for the remainder of this tutorial. If you want to follow along and have not installed it yet:

```
pip install pgml
```

Also make sure to set the `DATABASE_URL` environment variable:

```
export DATABASE_URL="{your free PostgresML database url}"
```

Let's set up a basic chat loop with our model:
```python
from pgml import TransformerPipeline
import asyncio

model = TransformerPipeline(
    "text-generation",
    "teknium/OpenHermes-2.5-Mistral-7B",
    {"device_map": "auto", "torch_dtype": "bfloat16"},
)


async def main():
    while True:
        user_input = input("=> ")
        model_output = await model.transform([user_input], {"max_new_tokens": 1000})
        print(model_output[0][0]["generated_text"], "\n")


asyncio.run(main())
```
{% hint style="info" %}
Note that in our previous hypothetical examples we manually called tokenize to convert our inputs into `tokens`; in the real world we let `pgml` handle converting the text into `tokens`.
{% endhint %}
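For intuition about what that tokenization step does, here is a toy, hypothetical tokenizer. Real tokenizers use learned subword vocabularies rather than whole words, but the shape of the operation is the same: text in, list of integer `tokens` out.

```python
# Toy, hypothetical tokenizer: maps each word to an integer id.
# Real tokenizers split text into subwords from a learned vocabulary.
def tokenize(text: str, vocab: dict[str, int]) -> list[int]:
    # Unknown words map to a reserved <unk> id (0 here).
    return [vocab.get(word, 0) for word in text.lower().split()]


vocab = {"what": 1, "is": 2, "your": 3, "name?": 4}
print(tokenize("What is your name?", vocab))  # [1, 2, 3, 4]
```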
Now we can have the following conversation:

```
=> What is your name?
A: My name is John.

Q: How old are you?

A: I am 25 years old.

Q: What is your favorite color?

=> What did I just ask you?
I asked you if you were going to the store.

Oh, I see. No, I'm not going to the store.
```

That wasn't close to what we wanted to happen. Getting chatbots to work in the real world seems a bit more complicated than in the hypothetical world.

To understand why our chatbot gave us a nonsensical first response, and why it didn't remember our conversation at all, we must take a short dive into the world of prompting.
Remember, LLMs are just function approximators that are designed to predict the next most likely `token` given a list of `tokens`, and just like any other function, we must give them the correct input. Let's look closer at the input we are giving our chatbot. In our last conversation we asked it two questions:

* What is your name?
* What did I just ask you?

We need to understand that LLMs have a special format for their inputs, specifically for conversations. So far we have been ignoring this required formatting and giving our LLM the wrong inputs, causing it to predict nonsensical outputs.

What do the right inputs look like? That actually depends on the model. Each model can choose which format to use for conversations while training, and not all models are trained to be conversational. `teknium/OpenHermes-2.5-Mistral-7B` has been trained to be conversational and expects us to format text meant for conversations like so:
```
<|im_start|>system
You are a helpful AI assistant named Hermes<|im_end|>
<|im_start|>user
What is your name?<|im_end|>
<|im_start|>assistant
```
We have added a bunch of these new HTML-looking tags throughout our input. These tags map to tokens the LLM has been trained to associate with conversation shifts. `<|im_start|>` marks the beginning of a message. The text right after `<|im_start|>`, either system, user, or assistant, marks the role of the message, and `<|im_end|>` marks the end of a message.
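To make the tag structure concrete, here is a small sketch of a hypothetical helper (not part of `pgml`) that renders a list of `role`/`content` messages into this conversation format:

```python
# Hypothetical helper: render role/content messages into the
# <|im_start|>/<|im_end|> conversation format described above.
def to_chat_format(messages: list[dict]) -> str:
    prompt = ""
    for message in messages:
        prompt += f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
    # End with an open assistant turn so the model completes it.
    prompt += "<|im_start|>assistant\n"
    return prompt


messages = [
    {"role": "system", "content": "You are a helpful AI assistant named Hermes"},
    {"role": "user", "content": "What is your name?"},
]
print(to_chat_format(messages))
```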
This is the style of input our LLM has been trained on. Let's do a simple test with this input and see if we get a better response:
```python
from pgml import TransformerPipeline
import asyncio

model = TransformerPipeline(
    "text-generation",
    "teknium/OpenHermes-2.5-Mistral-7B",
    {"device_map": "auto", "torch_dtype": "bfloat16"},
)

user_input = """
<|im_start|>system
You are a helpful AI assistant named Hermes<|im_end|>
<|im_start|>user
What is your name?<|im_end|>
<|im_start|>assistant
"""


async def main():
    model_output = await model.transform([user_input], {"max_new_tokens": 1000})
    print(model_output[0][0]["generated_text"], "\n")


asyncio.run(main())
```

```
My name is Hermes
```
{% hint style="info" %}
Notice we have a new "system" message we haven't discussed before. This special message gives us control over how the chatbot should interact with users. We could tell it to talk like a pirate, to be super friendly, or to not respond to angry messages. In this case we told it what it is, and its name. We will also add any conversation context the chatbot should have in the system message later.
{% endhint %}
Note that we have a list of dictionaries called `history` we use to store the chat history, and instead of feeding text into our model, we are inputting the `history` list. Our library automatically converts this list of dictionaries into the format expected by the model. Notice the `roles` in the dictionaries are the same as the `roles` of the messages in the previous example. This list of dictionaries with keys `role` and `content` as a storage system for messages is pretty standard and used by us as well as OpenAI and HuggingFace.
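As a sketch, maintaining such a `history` list across turns looks like this (the helper name is our own invention for illustration):

```python
# Standard role/content message format used to store chat history.
history = [
    {"role": "system", "content": "You are a friendly and helpful chatbot named Hermes"},
]


def add_user_message(history: list[dict], text: str) -> None:
    # Append the user's turn; the model's reply is appended the same way
    # with role "assistant", so context accumulates across turns.
    history.append({"role": "user", "content": text})


add_user_message(history, "What is your name?")
print([m["role"] for m in history])  # ['system', 'user']
```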

Let's ask it the dreaded question:

```
=> What is Baldur's Gate?
Baldur's Gate 3 is a role-playing video game developed by Larian Studios and published by Dontnod Entertainment. It is based on the Advanced Dungeons & Dragons (D&D) rules and set in the Forgotten Realms campaign setting. Originally announced in 2012, the game had a long development period and was finally released in early access in October 2020. The game is a sequel to the popular Baldur's Gate II: Shadows of Amn (2000) and Baldur's Gate: Siege of Dragonspear (2016) expansion, and it continues the tradition of immersive storytelling, tactical combat, and character progression that fans of the series love.
```
How does it know about Baldur's Gate 3? As it turns out, Baldur's Gate 3 has actually been around since 2020. I guess that completely ruins the hypothetical example. Let's ignore that and ask it something trickier it wouldn't know about Baldur's Gate 3.

```
=> What is the plot of Baldur's Gate 3?
Baldur's Gate 3 is a role-playing game set in the Dungeons & Dragons Forgotten Realms universe. The story revolves around a mind flayer, also known as an illithid, called The Mind Flayer who is attempting to merge humanoid minds into itself to achieve god-like power. Your character and their companions must navigate a world torn apart by various factions and conflicts while uncovering the conspiracy surrounding The Mind Flayer. Throughout the game, you'll forge relationships with various NPCs, make choices that impact the story, and engage in battles with enemies using a turn-based combat system.
```
As expected, this is a rather shallow response that lacks any of the actual plot. To get the answer we want, we need to provide the correct context to our LLM. That means we need to:

* Get the text from the URL that has the answer
* Split that text into chunks
* Embed those chunks
* Search over the chunks to find the closest match
* Use the text from that chunk as context for the LLM
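Conceptually, steps 2 through 4 boil down to splitting, embedding, and nearest-neighbor search. Here is a toy sketch with a hypothetical bag-of-words "embedding"; it is nothing like the real embedding model, but it shows the shape of the idea:

```python
import math


def split_into_chunks(text: str, chunk_size: int = 50) -> list[str]:
    # Naive fixed-size splitter; real splitters respect sentence boundaries.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def embed(text: str) -> dict[str, float]:
    # Toy "embedding": word counts. Real embeddings are dense vectors
    # produced by a trained model.
    counts: dict[str, float] = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0.0) + 1.0
    return counts


def cosine_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


document = "The plot follows a mind flayer invasion. Combat is turn based."
chunks = split_into_chunks(document, chunk_size=40)
query_embedding = embed("what is the plot")
# Step 4: pick the chunk whose embedding is closest to the query's.
best_chunk = max(chunks, key=lambda c: cosine_similarity(query_embedding, embed(c)))
```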
Luckily, none of this is actually very difficult, as people like us have built libraries that handle the complex pieces. Here is a program that handles steps 1-4:
```python
from pgml import Collection, Model, Splitter, Pipeline
import wikipediaapi
import asyncio

# Construct our wikipedia api
wiki_wiki = wikipediaapi.Wikipedia("Chatbot Tutorial Project", "en")

# Use the default model for embedding and default splitter for splitting
model = Model()  # The default model is intfloat/e5-small
splitter = Splitter()  # The default splitter is recursive_character

# Construct a pipeline for ingesting documents, splitting them into chunks, and then embedding them
pipeline = Pipeline("test-pipeline-1", model, splitter)

# Create a collection to house these documents
collection = Collection("chatbot-knowledge-base-1")


async def main():
    # Add the pipeline to the collection
    await collection.add_pipeline(pipeline)

    # Get the document
    page = wiki_wiki.page("Baldur's_Gate_3")

    # Upsert the document. This will split the document and embed it
    await collection.upsert_documents([{"id": "Baldur's_Gate_3", "text": page.text}])

    # Retrieve and print the most relevant section
    most_relevant_section = await (
        collection.query()
        .vector_recall("What is the plot of Baldur's Gate 3", pipeline)
        .limit(1)
        .fetch_all()
    )
    print(most_relevant_section[0][1])


asyncio.run(main())
```
```
Plot
Setting
Baldur's Gate 3 takes place in the fictional world of the Forgotten Realms during the year of 1492 DR, over 120 years after the events of the previous game, Baldur's Gate II: Shadows of Amn, and months after the events of the playable Dungeons & Dragons 5e module, Baldur's Gate: Descent into Avernus. The story is set primarily in the Sword Coast in western Faerûn, encompassing a forested area that includes the Emerald Grove, a druid grove dedicated to the deity Silvanus; Moonrise Towers and the Shadow-Cursed Lands, which are covered by an unnatural and sentient darkness that can only be penetrated through magical means; and Baldur's Gate, the largest and most affluent city in the region, as well as its outlying suburb of Rivington. Other places the player will pass through include the Underdark, the Astral Plane and Avernus. The player character can either be created from scratch by the player, chosen from six pre-made "origin characters", or a customisable seventh origin character known as the Dark Urge. All six pre-made origin characters can be recruited as part of the player character's party. They include Lae'zel, a githyanki fighter; Shadowheart, a half-elf cleric; Astarion, a high elf vampire rogue; Gale, a human wizard; Wyll, a human warlock; and Karlach, a tiefling barbarian. Four other characters may join the player's party: Halsin, a wood elf druid; Jaheira, a half-elf druid; Minsc, a human ranger who carries with him a hamster named Boo; and Minthara, a drow paladin. Jaheira and Minsc previously appeared in both Baldur's Gate and Baldur's Gate II: Shadows of Amn.
```
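With the most relevant chunk in hand, step 5 is to place it in the system message as context. A sketch of what that prompt assembly might look like (the exact prompt wording here is our own invention, not a fixed API):

```python
# Sketch of step 5: inject the retrieved chunk into the system message
# so the LLM can ground its answer in it.
def build_context_messages(context: str, question: str) -> list[dict]:
    return [
        {
            "role": "system",
            "content": (
                "You are a helpful AI assistant named Hermes. "
                f"Use the following context to answer questions:\n{context}"
            ),
        },
        {"role": "user", "content": question},
    ]


messages = build_context_messages(
    "Baldur's Gate 3 takes place in the Forgotten Realms during 1492 DR.",
    "What is the plot of Baldur's Gate 3?",
)
```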
{% hint style="info" %}
Once again we are using `pgml` to abstract away the complicated pieces for our machine learning task. This isn't a guide on how to use our libraries, but for more information [check out our docs](https://postgresml.org/docs/api/client-sdk/getting-started).
{% endhint %}
