How to Create LLM WebSocket Chat with llama.cpp?
Learn how to set up a WebSocket server and create a basic chat with an open-source LLM.

Preparations
Before starting this tutorial, it might be useful to familiarize yourself with the other tutorials and documentation pages first:
- How to Serve LLM Completions (With llama.cpp)? - to set up the llama.cpp server and configure Resonance to use it
- WebSockets - to learn how WebSocket features are implemented in Resonance
In this tutorial, we will also use Stimulus for the front-end code.
Once you have both Resonance and llama.cpp running, we can continue.
Backend
Security
First, we must create a security gate and decide which users can use our chat. Let's say all authenticated users can do that:
app/SiteActionGate/StartWebSocketJsonRPCConnectionGate.php
```php
<?php

namespace App\SiteActionGate;

use App\Role;
use Distantmagic\Resonance\Attribute\DecidesSiteAction;
use Distantmagic\Resonance\Attribute\Singleton;
use Distantmagic\Resonance\AuthenticatedUser;
use Distantmagic\Resonance\SingletonCollection;
use Distantmagic\Resonance\SiteAction;
use Distantmagic\Resonance\SiteActionGate;

#[DecidesSiteAction(SiteAction::StartWebSocketJsonRPCConnection)]
#[Singleton(collection: SingletonCollection::SiteActionGate)]
final readonly class StartWebSocketJsonRPCConnectionGate extends SiteActionGate
{
    public function can(?AuthenticatedUser $authenticatedUser): bool
    {
        return true === $authenticatedUser?->user->getRole()->isAtLeast(Role::User);
    }
}
```
HTTP Responder
Next, we need to create a route in our application that will render the HTML chat.
We are using a few attributes:
- Can - this is an Authorization attribute. It checks if the user can connect with WebSockets. If they can't, it will return a 403 error page instead. That is optional, but without the Can attribute, all users will be able to visit the page, although they won't be able to establish a WebSocket connection anyway.
- RespondsToHttp - registers the route
- Singleton - adds your HttpResponder to the Dependency Injection container. We added the #[WantsFeature(Feature::WebSocket)] attribute to enable the WebSocket server.
app/HttpResponder/LlmChat.php
```php
<?php

namespace App\HttpResponder;

use App\HttpRouteSymbol;
use Distantmagic\Resonance\Attribute\Can;
use Distantmagic\Resonance\Attribute\RespondsToHttp;
use Distantmagic\Resonance\Attribute\Singleton;
use Distantmagic\Resonance\Attribute\WantsFeature;
use Distantmagic\Resonance\Feature;
use Distantmagic\Resonance\HttpInterceptableInterface;
use Distantmagic\Resonance\HttpResponder;
use Distantmagic\Resonance\RequestMethod;
use Distantmagic\Resonance\SingletonCollection;
use Distantmagic\Resonance\SiteAction;
use Distantmagic\Resonance\TwigTemplate;
use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;

#[Can(SiteAction::StartWebSocketJsonRPCConnection)]
#[RespondsToHttp(
    method: RequestMethod::GET,
    pattern: '/chat',
    routeSymbol: HttpRouteSymbol::LlmChat,
)]
#[Singleton(collection: SingletonCollection::HttpResponder)]
#[WantsFeature(Feature::WebSocket)]
final readonly class LlmChat extends HttpResponder
{
    public function respond(ServerRequestInterface $request, ResponseInterface $response): HttpInterceptableInterface
    {
        return new TwigTemplate('turbo/llmchat/index.twig');
    }
}
```
WebSocket Message Validation
Next, we need to register our WebSocket RPC methods. The default WebSocket implementation follows a simple RPC protocol. All messages need to be a tuple:
[method, payload, null|request id]
Method, the first field, is validated against JsonRPCMethodInterface, through JsonRPCMethodValidatorInterface (you need to provide an implementation of both):
app/JsonRPCMethod.php
```php
<?php

namespace App;

use Distantmagic\Resonance\EnumValuesTrait;
use Distantmagic\Resonance\JsonRPCMethodInterface;

enum JsonRPCMethod: string implements JsonRPCMethodInterface
{
    use EnumValuesTrait;

    case LlmChatPrompt = 'llm_chat_prompt';
    case LlmToken = 'llm_token';

    public function getValue(): string
    {
        return $this->value;
    }
}
```
app/JsonRPCMethodValidator.php
```php
<?php

namespace App;

use Distantmagic\Resonance\Attribute\Singleton;
use Distantmagic\Resonance\Attribute\WantsFeature;
use Distantmagic\Resonance\Feature;
use Distantmagic\Resonance\JsonRPCMethodInterface;
use Distantmagic\Resonance\JsonRPCMethodValidatorInterface;

#[Singleton(provides: JsonRPCMethodValidatorInterface::class)]
#[WantsFeature(Feature::WebSocket)]
readonly class JsonRPCMethodValidator implements JsonRPCMethodValidatorInterface
{
    public function cases(): array
    {
        return JsonRPCMethod::cases();
    }

    public function castToRPCMethod(string $methodName): JsonRPCMethodInterface
    {
        return JsonRPCMethod::from($methodName);
    }

    public function values(): array
    {
        return JsonRPCMethod::values();
    }
}
```
The above means your application will support llm_chat_prompt and llm_token messages.
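On the client side, the same method names can be mirrored so that a typo is caught before a message is sent. The following is a minimal TypeScript sketch and not part of Resonance: RPC_METHODS, RpcMethod, and isRpcMethod are hypothetical names chosen for illustration, and the two string values are copied from the enum cases above.

```typescript
// Hypothetical client-side mirror of the PHP JsonRPCMethod enum values.
const RPC_METHODS = ["llm_chat_prompt", "llm_token"] as const;

// Union type of the known method names.
type RpcMethod = (typeof RPC_METHODS)[number];

// Type guard: narrows an arbitrary string to a known method name.
function isRpcMethod(name: string): name is RpcMethod {
  return (RPC_METHODS as readonly string[]).includes(name);
}
```

A caller could check isRpcMethod(name) before sending and refuse to send anything the server-side validator would reject.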
WebSocket Message Responder
Now, we need to register a WebSocket responder that will run every time someone sends an llm_chat_prompt message to the server.
It will then use LlamaCppClient to fetch tokens from the llama.cpp server and forward them to the WebSocket connection:
app/WebSocketJsonRPCResponder/LlmChatPromptResponder.php
```php
<?php

namespace App\WebSocketJsonRPCResponder;

use App\JsonRPCMethod;
use Distantmagic\Resonance\Attribute\RespondsToWebSocketJsonRPC;
use Distantmagic\Resonance\Attribute\Singleton;
use Distantmagic\Resonance\Attribute\WantsFeature;
use Distantmagic\Resonance\Constraint;
use Distantmagic\Resonance\Constraint\ObjectConstraint;
use Distantmagic\Resonance\Constraint\StringConstraint;
use Distantmagic\Resonance\Feature;
use Distantmagic\Resonance\JsonRPCNotification;
use Distantmagic\Resonance\JsonRPCRequest;
use Distantmagic\Resonance\JsonRPCResponse;
use Distantmagic\Resonance\LlamaCppClientInterface;
use Distantmagic\Resonance\LlamaCppCompletionRequest;
use Distantmagic\Resonance\SingletonCollection;
use Distantmagic\Resonance\WebSocketAuthResolution;
use Distantmagic\Resonance\WebSocketConnection;
use Distantmagic\Resonance\WebSocketJsonRPCResponder;

#[RespondsToWebSocketJsonRPC(JsonRPCMethod::LlmChatPrompt)]
#[Singleton(collection: SingletonCollection::WebSocketJsonRPCResponder)]
#[WantsFeature(Feature::WebSocket)]
final readonly class LlmChatPromptResponder extends WebSocketJsonRPCResponder
{
    public function __construct(
        private LlamaCppClientInterface $llamaCppClient,
    ) {}

    public function getConstraint(): Constraint
    {
        return new ObjectConstraint(
            properties: [
                'prompt' => new StringConstraint(),
            ],
        );
    }

    public function onNotification(
        WebSocketAuthResolution $webSocketAuthResolution,
        WebSocketConnection $webSocketConnection,
        JsonRPCNotification $rpcNotification,
    ): void {
        $request = new LlamaCppCompletionRequest($rpcNotification->payload->prompt);

        $completion = $this->llamaCppClient->generateCompletion($request);

        foreach ($completion as $token) {
            if ($webSocketConnection->status->isOpen()) {
                $webSocketConnection->push(new JsonRPCNotification(
                    JsonRPCMethod::LlmToken,
                    $token->content,
                ));
            } else {
                $completion->stop();
            }
        }
    }
}
```
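The forwarding loop in the responder follows a general pattern: consume tokens one at a time, push each one while the client is still connected, and stop generating as soon as it is not. The TypeScript sketch below models only that control flow under stated assumptions. FakeConnection, fakeCompletion, and forwardTokens are hypothetical stand-ins, not Resonance or llama.cpp APIs, and a generator plays the role of the completion stream.

```typescript
// Hypothetical stand-in for a WebSocket connection; not a Resonance API.
interface FakeConnection {
  isOpen: boolean;
  sent: string[];
}

// Hypothetical stand-in for the completion stream: yields tokens one by one.
function* fakeCompletion(tokens: string[]): Generator<string, void, void> {
  for (const token of tokens) {
    yield token;
  }
}

// Forwards tokens while the connection is open and stops consuming the
// stream once it is closed. Returns the number of tokens forwarded.
function forwardTokens(
  completion: Iterable<string>,
  connection: FakeConnection,
): number {
  let forwarded = 0;

  for (const token of completion) {
    if (!connection.isOpen) {
      // Leaving the loop also finishes the generator, which is the
      // analogue of stopping the completion in the PHP responder.
      break;
    }

    connection.sent.push(token);
    forwarded += 1;
  }

  return forwarded;
}
```

Stopping early matters because token generation is the expensive part: once the client has disconnected, every further token is wasted work on the llama.cpp server.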
Frontend
The Stimulus controller starts the WebSocket connection and sends user prompts to the server. Note that the WebSocket connection uses both CSRF Protection and the jsonrpc protocol (see Protocols):
resources/ts/controller_llmchat.ts
```typescript
import { Controller } from "@hotwired/stimulus";

import { stimulus } from "../stimulus";

@stimulus("llmchat")
export class controller_llmchat extends Controller<HTMLElement> {
  public static targets = ["userInputField", "userInputForm", "chatLog"];
  public static values = {
    csrfToken: String,
  };

  private declare readonly chatLogTarget: HTMLElement;
  private declare readonly csrfTokenValue: string;
  private declare readonly userInputFieldTarget: HTMLInputElement;

  private webSocket: null|WebSocket = null;

  public connect(): void {
    const serverUrl = new URL(__WEBSOCKET_URL);

    // The CSRF token is passed as a query parameter.
    serverUrl.searchParams.append("csrf", this.csrfTokenValue);

    const webSocket = new WebSocket(serverUrl, ["jsonrpc"]);

    webSocket.addEventListener("close", () => {
      this.webSocket = null;
    });

    webSocket.addEventListener("open", () => {
      this.webSocket = webSocket;
    });

    webSocket.addEventListener("message", (evt: MessageEvent) => {
      if ("string" !== typeof evt.data) {
        return;
      }

      const parsed: unknown = JSON.parse(evt.data);

      // Narrow the unknown value before reading `result`.
      if (
        "object" === typeof parsed
        && null !== parsed
        && "result" in parsed
        && "string" === typeof parsed.result
      ) {
        this.chatLogTarget.append(parsed.result);
      }
    });
  }

  public disconnect(): void {
    this.webSocket?.close();
  }

  public onFormSubmit(evt: Event): void {
    evt.preventDefault();

    this.chatLogTarget.innerHTML = '';

    this.webSocket?.send(JSON.stringify({
      jsonrpc: "2.0",
      method: "llm_chat_prompt",
      params: {
        prompt: this.userInputFieldTarget.value,
      }
    }));

    this.userInputFieldTarget.value = '';
  }
}
```
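Building and parsing messages can also be pulled out of the controller into small pure functions, which makes them testable without a browser or a live socket. This is a minimal sketch, assuming the message shapes used by the controller above (an outgoing object with jsonrpc, method, and params, and an incoming object carrying a string result field). PromptMessage, buildPromptMessage, and extractResult are hypothetical names, not Resonance APIs.

```typescript
// Shape of the outgoing prompt message, as sent by the controller above.
interface PromptMessage {
  jsonrpc: "2.0";
  method: "llm_chat_prompt";
  params: { prompt: string };
}

// Serializes a user prompt into the frame sent over the WebSocket.
function buildPromptMessage(prompt: string): string {
  const message: PromptMessage = {
    jsonrpc: "2.0",
    method: "llm_chat_prompt",
    params: { prompt },
  };

  return JSON.stringify(message);
}

// Returns the `result` string from an incoming frame, or null when the
// frame is not valid JSON or does not carry a string `result`.
function extractResult(raw: string): string | null {
  let parsed: unknown;

  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }

  if (typeof parsed !== "object" || parsed === null) {
    return null;
  }

  const result = (parsed as Record<string, unknown>).result;

  return typeof result === "string" ? result : null;
}
```

Returning null for malformed frames, instead of throwing, keeps a single bad message from breaking the chat log rendering.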
We will use Asset Bundling (esbuild) to bundle the front-end TypeScript code.
```shell
$ ./node_modules/.bin/esbuild \
    --bundle \
    --define:global=globalThis \
    --entry-names="./[name]_$(BUILD_ID)" \
    --format=esm \
    --log-limit=0 \
    --metafile=esbuild-meta-app.json \
    --minify \
    --outdir=./$(BUILD_TARGET_DIRECTORY) \
    --platform=browser \
    --sourcemap \
    --target=es2022,safari16 \
    --tree-shaking=true \
    --tsconfig=tsconfig.json \
    resources/ts/controller_llmchat.ts \
;
```
Finally, the HTML:
app/views/llmchat.twig
```twig
<script defer type="module" src="{{ esbuild(request, 'controller_llmchat.ts') }}"></script>
<div
    data-controller="llmchat"
    data-llmchat-csrf-token-value="{{ csrf_token(request, 'llm_chat') }}"
>
    <div data-llmchat-target="chatLog"></div>
    <form data-action="submit->llmchat#onFormSubmit">
        <input
            autofocus
            data-llmchat-target="userInputField"
            type="text"
        >
        <button type="submit">Send</button>
    </form>
</div>
```
Summary
Now, you can serve LLM chat in your application. Enjoy!