Building a WebSocket Chat service for Cloud Run tutorial

This tutorial shows how to create a multi-room, realtime chat service using WebSockets with a persistent connection for bidirectional communication. With WebSockets, both client and server can push messages to each other without polling the server for updates.

Although you can configure Cloud Run to use session affinity, this provides a best effort affinity, which means that any new request can still be potentially routed to a different instance. As a result, user messages in the chat service need to be synchronized across all instances, not just between the clients connected to one instance.

Design for a realtime chat service

This sample chat service uses a Memorystore for Redis instance to store and synchronize user messages across all instances. Redis uses a Pub/Sub mechanism, not to be confused with the product Cloud Pub/Sub, to push data to subscribed clients connected to any instance, to eliminate HTTP polling for updates.
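
The fan-out pattern described above can be illustrated without Redis. The following is a minimal simulation, assuming an in-process `EventEmitter` stands in for the Redis Pub/Sub channel. The `ChatInstance` class, its methods, and the `chat` channel name are hypothetical and are not part of the sample.

```javascript
const {EventEmitter} = require('node:events');

// Stand-in for the Redis Pub/Sub channel shared by all instances (assumption:
// in the real service this role is played by Memorystore for Redis).
const broker = new EventEmitter();

// Hypothetical model of one Cloud Run instance and its connected clients.
class ChatInstance {
  constructor(name) {
    this.name = name;
    this.delivered = []; // messages pushed to this instance's local clients
    // Every instance subscribes to the shared channel.
    broker.on('chat', msg => this.delivered.push(msg));
  }

  // A client connected to this instance sends a message: publish it to the
  // shared channel instead of delivering it only locally.
  send(user, text) {
    broker.emit('chat', {user, text, via: this.name});
  }
}

const instanceA = new ChatInstance('instance-a');
const instanceB = new ChatInstance('instance-b');

instanceA.send('ada', 'hello');

// Both instances received the message, although only one accepted it.
console.log(instanceA.delivered.length, instanceB.delivered.length); // 1 1
```

Because every instance subscribes to the same channel, a message accepted by one instance reaches clients connected to any other instance without those clients polling for it.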

However, even with push updates, any instance that is spun up will only receive new messages pushed to the container. To load prior messages, message history would need to be stored and retrieved from a persistent storage solution. This sample uses Redis's conventional functionality of an object store to cache and retrieve message history.
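
As a rough illustration of the history cache, the following sketch keeps a bounded list of messages per room. It assumes an in-memory `Map` in place of Redis, a hypothetical `MAX_HISTORY` cap, and synchronous helpers. The function names match the helpers that the server code calls, but these bodies are illustrative guesses, not the sample's implementation.

```javascript
// Stand-in for Redis (assumption: the sample stores history in Memorystore).
const store = new Map();
const MAX_HISTORY = 3; // hypothetical cap, kept small for the demo

// Append a message to the room's history, keeping only the newest entries.
function addMessageToCache(room, msg) {
  const history = store.get(room) || [];
  history.push(msg);
  store.set(room, history.slice(-MAX_HISTORY));
}

// Return the room's cached history, or null when nothing has been cached.
function getRoomFromCache(room) {
  return store.has(room) ? store.get(room) : null;
}

for (const text of ['one', 'two', 'three', 'four']) {
  addMessageToCache('lobby', {user: 'ada', text});
}

console.log(getRoomFromCache('lobby').map(m => m.text)); // [ 'two', 'three', 'four' ]
console.log(getRoomFromCache('empty-room')); // null
```

A newly started instance could call a helper like `getRoomFromCache` when a user signs in, so that the user sees messages sent before that instance existed.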

The Redis instance is protected from the internet using private IPs with access controlled and limited to services running on the same Virtual Private Network as the Redis instance. We recommend that you use Direct VPC egress.

Limitations

  • This tutorial does not show end user authentication or session caching. To learn more about end user authentication, refer to the Cloud Run tutorial for end user authentication.

  • This tutorial does not implement a database such as Firestore for indefinite storage and retrieval of chat message history.

  • Additional elements are needed for this sample service to be production ready. A Standard Tier Redis instance is recommended to provide High Availability using replication and automatic failover.

Objectives

  • Write, build, and deploy a Cloud Run service that uses WebSockets.

  • Connect to a Memorystore for Redis instance to publish and subscribe to new messages across instances.

  • Connect the Cloud Run service with Memorystore using Direct VPC egress.

Costs

In this document, you use the following billable components of Google Cloud:

To generate a cost estimate based on your projected usage, use the pricing calculator.

New Google Cloud users might be eligible for a free trial.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
    Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the Cloud Run, Memorystore for Redis, Artifact Registry, and Cloud Build APIs.

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

  5. Install and initialize the gcloud CLI.

Required roles

To get the permissions that you need to complete the tutorial, ask your administrator to grant you the following IAM roles on your project:

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Note: IAM basic roles might also contain permissions to complete the tutorial. You shouldn't grant basic roles in a production environment, but you can grant them in a development or test environment.

Set up gcloud defaults

To configure gcloud with defaults for your Cloud Run service:

  1. Set your default project:

    gcloud config set project PROJECT_ID

    Replace PROJECT_ID with the name of the project you created for this tutorial.

  2. Configure gcloud for your chosen region:

    gcloud config set run/region REGION

    Replace REGION with the supported Cloud Run region of your choice.

Cloud Run locations

Cloud Run is regional, which means the infrastructure that runs your Cloud Run services is located in a specific region and is managed by Google to be redundantly available across all the zones within that region.

Your latency, availability, and durability requirements are the primary factors for selecting the region where your Cloud Run services run. You can generally select the region nearest to your users, but you should also consider the location of the other Google Cloud products that your Cloud Run service uses. Using Google Cloud products together across multiple locations can affect your service's latency as well as cost.

Cloud Run is available in the following regions:

Subject to Tier 1 pricing

  • asia-east1 (Taiwan)
  • asia-northeast1 (Tokyo)
  • asia-northeast2 (Osaka)
  • asia-south1 (Mumbai, India)
  • europe-north1 (Finland), Low CO2
  • europe-north2 (Stockholm), Low CO2
  • europe-southwest1 (Madrid), Low CO2
  • europe-west1 (Belgium), Low CO2
  • europe-west4 (Netherlands), Low CO2
  • europe-west8 (Milan)
  • europe-west9 (Paris), Low CO2
  • me-west1 (Tel Aviv)
  • northamerica-south1 (Mexico)
  • us-central1 (Iowa), Low CO2
  • us-east1 (South Carolina)
  • us-east4 (Northern Virginia)
  • us-east5 (Columbus)
  • us-south1 (Dallas), Low CO2
  • us-west1 (Oregon), Low CO2

Subject to Tier 2 pricing

  • africa-south1 (Johannesburg)
  • asia-east2 (Hong Kong)
  • asia-northeast3 (Seoul, South Korea)
  • asia-southeast1 (Singapore)
  • asia-southeast2 (Jakarta)
  • asia-south2 (Delhi, India)
  • australia-southeast1 (Sydney)
  • australia-southeast2 (Melbourne)
  • europe-central2 (Warsaw, Poland)
  • europe-west10 (Berlin)
  • europe-west12 (Turin)
  • europe-west2 (London, UK), Low CO2
  • europe-west3 (Frankfurt, Germany)
  • europe-west6 (Zurich, Switzerland), Low CO2
  • me-central1 (Doha)
  • me-central2 (Dammam)
  • northamerica-northeast1 (Montreal), Low CO2
  • northamerica-northeast2 (Toronto), Low CO2
  • southamerica-east1 (Sao Paulo, Brazil), Low CO2
  • southamerica-west1 (Santiago, Chile), Low CO2
  • us-west2 (Los Angeles)
  • us-west3 (Salt Lake City)
  • us-west4 (Las Vegas)
If you already created a Cloud Run service, you can view the region in the Cloud Run dashboard in the Google Cloud console.

Retrieve the code sample

To retrieve the code sample for use:

  1. Clone the sample repository to your local machine:

    Node.js

    git clone https://github.com/GoogleCloudPlatform/nodejs-docs-samples.git

    Alternatively, you can download the sample as a zip file and extract it.

  2. Change to the directory that contains the Cloud Run sample code:

    Node.js

    cd nodejs-docs-samples/run/websockets/

Understand the WebSockets code

Socket.io is a library that enables real-time, bidirectional communication between the browser and server. Although Socket.io is not a WebSocket implementation, it does wrap the functionality to provide a simpler API for multiple communication protocols with additional features such as improved reliability, automatic reconnection, and broadcasting to all or a subset of clients.

Client-side integration

    <script src="/socket.io/socket.io.js"></script>

The client instantiates a new Socket instance for every connection. Because this sample is server-side rendered, the server URL does not need to be defined. The socket instance can emit and listen to events.

    // Initialize Socket.io
    const socket = io('', {
      transports: ['websocket'],
    });

    // Emit "sendMessage" event with message
    socket.emit('sendMessage', msg, error => {
      if (error) {
        console.error(error);
      } else {
        // Clear message
        $('#msg').val('');
      }
    });

    // Listen for new messages
    socket.on('message', msg => {
      log(msg.user, msg.text);
    });

    // Listen for notifications
    socket.on('notification', msg => {
      log(msg.title, msg.description);
    });

    // Listen connect event
    socket.on('connect', () => {
      console.log('connected');
    });

Note: For a multiple instances architecture without session affinity, Socket.io must only use the WebSocket transport. If Socket.io falls back to long polling, it will send multiple HTTP requests during the lifetime of the session. The first request establishes a "handshake" with important information such as the session ID. After the first request, requests might be routed to another instance that will generate an error due to the session ID being unknown to the server.
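
The failure mode in the note can be reproduced with a small simulation. This sketch assumes each instance keeps its own in-memory set of session IDs. The `PollingInstance` class and its `handshake` and `poll` methods are hypothetical and only model the routing problem; they do not reflect Socket.io internals.

```javascript
// Hypothetical model of one service instance that only knows the long-polling
// sessions it created itself.
class PollingInstance {
  constructor(name) {
    this.name = name;
    this.sessions = new Set();
  }

  // First HTTP request: the handshake issues a session ID stored locally.
  handshake() {
    const sid = `${this.name}-session-${this.sessions.size + 1}`;
    this.sessions.add(sid);
    return sid;
  }

  // Later HTTP requests must present a session ID this instance recognizes.
  poll(sid) {
    if (!this.sessions.has(sid)) {
      throw new Error('Session ID unknown');
    }
    return 'ok';
  }
}

const a = new PollingInstance('a');
const b = new PollingInstance('b');
const sid = a.handshake();

// Same instance as the handshake: the request succeeds.
console.log(a.poll(sid)); // ok

// Routed to a different instance: the session ID is not recognized.
let routedElsewhere;
try {
  b.poll(sid);
} catch (err) {
  routedElsewhere = err.message;
}
console.log(routedElsewhere); // Session ID unknown
```

A single WebSocket connection avoids this problem because the whole session stays on the one connection that was established with one instance.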

Server-side integration

On the server side, the Socket.io server is initialized and attached to the HTTP server. Similar to the client side, once the Socket.io server makes a connection to the client, a socket instance is created for every connection which can be used to emit and listen to messages. Socket.io also provides an interface for creating "rooms" or an arbitrary channel that sockets can join and leave.

    // Initialize Socket.io
    const server = require('http').Server(app);
    const io = require('socket.io')(server);

    const {createAdapter} = require('@socket.io/redis-adapter');
    // Replace in-memory adapter with Redis
    const subClient = redisClient.duplicate();
    io.adapter(createAdapter(redisClient, subClient));
    // Add error handlers
    redisClient.on('error', err => {
      console.error(err.message);
    });

    subClient.on('error', err => {
      console.error(err.message);
    });

    // Listen for new connection
    io.on('connection', socket => {
      // Add listener for "signin" event
      socket.on('signin', async ({user, room}, callback) => {
        try {
          // Record socket ID to user's name and chat room
          addUser(socket.id, user, room);
          // Call join to subscribe the socket to a given channel
          socket.join(room);
          // Emit notification event
          socket.in(room).emit('notification', {
            title: "Someone's here",
            description: `${user} just entered the room`,
          });
          // Retrieve room's message history or return null
          const messages = await getRoomFromCache(room);
          // Use the callback to respond with the room's message history
          // Callbacks are more commonly used for event listeners than promises
          callback(null, messages);
        } catch (err) {
          callback(err, null);
        }
      });

      // Add listener for "updateSocketId" event
      socket.on('updateSocketId', async ({user, room}) => {
        try {
          addUser(socket.id, user, room);
          socket.join(room);
        } catch (err) {
          console.error(err);
        }
      });

      // Add listener for "sendMessage" event
      socket.on('sendMessage', (message, callback) => {
        // Retrieve user's name and chat room from socket ID
        const {user, room} = getUser(socket.id);
        if (room) {
          const msg = {user, text: message};
          // Push message to clients in chat room
          io.in(room).emit('message', msg);
          addMessageToCache(room, msg);
          callback();
        } else {
          callback('User session not found.');
        }
      });

      // Add listener for disconnection
      socket.on('disconnect', () => {
        // Remove socket ID from list
        const {user, room} = deleteUser(socket.id);
        if (user) {
          io.in(room).emit('notification', {
            title: 'Someone just left',
            description: `${user} just left the room`,
          });
        }
      });
    });
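
The server code calls `addUser`, `getUser`, and `deleteUser` without showing them. The following is one plausible sketch of such helpers, assuming a per-instance in-memory `Map` keyed by socket ID. The bodies are illustrative and may differ from the sample's own implementation. Returning an empty object for an unknown socket ID keeps the destructuring in the `sendMessage` and `disconnect` handlers from throwing.

```javascript
// Per-instance registry: socket ID -> {user, room} (assumed data layout).
const users = new Map();

// Record which user and room a socket belongs to.
function addUser(id, user, room) {
  users.set(id, {user, room});
}

// Look up a socket; return an empty object when the ID is unknown so that
// `const {user, room} = getUser(id)` yields undefined values instead of throwing.
function getUser(id) {
  return users.get(id) || {};
}

// Remove a socket and return what was recorded for it (or an empty object).
function deleteUser(id) {
  const entry = getUser(id);
  users.delete(id);
  return entry;
}

addUser('socket-1', 'ada', 'lobby');
console.log(getUser('socket-1').room); // lobby
console.log(getUser('missing').room); // undefined
console.log(deleteUser('socket-1').user); // ada
console.log(users.size); // 0
```

Because a registry like this lives in one instance's memory, it is not shared across instances, which is why the client re-sends its user and room after reconnecting.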

Socket.io also provides a Redis adapter to broadcast events to all clients regardless of which server is serving the socket. Socket.io only uses Redis's Pub/Sub mechanism and does not store any data.

    const {createAdapter} = require('@socket.io/redis-adapter');
    // Replace in-memory adapter with Redis
    const subClient = redisClient.duplicate();
    io.adapter(createAdapter(redisClient, subClient));

Socket.io's Redis adapter can reuse the Redis client used to store the room's message history. Each container creates a connection to the Redis instance, and Cloud Run can create a large number of instances. The resulting connection count is still well under the 65,000 connections that Redis can support.
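
A quick back-of-the-envelope check makes the headroom concrete. This sketch assumes two Redis connections per instance (the `redisClient` and the duplicated `subClient` shown in the adapter setup) and a hypothetical count of 1,000 instances. Both numbers are assumptions to adjust for your own service.

```javascript
const REDIS_MAX_CONNECTIONS = 65000; // limit quoted in this tutorial
const CONNECTIONS_PER_INSTANCE = 2; // assumption: redisClient plus subClient

// Total Redis connections opened by a given number of instances.
function redisConnections(instanceCount) {
  return instanceCount * CONNECTIONS_PER_INSTANCE;
}

const instanceCount = 1000; // hypothetical number of running instances
const used = redisConnections(instanceCount);

console.log(used); // 2000
console.log(used < REDIS_MAX_CONNECTIONS); // true
```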

Note: The Cloud Run container lifecycle does not take into consideration outbound requests such as Redis connections; therefore an active connection to Redis won't prevent the container from scaling in.

Reconnection

Cloud Run has a maximum timeout of 60 minutes, so you need to add reconnection logic for possible timeouts. In some cases, Socket.io automatically attempts to reconnect after disconnection or connection error events. There is no guarantee that the client will reconnect to the same instance.

    // Listen for reconnect event
    socket.io.on('reconnect', () => {
      console.log('reconnected');
      // Emit "updateSocketId" event to update the recorded socket ID with user and room
      socket.emit('updateSocketId', {user, room}, error => {
        if (error) {
          console.error(error);
        }
      });
    });

    // Add listener for "updateSocketId" event
    socket.on('updateSocketId', async ({user, room}) => {
      try {
        addUser(socket.id, user, room);
        socket.join(room);
      } catch (err) {
        console.error(err);
      }
    });
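
To see why the client re-sends its user and room after reconnecting, consider this simulation. It assumes each instance tracks sockets in its own memory. The `RegistryInstance` class and the socket IDs are hypothetical, and the methods only mimic the `updateSocketId` and `sendMessage` handlers.

```javascript
// Hypothetical model of one instance's in-memory socket registry.
class RegistryInstance {
  constructor() {
    this.users = new Map(); // socket ID -> {user, room}
  }

  // Mirrors the "updateSocketId" handler: record the socket's user and room.
  updateSocketId(socketId, {user, room}) {
    this.users.set(socketId, {user, room});
  }

  // Mirrors the "sendMessage" handler: fail when the socket is unknown.
  sendMessage(socketId, text) {
    const entry = this.users.get(socketId);
    if (!entry) {
      return 'User session not found.';
    }
    return `${entry.user}@${entry.room}: ${text}`;
  }
}

const first = new RegistryInstance();
const second = new RegistryInstance();

// The user originally signed in on the first instance.
first.updateSocketId('socket-1', {user: 'ada', room: 'lobby'});

// After a timeout, the client reconnects with a new socket ID and happens to
// land on the second instance, which has no record of it yet.
const beforeUpdate = second.sendMessage('socket-2', 'hi');
second.updateSocketId('socket-2', {user: 'ada', room: 'lobby'});
const afterUpdate = second.sendMessage('socket-2', 'hi');

console.log(beforeUpdate); // User session not found.
console.log(afterUpdate); // ada@lobby: hi
```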

Instances will persist if there is an active connection until all requests close or time out. Even if you use Cloud Run session affinity, it is possible for new requests to be load balanced to active containers, which allows containers to scale in. If you are concerned about large numbers of containers persisting after a spike in traffic, you can lower the maximum timeout value, so that unused sockets are cleaned up more frequently.

Ship the service

  1. Create a Memorystore for Redis instance:

    gcloud redis instances create INSTANCE_ID --size=1 --region=REGION

    Replace the following:

    • INSTANCE_ID: the name for the instance, for example my-redis-instance.
    • REGION: the region for all your resources and services, for example europe-west1.

    The instance is automatically allocated an IP range from the default service network range. This tutorial uses 1 GB of memory for the local cache of messages in the Redis instance. Learn more about Determining the initial size of a Memorystore instance for your use case.

  2. Define an environment variable with the IP address of your Redis instance's authorized network:

    export REDISHOST=$(gcloud redis instances describe INSTANCE_ID --region REGION --format "value(host)")

    Note: This tutorial uses the default value for the Redis port, 6379.

  3. Create a service account to serve as the service identity. By default this has no privileges other than project membership.

    gcloud iam service-accounts create chat-identity

    gcloud projects add-iam-policy-binding PROJECT_ID \
      --member=serviceAccount:chat-identity@PROJECT_ID.iam.gserviceaccount.com \
      --role=roles/serviceusage.serviceUsageConsumer

  4. Find the name of your Redis instance's authorized VPC network by running the following command:

    gcloud redis instances describe INSTANCE_ID --region REGION --format "value(authorizedNetwork)"

    Replace INSTANCE_ID and REGION with the values that you used when you created the instance.

    Make note of the VPC network name.

  5. Build and deploy the container image to Cloud Run:

    gcloud run deploy chat-app --source . \
      --allow-unauthenticated \
      --timeout 3600 \
      --service-account chat-identity \
      --network NETWORK \
      --subnet SUBNET \
      --update-env-vars REDISHOST=$REDISHOST

    Replace the following:

    • NETWORK is the name of the authorized VPC network that your Redis instance is attached to.
    • SUBNET is the name of your subnet. The subnet must be /26 or larger. Direct VPC egress supports IPv4 ranges RFC 1918, RFC 6598, and Class E.

    Respond to any prompts to install required APIs by responding y when prompted. You only need to do this once for a project. Respond to other prompts by supplying the platform and region, if you haven't set defaults for these as described in the setup page. Learn more about Deploying from source code.
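
The deployment passes `REDISHOST` to the service as an environment variable. As a minimal sketch of how a Node.js service might turn that into connection settings at startup: the `redisConfigFromEnv` function and the `REDISPORT` variable are assumptions for illustration, and the IP address is a placeholder. Only `REDISHOST` and the default port 6379 come from this tutorial.

```javascript
// Build Redis connection settings from environment variables.
// REDISHOST is set by the deploy command; REDISPORT is a hypothetical override.
function redisConfigFromEnv(env) {
  const host = env.REDISHOST;
  if (!host) {
    throw new Error('REDISHOST is not set');
  }
  // Fall back to the default Redis port used in this tutorial.
  const port = Number(env.REDISPORT || 6379);
  return {host, port, url: `redis://${host}:${port}`};
}

// Placeholder values; in the service you would pass process.env instead.
const config = redisConfigFromEnv({REDISHOST: '10.0.0.3'});
console.log(config.url); // redis://10.0.0.3:6379
```

Failing fast when `REDISHOST` is missing surfaces a misconfigured deployment at startup rather than on the first chat message.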

Try the service

To try out the complete service:

  1. Navigate your browser to the URL provided by the deployment step.

  2. Add your name and a chat room to sign in.

  3. Send a message to the room!

Success: You built a multi-room, real-time chat service using WebSockets to maintain a persistent connection for bi-directional communication.

If you choose to continue developing these services, remember that they have restricted Identity and Access Management (IAM) access to the rest of Google Cloud and will need to be given additional IAM roles to access many other services.

Clean up

To avoid additional charges to your Google Cloud account, delete all the resources you deployed with this tutorial.

Delete the project

If you created a new project for this tutorial, delete the project. If you used an existing project and need to keep it without the changes you added in this tutorial, delete resources that you created for the tutorial.

The easiest way to eliminate billing is to delete the project that you created for the tutorial.

To delete the project:

    Caution: Deleting a project has the following effects:
    • Everything in the project is deleted. If you used an existing project for the tasks in this document, when you delete it, you also delete any other work you've done in the project.
    • Custom project IDs are lost. When you created this project, you might have created a custom project ID that you want to use in the future. To preserve the URLs that use the project ID, such as an appspot.com URL, delete selected resources inside the project instead of deleting the whole project.

    If you plan to explore multiple architectures, tutorials, or quickstarts, reusing projects can help you avoid exceeding project quota limits.

  1. In the Google Cloud console, go to the Manage resources page.

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Delete tutorial resources

  1. Delete the Cloud Run service you deployed in this tutorial. Cloud Run services don't incur costs until they receive requests.

    To delete your Cloud Run service, run the following command:

    gcloud run services delete SERVICE-NAME

    Replace SERVICE-NAME with the name of your service.

    You can also delete Cloud Run services from the Google Cloud console.

  2. Remove the gcloud default region configuration you added during tutorial setup:

    gcloud config unset run/region
  3. Remove the project configuration:

     gcloud config unset project
  4. Delete other Google Cloud resources created in this tutorial:

What's next

Last updated 2025-12-17 UTC.