uNetworking/uWebSockets.js (Public)

Scaling >1 instances with Redis #441

gitcatrat started this conversation in General

gitcatrat
Jan 17, 2021

I'm planning a distributed system and Redis seems to be the perfect glue piece that brings all that together.

Here's my initial, not stress-tested idea. I'd be glad to hear your criticism or tips. Note that the diagram only handles the "direct message" topic for simplicity, but a server could subscribe to any number of topics and the same concept applies.

PROS:

- can scale WS servers horizontally
- other backend systems don't need to know anything about this system; all they need to do is make 1 function call
- servers are stateless in the grand scheme of things: it doesn't matter one bit where connections land. For example, 4 connections subscribed to the same topic could all be connected to different servers and data still flows to the correct destinations
- doesn't matter if a message is for a single connection or for multiple connections, everything works the same way
- doesn't need to deal with inter-thread communication; other Node processes have their own Redis subscriptions
- Redis topics are tracked on the server, so the max subscription count per Redis topic is the # of Node processes, which keeps Redis quick
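
A minimal sketch of the "topics are tracked on the server" idea, assuming each Node process keeps a local map of topic to connections and only touches Redis when the first local subscriber joins or the last one leaves. `FakeBroker`, `TopicRegistry` and the `dm:42` topic name are hypothetical stand-ins for illustration; a real Redis client would be asynchronous.

```javascript
const { EventEmitter } = require("node:events");

// Stand-in for this process's Redis pub/sub connection (hypothetical).
class FakeBroker extends EventEmitter {
  constructor() {
    super();
    this.subscribed = new Set();
  }
  subscribe(topic) { this.subscribed.add(topic); }
  unsubscribe(topic) { this.subscribed.delete(topic); }
  publish(topic, message) {
    // Deliver only if this process holds a broker-level subscription.
    if (this.subscribed.has(topic)) this.emit("message", topic, message);
  }
}

// Tracks which local connections want which topic, so the process subscribes
// to the broker once per topic no matter how many connections share it.
class TopicRegistry {
  constructor(broker) {
    this.broker = broker;
    this.local = new Map(); // topic -> Set of connections
    broker.on("message", (topic, message) => {
      for (const conn of this.local.get(topic) ?? []) conn.send(message);
    });
  }
  join(conn, topic) {
    let conns = this.local.get(topic);
    if (!conns) {
      conns = new Set();
      this.local.set(topic, conns);
      this.broker.subscribe(topic); // first local subscriber
    }
    conns.add(conn);
  }
  leave(conn, topic) {
    const conns = this.local.get(topic);
    if (!conns || !conns.delete(conn)) return;
    if (conns.size === 0) {
      this.local.delete(topic);
      this.broker.unsubscribe(topic); // last local subscriber left
    }
  }
  // Call from the connection close handler so no ghost subscriptions remain.
  drop(conn) {
    for (const topic of [...this.local.keys()]) this.leave(conn, topic);
  }
}

const broker = new FakeBroker();
const registry = new TopicRegistry(broker);
const makeConn = () => ({ inbox: [], send(m) { this.inbox.push(m); } });
const a = makeConn();
const b = makeConn();
registry.join(a, "dm:42");
registry.join(b, "dm:42");
broker.publish("dm:42", "hello");
registry.drop(a);
registry.drop(b);
```

Two connections share one broker subscription here, and dropping both connections releases it again.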

CONS:

- need to scale Redis too if more WS servers are added
- direct messages use publish as well, which is less performant than send, but I think that's the easy route. The alternative would be to store <userID: serverIP> in Redis and implement an HTTP API on the servers for sending messages; it would require a custom library or a separate instance that looks up the connection's server IP from Redis and makes a request (comes with issues like invalidation)
- even if the communicating connections are on the same WS server, they still need to talk through Redis, because it's much easier to assume that each topic could have 0, 1 or more subscribers and that they might not be on the same instance
- I have no idea how well this system works with graceful and not-so-graceful connection close events, i.e. is there any potential for ghost state in this system, unnecessary connections, etc.
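
For the <userID: serverIP> alternative, one possible way to limit the invalidation problem is to give every entry a time-to-live that the owning server refreshes on heartbeat, so entries left behind by a crashed server expire on their own. The sketch below is an in-memory model under that assumption; `ConnectionDirectory`, the 30 second TTL and the IP address are hypothetical. In Redis itself, key expiry (for example `SET` with the `EX` option) could play the role of `expiresAt`.

```javascript
// In-memory stand-in for the <userID: serverIP> mapping described above.
class ConnectionDirectory {
  constructor(ttlMs, now = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, which keeps the example deterministic
    this.entries = new Map(); // userID -> { serverIP, expiresAt }
  }
  // Called on connect and on every heartbeat from the owning WS server.
  register(userID, serverIP) {
    this.entries.set(userID, { serverIP, expiresAt: this.now() + this.ttlMs });
  }
  // Called on graceful close; removes the entry only if this server still owns it.
  unregister(userID, serverIP) {
    const entry = this.entries.get(userID);
    if (entry && entry.serverIP === serverIP) this.entries.delete(userID);
  }
  // Returns the server IP, or null when the user is unknown or the entry is stale.
  lookup(userID) {
    const entry = this.entries.get(userID);
    if (!entry) return null;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(userID);
      return null;
    }
    return entry.serverIP;
  }
}

let clock = 0;
const directory = new ConnectionDirectory(30_000, () => clock);
directory.register("user-1", "10.0.0.5");
const fresh = directory.lookup("user-1"); // within the TTL
clock += 31_000; // no heartbeat arrived, e.g. the server crashed
const stale = directory.lookup("user-1"); // entry has expired
```

A sender that gets `null` back would have to treat the user as offline or fall back to another delivery path.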

Go ahead and laugh - yes, this diagram is made with Mac version of Paint.

Replies: 4 comments 7 replies

ghost
Jan 17, 2021

Well I think the best thing is to write a reliable benchmark and test it out at expected stress. I suspect Redis itself will bottleneck and then you end up with the need for a Redis master and multiple Redis slaves and then it kind of becomes an unnecessary layer you could do without in the first place.

What I do in many cases is I use 1 Redis master and a handful of WS slaves where the WS slaves only subscribe to 1 general topic so that the Redis master only has to send a handful of messages per 1 app message.
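
A rough sketch of the single general topic approach, assuming the real topic travels inside a JSON envelope and each WS server does the per-topic fan-out locally. The channel name `ws:all`, the helper names and the `EventEmitter` standing in for the Redis master are all hypothetical.

```javascript
const { EventEmitter } = require("node:events");

const GENERAL_CHANNEL = "ws:all"; // hypothetical channel name

// Stand-in for the Redis master: every subscriber of the channel gets every message.
const redis = new EventEmitter();

// Backend side: one publish per app message, with the real topic in the envelope.
function publishToTopic(topic, payload) {
  redis.emit(GENERAL_CHANNEL, JSON.stringify({ topic, payload }));
}

// WS server side: one Redis subscription, then local fan-out by topic.
function createWsServer() {
  const localSubscribers = new Map(); // topic -> array of delivery callbacks
  redis.on(GENERAL_CHANNEL, (raw) => {
    const { topic, payload } = JSON.parse(raw);
    for (const deliver of localSubscribers.get(topic) ?? []) deliver(payload);
  });
  return {
    subscribe(topic, deliver) {
      if (!localSubscribers.has(topic)) localSubscribers.set(topic, []);
      localSubscribers.get(topic).push(deliver);
    },
  };
}

const serverA = createWsServer();
const serverB = createWsServer();
const received = [];
serverA.subscribe("dm:1", (p) => received.push(["A", p]));
serverB.subscribe("dm:2", (p) => received.push(["B", p]));
publishToTopic("dm:1", "hi");
publishToTopic("dm:2", "yo");
```

The trade-off is that every WS server receives and parses every message, including those it has no local subscribers for, in exchange for Redis sending only one copy per server.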

But really, write benchmarks and test multiple solutions and compare their respective results.
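
As a starting point, a benchmark can be as small as a function that pushes a fixed number of messages through the transport and counts what arrives. The sketch below assumes a synchronous, in-process transport (an `EventEmitter`) purely so that it runs on its own; `measureThroughput` is a hypothetical helper, and with a real Redis client the timer would have to stop only once the last message has actually been received.

```javascript
const { EventEmitter } = require("node:events");

// Pushes `count` messages through `publish` and counts how many reach the
// subscriber registered via `onMessage`. Only valid for synchronous delivery.
function measureThroughput(publish, onMessage, count) {
  let delivered = 0;
  onMessage(() => { delivered += 1; });
  const start = process.hrtime.bigint();
  for (let i = 0; i < count; i += 1) publish(`message ${i}`);
  const elapsedNs = process.hrtime.bigint() - start;
  const seconds = Number(elapsedNs) / 1e9;
  return { delivered, seconds, perSecond: delivered / seconds };
}

// In-process stand-in for the transport under test.
const bus = new EventEmitter();
const result = measureThroughput(
  (msg) => bus.emit("topic", msg),
  (handler) => bus.on("topic", handler),
  100_000,
);
console.log(`${result.delivered} messages in ${result.seconds.toFixed(3)} s`);
```

Running the same harness against each candidate setup, with realistic message sizes and subscriber counts, gives numbers that can be compared directly.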

0 replies

uasan
Jan 17, 2021

I advise you to try different clients for Redis.
Some Node.js clients are quite slow.

0 replies

vimalmistry
Jan 18, 2021

@gitcatrat I am following the same structure for my current app (Flutter). I am using Redis for pub/sub as well as for cache storage. Can you guide me on how to benchmark it? Which things should be considered? My app is in early development. I just checked my realtime chat module and it's really quick.

@uasan I am using ioredis. Do you prefer it, or do you have another suggestion? Other repos are outdated or do not support commands such as the geospatial ones.

2 replies

gitcatrat Jan 18, 2021
Author

I'll probably skip Redis for now. It's unlikely that I'll need >1 instance for a long time, and if that time arrives, I'll review it again, hopefully with more resources. This system requires too much plumbing if you want it to be resilient without human intervention.

All the stuff I don't want to deal with at the moment:

- The Redis pub/sub topic keyspace has to be sharded manually if you need >1 Redis instance
  - You need to create a library that picks the correct Redis instance based on the topic name and the list of available IPs (search for hashring), and you need to use the same library everywhere you want to subscribe or publish to the pub/sub system
  - Where do you store the list of available Redis IPs? Hardcode it? Then you have to deploy your whole backend and the Redis instances at the same time. Another Redis? It has to be resilient as well. A durable cloud database like DynamoDB? Probably your best bet
- If you need to reshard Redis instances (e.g. you're adding or removing an instance)
  - all subs/connections have to be dropped (good luck with the thundering herd)
  - WS servers have to fetch the list of new Redis instances and resubscribe all connections to the correct instance (hashring)
- WS instances have to ping the connected Redis instances and, if one is down, A) switch to a replica or B) do the previous 2 points again
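
To make the hashring step concrete, here is a small consistent-hashing sketch, assuming SHA-256 for ring positions and 100 virtual points per instance. `HashRing`, `ringPosition`, the IP addresses and the `dm:` topic names are hypothetical; existing hashring libraries differ in their details.

```javascript
const { createHash } = require("node:crypto");

// Maps a string to a 32-bit position on the ring.
function ringPosition(key) {
  return createHash("sha256").update(key).digest().readUInt32BE(0);
}

class HashRing {
  constructor(nodes, replicas = 100) {
    // Each node gets `replicas` virtual points to even out the distribution.
    this.points = [];
    for (const node of nodes) {
      for (let i = 0; i < replicas; i += 1) {
        this.points.push({ position: ringPosition(`${node}#${i}`), node });
      }
    }
    this.points.sort((a, b) => a.position - b.position);
  }
  // Returns the node owning the first point at or after the topic's position,
  // wrapping around to the first point when the topic hashes past the last one.
  nodeFor(topic) {
    const target = ringPosition(topic);
    let lo = 0;
    let hi = this.points.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.points[mid].position < target) lo = mid + 1;
      else hi = mid;
    }
    return this.points[lo % this.points.length].node;
  }
}

const topics = Array.from({ length: 1000 }, (_, i) => `dm:${i}`);
const before = new HashRing(["10.0.0.1", "10.0.0.2", "10.0.0.3"]);
const after = new HashRing(["10.0.0.1", "10.0.0.2"]);

// Topics whose Redis instance changes when 10.0.0.3 is removed.
const moved = topics.filter((t) => before.nodeFor(t) !== after.nodeFor(t));
// Of those, the ones that were NOT on the removed instance.
const movedFromSurvivors = moved.filter((t) => before.nodeFor(t) !== "10.0.0.3");
```

Only topics that lived on the removed instance change owner, so in principle only those subscriptions would need to be re-established rather than all of them.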

I doubt this is the complete list of cases you have to handle and maybe there are easier ways to handle this.

gitcatrat Jan 18, 2021
Author

I'm not sure if any of the open source proxies support the pub/sub use case, e.g. twemproxy or envoy.

uasan
Jan 18, 2021

> @uasan I am using ioredis. Do you prefer it, or do you have another suggestion? Other repos are outdated or do not support commands such as the geospatial ones.

We use a custom client for Redis.
There are two reasons why we wrote our own client a year ago:

- implementation of client-side caching
- multiplexing (many pub/sub subscriptions over one connection)

This is the killer feature of Redis 6; I don't know if your client supports it.
https://redis.io/topics/client-side-caching
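
The idea behind client-side caching can be modelled in a few lines: the server remembers which clients read a key and tells them to drop their local copy when that key changes. The sketch below is an in-memory model of that idea only; `TrackingServer` and `CachingClient` are hypothetical classes and do not reproduce the actual Redis protocol or any client library's API.

```javascript
// In-memory model of a server that remembers which clients read which keys.
class TrackingServer {
  constructor() {
    this.data = new Map();
    this.readers = new Map(); // key -> Set of clients to notify
  }
  get(client, key) {
    if (!this.readers.has(key)) this.readers.set(key, new Set());
    this.readers.get(key).add(client);
    return this.data.get(key);
  }
  set(key, value) {
    this.data.set(key, value);
    // Notify each tracked reader once, then forget them until they read again.
    for (const client of this.readers.get(key) ?? []) client.invalidate(key);
    this.readers.delete(key);
  }
}

class CachingClient {
  constructor(server) {
    this.server = server;
    this.cache = new Map();
    this.roundTrips = 0; // how often the server actually had to be asked
  }
  get(key) {
    if (this.cache.has(key)) return this.cache.get(key);
    this.roundTrips += 1;
    const value = this.server.get(this, key);
    this.cache.set(key, value);
    return value;
  }
  invalidate(key) {
    this.cache.delete(key);
  }
}

const server = new TrackingServer();
server.set("user:1:name", "Ada");
const client = new CachingClient(server);
const reads = [client.get("user:1:name"), client.get("user:1:name")];
server.set("user:1:name", "Grace"); // triggers an invalidation
reads.push(client.get("user:1:name"));
```

The second read is served from local memory, and the third goes back to the server only because the key was invalidated in between.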

5 replies

ghost Jan 18, 2021

That sounds a lot like an "Edge database" (which is a buzzword). I swear - all databases are essentially the same very thing. Only slight variations in what they optimize for (transactions or analytics). The rest is marketing fuzz and of course more or less good implementations.

uasan Jan 22, 2021

@alexhultman We know that Node.js is trash to you, but what do you think about Redis? Is it not trash?

ghost Jan 22, 2021

Redis is fine. In my I/O benchmarks with pub/sub it beats all contenders by far (tested probably 12 MQTT servers). Mosca, the one written by Matteo in JavaScript was by far the slowest.

If you look at the commit stats of Redis you can see that 99% is written by one guy under one company. I think that's important for keeping on track with what you aim for. Node.js, on the other hand, is written by so many different people with conflicting visions, competence and feel for quality.

ghost Jan 22, 2021

Nodejs doesn't even screen for competence anymore (did they ever?) - if you have purple enough hair you're free to hack on things you have no idea about

gitcatrat Jan 22, 2021
Author

I kind of agree with that. Products can be messy, rushed, glued-together Frankenstein monsters, because sometimes it's more important to deploy the feature (retain a high-profile client, etc.) than to save $500 by optimizing the code and decreasing the hardware requirements.

I think good tools shouldn't have that luxury. But the range of quality, focus, etc comes with the open source territory.
