I have an idea for a mobile app, but as I'm not from a software background, I need to know how much the backend operations will cost before I commit time and resources to developing it. That way I can gauge the project's likelihood of success. I'm hoping the experts in this group can help me get an estimate.
I intend to use a VPS as the backend server for this application. These are the assumptions:
The application will have 100 users (referred to as user group A). Every 5 seconds, these users will write their GPS location coordinates to the VPS server database.
Every 5 seconds, another user (referred to as user B) will send their GPS location data to the VPS server and request the names of the ten nearest users from group A. The server will take the location data, calculate the straight-line (aerial) distance from each of the 100 users to user B, and return the names of the nearest ten users to the requester.
To handle this process, I require estimates for the following resources:
- The minimum number of processor cores required
- The minimum RAM size required
- The minimum SSD storage required
- The minimum bandwidth required
Can anyone please assist me with this calculation? I would appreciate your help. If possible, suggest the best VPS from this link: https://www.ionos.com/servers/vps#packages
1 Answer
tl;dr: Buy a small VPS instance, obtain PoC benchmark measurements, then multiply to obtain specs for a larger VPS instance.
The app will have 100 users. Every 5 seconds, a user will send a request
So roughly every 50 ms a new request will arrive (100 users / 5 seconds). You are in control of offered load, so do both of these:
- Have clients add a small random jitter to their sleep time. Otherwise they will clump together and you'll see one second of idle followed by a burst of activity and large queue depths.
- Send clients a "don't call me before this" timestamp. That way a server which gets a little behind can throttle demand.
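Both suggestions can be combined in the client's scheduling logic. A minimal sketch (the 5-second interval comes from the question; the jitter width and the `server_not_before` parameter are assumptions for illustration):

```python
import random
import time

REPORT_INTERVAL = 5.0  # seconds between GPS reports, per the question

def next_sleep(server_not_before=None, jitter=0.5):
    """How long a client should sleep before its next report.

    Adds random jitter so 100 clients don't clump into one burst,
    and honors an optional server-supplied "don't call me before"
    timestamp so an overloaded server can throttle demand.
    """
    delay = REPORT_INTERVAL + random.uniform(-jitter, jitter)
    if server_not_before is not None:
        # Never call back earlier than the server asked us to.
        delay = max(delay, server_not_before - time.time())
    return delay
```

With 100 clients jittered over a half-second window, the server sees a fairly smooth ~20 requests/second instead of a 100-request spike every 5 seconds.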
mobile app
Don't worry about that aspect yet. The current goal is to obtain hard numbers.
Write a small Python app that uses `requests` to send simulated mobile app GET requests to your proof-of-concept server. Or even a bash loop which calls `curl`.
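A sketch of such a load generator, decoupled from any particular server: it takes any callable that performs one request (the lambda URL shown in the docstring is hypothetical) and reports latency statistics.

```python
import statistics
import time

def run_load_test(send_request, n_requests=100, interval=0.05):
    """Fire n_requests at roughly `interval` spacing, collect latencies.

    `send_request` is any zero-argument callable that performs one
    request, e.g. lambda: requests.get("http://my-poc-host/nearest")
    (hypothetical URL).  interval=0.05 mimics the ~50 ms arrival rate.
    """
    latencies = []
    for _ in range(n_requests):
        start = time.perf_counter()
        send_request()
        latencies.append(time.perf_counter() - start)
        time.sleep(interval)
    latencies.sort()
    return {
        "mean": statistics.mean(latencies),
        "p95": latencies[int(0.95 * len(latencies))],
    }
```

Watch the p95 figure as you scale up: when it climbs well past your mean, the server's queue is backing up.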
As a first cut you could even have your webserver return a static document, and then you're benchmarking capacity. More realistic would be to have the webserver sleep(100) milliseconds before returning a constant "<div>Hello world!</div>" document. Do this on a tiny webserver, and slowly crank up the sleep time till you break it, till the request queue depth becomes too large and response times get ugly. Now you know what deadline the "real" queries must complete within.
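The stdlib is enough for this throwaway server; no framework needed. A sketch, assuming a 100 ms simulated cost (the constant is the knob you crank up):

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

SIMULATED_WORK_MS = 100  # crank this up until response times get ugly

class PocHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Pretend the "real" query takes SIMULATED_WORK_MS to run.
        time.sleep(SIMULATED_WORK_MS / 1000.0)
        body = b"<div>Hello world!</div>"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep benchmark output quiet

def serve(port=8080):
    """Run the proof-of-concept server (blocks forever)."""
    HTTPServer(("127.0.0.1", port), PocHandler).serve_forever()
```

Note this single-threaded server serializes requests, which is exactly what makes the queue-depth failure mode easy to observe.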
For added realism, tackle the "find nearest ten" requirement. The standard solution would be to throw the 100 (x, y) coordinates into a postgres / PostGIS table and issue top-K nearest neighbor queries. But since 100 is such a small number you could certainly brute force it, doing a linear scan of the other 99 entries on each query.
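The brute-force version fits in a dozen lines. A sketch, assuming users are stored as dicts with `name`, `lat`, `lon` keys (my assumption, not from the question) and that "aerial distance" means great-circle distance:

```python
import math

def nearest_k(user_b, others, k=10):
    """Linear scan: great-circle distance from user_b (a (lat, lon)
    pair) to every user in `others`, returning the k nearest names.
    O(N log N) per query; fine for N=100.
    """
    def haversine(p, q):
        # Distance in km between two (lat, lon) pairs, in degrees.
        lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
        a = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2)
             * math.sin((lon2 - lon1) / 2) ** 2)
        return 6371.0 * 2 * math.asin(math.sqrt(a))

    ranked = sorted(others, key=lambda u: haversine(user_b, (u["lat"], u["lon"])))
    return [u["name"] for u in ranked[:k]]
```

In PostGIS the equivalent would be an `ORDER BY geom <-> point LIMIT 10` query against a spatial index, but at this scale the scan above is plenty.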
You said nothing about your clients, their speed, their need for neighbor info, nor the cost of imprecise results (other than 5 seconds out-of-date being acceptable). There's a good chance that a given client's top-ten neighbors will remain stable for minutes at a time. In which case, don't scan all clients. Just cache the previous response and verify that it is still correct.
For example, A sends a query at time t0, and the top three neighbors are {B, E, F} at three specified (x, y) spots, their last-reported locations. (In this example I reduce K from 10 to 3.) We remember the answer sent, along with a three-neighbor bounding box. At times t1, t2, t3 those clients check in, reporting either a constant (x, y) or some small delta. At t4 we hear from A again, and we check whether a {B, E, F} response would still be valid: compute its bbox, verify no other clients have recently entered that box, and send the current {B, E, F} (x, y) locations. A PostGIS spatial index would be ideal, but this scheme requires just simple sorted indexes on x and on y. Hmmm, a bbox with a long skinny aspect ratio can lose, so we'd want to pad them out to look more like a square.
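The validity check is just a box-membership test over the non-cached clients. A sketch of that idea (names and the uniform `pad` parameter are my assumptions; a production version would pad toward a square as noted above):

```python
def cached_answer_still_valid(cached_neighbors, all_clients, pad=0.0):
    """Check whether a previously computed neighbor set is still a
    valid answer: build the neighbors' bounding box (optionally
    padded) and verify no *other* client has entered it.

    cached_neighbors / all_clients: dicts of name -> (x, y).
    """
    xs = [p[0] for p in cached_neighbors.values()]
    ys = [p[1] for p in cached_neighbors.values()]
    min_x, max_x = min(xs) - pad, max(xs) + pad
    min_y, max_y = min(ys) - pad, max(ys) + pad
    for name, (x, y) in all_clients.items():
        if name in cached_neighbors:
            continue
        if min_x <= x <= max_x and min_y <= y <= max_y:
            return False  # an intruder invalidates the cached answer
    return True
```

With sorted indexes on x and on y, the loop over all clients becomes two range lookups instead of a full pass.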
You mentioned N=100 clients, but I worry that that number may increase in future. Naïve linear scanning implies that CPU cycles consumed each hour will scale quadratically, O(N²), with client population size. Prefer smart data structures over simple brute force.
To ensure that client responses arrive quickly, we might have a background daemon continually pre-computing responses. Suppose that the top five neighbors of A were {B, E, F, J, K}, and we cache that in anticipation of A's next request. When the request arrives, compute the five distances to A, verify that J and K are still farther away than the other three, and send a response. No scanning of the whole client population was needed to satisfy the GET. Now our performance criterion is simply "How long does it take for the background daemon to cycle through all clients?"
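The request-time half of that scheme only ever touches the five cached names. A sketch, assuming planar (x, y) coordinates and that the daemon cached k + 2 candidates (both my assumptions for illustration):

```python
import math

def answer_from_cache(a_pos, cached, positions, k=3):
    """Answer A's request from a precomputed candidate list without a
    full scan.

    cached: k + spare candidate names from the background daemon,
    e.g. ["B", "E", "F", "J", "K"] with k=3.  Recompute just these
    distances; if the spares (J, K) are still farther away than the
    first k, the cached top-k remains a valid answer.
    Assumes len(cached) > k so there is at least one spare.
    """
    d = {name: math.dist(a_pos, positions[name]) for name in cached}
    top_k, spares = cached[:k], cached[k:]
    if max(d[n] for n in top_k) <= min(d[n] for n in spares):
        return sorted(top_k, key=d.get)  # still valid; re-rank and send
    return None  # cache stale: caller must rescan the full population
```

Only the `None` path costs an O(N) scan, so the daemon's refresh rate directly controls how often that happens.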
