Planning your environment

Vrushali Raut

4 min read · Aug 12, 2023

When you design a backend server system from scratch, you primarily think about the following:

QPS :- Queries per second

TPS/TPM :- Transactions per second/minute

RPS/RPM :- Requests per second/minute

Peak Performance/Throughput
Response time -> what the p99, median, and average response times will be
Accuracy -> Consistent -> perfect accuracy
Availability -> total uptime
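
To make the response-time metrics above concrete, here is a minimal sketch that summarizes a list of measured response times. The function name `summarize_latencies` and the nearest-rank definition of p99 are assumptions made for illustration; monitoring tools may compute percentiles slightly differently.

```python
import statistics


def summarize_latencies(samples_ms):
    """Return the mean, median, and p99 of response times given in milliseconds.

    Hypothetical helper: p99 uses the nearest-rank method, i.e. the smallest
    sample such that at least 99% of all samples are at or below it.
    """
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Integer form of ceil(0.99 * n), which avoids floating-point rounding.
    rank = (99 * n + 99) // 100
    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p99": ordered[rank - 1],
    }


if __name__ == "__main__":
    # Illustrative values only: 100 samples of 1 ms .. 100 ms.
    print(summarize_latencies(list(range(1, 101))))
```

The average hides slow outliers, which is why a tail percentile such as p99 is usually tracked next to it.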

Establishing performance goals

Planning the Network configuration

Planning for availability

Design Decision

Establishing performance goals

High performance = maximize throughput and reduce response time

What request capacity or throughput must the system support?
How many concurrent users must the system support?
What is the acceptable average response time?
What is the average think time between requests?

Estimate throughput -

Number of requests processed per minute
Estimating load on Application server instances
Calculate maximum number of concurrent users
Calculate think time
Calculate average response time
Calculate requests per minute

Calculate maximum number of concurrent users

A user is concurrent for as long as the user is on the system as a running process submitting requests, receiving results of requests from the server, and viewing the results.
Eventually, as the number of concurrent users submitting requests increases, the number of requests processed per minute begins to decline (and the response time begins to increase).

Calculate think time

A user does not submit requests continuously. A user submits a request, the server receives the request, processes it and then returns a result, at which point the user spends some time analyzing the result before submitting a new request. The time spent reviewing the result of a request is called think time.

Calculate average response time

Response time refers to the amount of time it takes for the results of a request to be returned to the user.
The response time is affected by a number of factors, including network bandwidth, number of users, number and type of requests submitted, and average think time.
In this section, response time refers to the mean, or average, response time.
The faster the response time, the more requests per minute are being processed.
However, as the number of users on the system increases, the response time starts to increase as well, even though the number of requests per minute declines.

Response time at peak load -

Response time = (concurrent users / requests per second) - think time in seconds

example

Maximum number of concurrent users the system supports at peak load ⇒ 5000
Maximum number of requests the system can process at peak load ⇒ 1000 per second
Average think time ⇒ 3 seconds per request

Response time = (5000 / 1000) - 3

Response time = 2 seconds
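
The same calculation can be sketched as a small Python function. The function and parameter names are hypothetical, chosen only for this illustration; the formula is the one given above.

```python
def response_time_at_peak(concurrent_users, requests_per_second, think_time_s):
    """Response time (s) = (concurrent users / requests per second) - think time (s)."""
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    return concurrent_users / requests_per_second - think_time_s


if __name__ == "__main__":
    # Values from the example: 5000 users, 1000 requests/s, 3 s think time.
    print(response_time_at_peak(5000, 1000, 3))  # 2.0
```

A negative result would mean the inputs are inconsistent: at that request rate, users would have to spend less time thinking than the stated think time.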

- After the system’s response time has been calculated, particularly at peak load, compare it to the acceptable response time for the application.

- Response time, along with throughput, is one of the main factors critical to Application Server performance.

Calculate requests per minute :

If the number of concurrent users at any given time, the response time of their requests, and the average user think time are known, then the requests per minute can be calculated.
The formula for obtaining the requests per second is as follows:

requests/s = concurrent users / (response time (s) + think time (s))

Example :-

Concurrent users ⇒ 2,800.
Average response time ⇒ 1 second per request.
Average think time ⇒ 3 seconds.

Requests per second = 2800 / (1+3)

Therefore, the number of requests per second ⇒ 700, and the number of requests per minute ⇒ (700 * 60) ⇒ 42000
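
As a quick check of the arithmetic, here is a minimal Python sketch of the same formula. The function names are hypothetical and exist only for this illustration.

```python
def requests_per_second(concurrent_users, response_time_s, think_time_s):
    """requests/s = concurrent users / (response time (s) + think time (s))."""
    cycle_s = response_time_s + think_time_s  # time one user needs per request
    if cycle_s <= 0:
        raise ValueError("response time plus think time must be positive")
    return concurrent_users / cycle_s


def requests_per_minute(concurrent_users, response_time_s, think_time_s):
    """Convert the per-second rate to a per-minute rate."""
    return 60 * requests_per_second(concurrent_users, response_time_s, think_time_s)


if __name__ == "__main__":
    # Values from the example: 2800 users, 1 s response time, 3 s think time.
    print(requests_per_second(2800, 1, 3))   # 700.0
    print(requests_per_minute(2800, 1, 3))   # 42000.0
```

This is the response-time formula rearranged: each user completes one request every (response time + think time) seconds, so dividing the user count by that cycle gives the request rate.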

Planning the Network configuration

Estimate bandwidth requirements
Estimate peak load

Planning for availability

Rightsizing Availability
Using Clusters to Improve Availability
Adding Redundancy to the System
One way to achieve high availability is to add hardware and software redundancy to the system. When one unit fails, the redundant unit takes over. This is also referred to as fault tolerance. In general, to achieve high availability, determine and remove every possible point of failure in the system.
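
A rough way to see the effect of redundancy is to estimate the combined availability of several redundant units. The sketch below is not from the original text; it uses the standard parallel-availability formula and assumes units fail independently and failover is instantaneous, which real systems only approximate. The function name and the sample values are hypothetical.

```python
def redundant_availability(unit_availability, units):
    """Availability of `units` redundant copies, assuming independent failures.

    The group is down only when every unit is down at the same time,
    so availability = 1 - (1 - unit_availability) ** units.
    """
    if not 0.0 <= unit_availability <= 1.0:
        raise ValueError("unit_availability must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    return 1.0 - (1.0 - unit_availability) ** units


if __name__ == "__main__":
    # Illustrative values only: each unit is assumed to be up 99% of the time.
    for n in (1, 2, 3):
        print(n, redundant_availability(0.99, n))
```

Correlated failures, such as a shared power supply or a building fire, break the independence assumption, which is why the failure classes the system must tolerate drive the level of redundancy.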

Identifying Failure Classes

The level of redundancy is determined by the failure classes (types of failure) that the system needs to tolerate. Some examples of failure classes are:
System process
Machine
Power supply
Disk
Network failures
Building fires
Other catastrophes

Design Decision

Number of Application Server Instances Required
Number of Nodes Required
Storage Capacity Required
Designing for Peak Load or Steady State Load