Planning your environment

Vrushali Raut
Aug 12, 2023


When you design a backend server system from scratch, you primarily think about the following:

QPS :- Queries per second

TPS/TPM :- Transactions per second/minute (throughput)

RPS/RPM :- Requests per second/minute

  • Peak performance/throughput
  • Response time -> what the p99, median, and average response times will be
  • Accuracy -> consistent results, ideally perfect accuracy
  • Availability -> total uptime
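To make the response time goals concrete, here is a minimal Python sketch that summarizes a list of measured response times into average, median, and p99. The function name `latency_summary` and the sample data are illustrative assumptions, and the p99 uses the simple nearest-rank method; monitoring tools may interpolate differently.

```python
import statistics


def latency_summary(samples_ms):
    """Summarize response times (in ms) as average, median, and p99."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # Nearest-rank p99: the smallest sample such that at least 99% of
    # all samples are at or below it. Integer math avoids float rounding.
    rank = (99 * len(ordered) + 99) // 100
    return {
        "average": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p99": ordered[rank - 1],
    }


# Hypothetical measurements: 1 ms, 2 ms, ..., 100 ms.
summary = latency_summary(list(range(1, 101)))
print(summary)  # average 50.5, median 50.5, p99 99
```

The p99 is usually the number to watch at peak load, because a healthy average can hide a slow tail.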

Establishing performance goals

Planning the Network configuration

Planning for availability

Design Decision

Establishing performance goals

High performance = maximize throughput and reduce response time

  • What request capacity or throughput must the system support?
  • How many concurrent users must the system support?
  • What is an acceptable average response time?
  • What is the average think time between requests?

Estimate throughput:

  • Number of requests processed per minute
  • Estimating load on Application Server instances
  • Calculate maximum number of concurrent users
  • Calculate think time
  • Calculate average response time
  • Calculate requests per minute

Calculate maximum number of concurrent users

  • A user is concurrent for as long as the user is on the system as a running process submitting requests, receiving results of requests from the server, and viewing the results.
  • Eventually, as the number of concurrent users submitting requests increases, the requests processed per minute begin to decline (and the response time begins to increase).

Calculate think time

  • A user does not submit requests continuously. A user submits a request, the server receives the request, processes it and then returns a result, at which point the user spends some time analyzing the result before submitting a new request. The time spent reviewing the result of a request is called think time.

Calculate average response time

  • Response time refers to the amount of time it takes for the results of a request to be returned to the user.
  • The response time is affected by a number of factors, including network bandwidth, number of users, number and type of requests submitted, and average think time.
  • In this section, response time refers to the mean, or average, response time.
  • The faster the response time, the more requests per minute are being processed.
  • However, as the number of users on the system increases, the response time starts to increase as well, even though the number of requests per minute declines.

Response time at peak load:

Response time (s) = (concurrent users / requests per second) - think time (s)


  • Maximum number of concurrent users the system supports at peak load ⇒ 5,000
  • Maximum number of requests the system can process at peak load ⇒ 1,000 per second
  • Average think time ⇒ 3 seconds per request

Response time = (5000 / 1000) - 3

Response time = 2 seconds
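The same calculation as a small Python sketch, in case you want to try different peak-load numbers. The function name `response_time_s` is just an illustrative choice; the formula is the one given above.

```python
def response_time_s(concurrent_users, requests_per_second, think_time_s):
    """Response time = (concurrent users / requests per second) - think time."""
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    return concurrent_users / requests_per_second - think_time_s


# Peak-load numbers from the example above.
print(response_time_s(5000, 1000, 3))  # 2.0
```

A zero or negative result means the inputs do not fit together, for example a think time longer than the whole request cycle.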

  • After the system's response time has been calculated, particularly at peak load, compare it to the acceptable response time for the application.
  • Response time, along with throughput, is one of the main factors critical to Application Server performance.

Calculate requests per minute:

  • If the number of concurrent users at any given time, the response time of their requests, and the average user think time are known, then the requests per minute can be calculated.
  • The formula for obtaining the requests per second is as follows:

Requests per second = concurrent users / (response time (s) + think time (s))

Example :-

  • Concurrent users ⇒ 2,800.
  • Average response time ⇒ 1 second per request.
  • Average think time ⇒ 3 seconds.

Requests per second = 2800 / (1+3)

Therefore, the number of requests per second ⇒ 700, and the number of requests per minute ⇒ (700 * 60) ⇒ 42,000
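Here is the example as a short Python sketch. The function names are illustrative assumptions. The second helper simply rearranges the same formula to answer the earlier question of how many concurrent users a given throughput can sustain.

```python
def requests_per_second(concurrent_users, response_time_s, think_time_s):
    """Requests/s = concurrent users / (response time + think time)."""
    cycle_s = response_time_s + think_time_s
    if cycle_s <= 0:
        raise ValueError("response time + think time must be positive")
    return concurrent_users / cycle_s


def max_concurrent_users(rps, response_time_s, think_time_s):
    """Same formula solved for users: users = rps * (response + think)."""
    return rps * (response_time_s + think_time_s)


rps = requests_per_second(2800, 1, 3)
print(rps)       # 700.0
print(rps * 60)  # 42000.0 requests per minute

# Cross-check with the peak-load example: 1,000 req/s, 2 s response, 3 s think.
print(max_concurrent_users(1000, 2, 3))  # 5000
```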

Planning the Network configuration

  • Estimate bandwidth requirements
  • Estimate peak load
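One rough way to estimate bandwidth is to multiply the peak request rate by the average bytes moved per request. The sketch below assumes you already know the average request and response sizes; the function name and the sizes used in the example (2,000 and 18,000 bytes) are hypothetical, and protocol overhead (headers, TLS, retransmits) is ignored.

```python
def peak_bandwidth_mbps(peak_rps, avg_request_bytes, avg_response_bytes):
    """Approximate bandwidth in megabits per second at peak load."""
    bytes_per_second = peak_rps * (avg_request_bytes + avg_response_bytes)
    bits_per_second = bytes_per_second * 8
    return bits_per_second / 1_000_000


# Hypothetical payload sizes; 1,000 req/s is the peak from the earlier example.
print(peak_bandwidth_mbps(1000, 2000, 18000))  # 160.0
```

In practice you would add headroom on top of this number rather than provision the link at exactly the estimate.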

Planning for availability

  • Rightsizing Availability
  • Using Clusters to Improve Availability
  • Adding Redundancy to the System
  • One way to achieve high availability is to add hardware and software redundancy to the system. When one unit fails, the redundant unit takes over. This is also referred to as fault tolerance. In general, to achieve high availability, determine and remove every possible point of failure in the system.
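To get a feel for what redundancy buys you, here is a small Python sketch. It assumes the redundant units fail independently and that failover is instant, which is optimistic for real systems; the function names and the sample availability figures are illustrative assumptions.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def combined_availability(unit_availability, redundant_units):
    """Availability when any one of N independent units can serve traffic."""
    if not 0 <= unit_availability <= 1 or redundant_units < 1:
        raise ValueError("availability must be in [0, 1] and units >= 1")
    # The system is down only when every unit is down at the same time.
    return 1 - (1 - unit_availability) ** redundant_units


def downtime_minutes_per_year(availability):
    """Expected downtime per year for a given availability fraction."""
    return (1 - availability) * MINUTES_PER_YEAR


single = 0.99
pair = combined_availability(single, 2)   # about 0.9999
print(downtime_minutes_per_year(single))  # about 5,256 minutes
print(downtime_minutes_per_year(pair))    # about 52.6 minutes
```

Under these assumptions, a second unit cuts the expected downtime by a factor of 100, which is why removing single points of failure pays off so quickly.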

Identifying Failure Classes

  • The level of redundancy is determined by the failure classes (types of failure) that the system needs to tolerate. Some examples of failure classes are:
  • System process
  • Machine
  • Power supply
  • Disk
  • Network failures
  • Building fires
  • Other catastrophes

Design Decision

  • Number of Application Server Instances Required
  • Number of Nodes Required
  • Storage Capacity Required
  • Designing for Peak Load or Steady State Load
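A simple way to turn the throughput numbers into an instance count is to divide the target load by what one instance can handle and round up. In the sketch below, the per-instance capacity (150 requests per second), the steady-state load (400 requests per second), and the single spare instance are hypothetical values; you would measure your own with a load test.

```python
import math


def instances_required(target_rps, per_instance_rps, spare_instances=1):
    """Instances needed to serve target_rps, plus spares for failover."""
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    return math.ceil(target_rps / per_instance_rps) + spare_instances


# Designing for peak load vs. steady-state load (hypothetical capacities).
print(instances_required(1000, 150))  # 8 (7 to carry the load + 1 spare)
print(instances_required(400, 150))   # 4 (3 to carry the load + 1 spare)
```

The gap between the two results is the cost of designing for peak load instead of steady state.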

Reference :- blog



Vrushali Raut

I’m an engineer. Ex-Spenmo, Gojek, Leftshift. I love to share my experiments and learnings via blogs.