Latency and throughput are two of the most frequently discussed terms in system design interviews.
Latency is the delay in network communication: the time it takes for data to travel across the network. The concept applies to any part of a system where data is requested and transferred.
Throughput refers to the average volume of data that actually passes through the network over a specific period. It determines the network's capacity to accommodate multiple users simultaneously. Throughput measures how much data is actually transmitted, not the theoretical capacity (bandwidth) of the system.
An ideal system should have high throughput but low latency.
A network with low throughput and high latency struggles to send and process high volumes of data, which results in congestion and poor application performance.
Network latency is typically measured as ping (round-trip) time in milliseconds (ms), and throughput in bits per second (bps).
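To make these two measurements concrete, here is a minimal sketch in Python. It spins up a throwaway local echo server so the example is self-contained; real measurements would of course target an actual remote host, and the numbers you see on loopback will be far better than over a real network.

```python
import socket
import threading
import time

# Throwaway local echo server, used only so this example is self-contained.
def run_echo_server(server_sock):
    conn, _ = server_sock.accept()
    with conn:
        while data := conn.recv(65536):
            conn.sendall(data)

server = socket.socket()
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.listen(1)
threading.Thread(target=run_echo_server, args=(server,), daemon=True).start()

client = socket.create_connection(server.getsockname())

# Latency: round-trip time of a tiny payload, reported in milliseconds.
start = time.perf_counter()
client.sendall(b"x")
client.recv(1)
latency_ms = (time.perf_counter() - start) * 1000

# Throughput: bits actually delivered per second for a larger payload.
# Sending runs in a background thread so send and receive overlap,
# as they would on a real network (and so the buffers cannot deadlock).
payload = b"y" * 1_000_000
sender = threading.Thread(target=client.sendall, args=(payload,))
start = time.perf_counter()
sender.start()
received = 0
while received < len(payload):
    received += len(client.recv(65536))
sender.join()
elapsed = time.perf_counter() - start
throughput_bps = received * 8 / elapsed

print(f"latency: {latency_ms:.3f} ms, throughput: {throughput_bps / 1e6:.1f} Mbps")
client.close()
```

Note that the latency figure is measured in round-trip time (like `ping`), while the throughput figure counts only bits actually delivered, which is exactly the distinction between throughput and nominal bandwidth.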
Factors affecting latency:
- Location: The speed of light is the fastest anything can travel, so no matter how you design your system, transferring data through space will always take some time.
- Network Congestion: When there are many message requests coming in at once and the system doesn’t have the capacity to process them all, some requests will have to wait, increasing latency. Either these requests will be dropped and sent again, or they’ll sit in a queue waiting to be processed.
- Protocol efficiency: chattier protocols require more round trips per request, and every round trip adds delay.
- Network infrastructure: each router, switch, and link along the path adds its own processing and queuing delay.
Factors affecting throughput:
- Bandwidth: the theoretical maximum capacity of the link; throughput can never exceed it.
- Processing power: of the servers and network devices handling the traffic.
- Packet loss: lost packets must be retransmitted, reducing the effective data rate.
- Network topology: how nodes and links are arranged determines which paths, and which bottlenecks, traffic must pass through.
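Several of these factors interact. The well-known Mathis et al. approximation for steady-state TCP throughput ties round-trip time and packet loss together: throughput ≈ (MSS / RTT) × (C / √p), where MSS is the segment size, p the loss rate, and C ≈ 1.22 in the simplified model. A quick sketch (the example parameter values are arbitrary, chosen only for illustration):

```python
import math

def tcp_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Mathis et al. approximation: throughput <= (MSS / RTT) * (C / sqrt(p))."""
    C = math.sqrt(3 / 2)  # ~1.22 in the simplified model
    return (mss_bytes * 8 / rtt_s) * (C / math.sqrt(loss_rate))

# A 1460-byte MSS, 50 ms RTT, and 0.1% loss cap a single TCP flow
# at roughly 9 Mbps, no matter how large the link's bandwidth is.
print(f"{tcp_throughput_bps(1460, 0.050, 0.001) / 1e6:.1f} Mbps")
```

This is why both high latency and packet loss show up in the lists above: doubling RTT halves achievable TCP throughput, and a 10x increase in loss cuts it by about 3x.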

How can we improve latency and throughput?
For latency, we can shorten the propagation distance between source and destination. Better routing paths that minimize the number of nodes a request travels through also help improve latency.
For throughput, we can increase the overall network bandwidth.
To improve both together, we can use caching, efficient transport protocols, and Quality of Service (QoS) policies.
Caching: applied correctly, caching can dramatically improve latency by storing a copy of repeatedly accessed data for faster retrieval.
Protocol choice – certain protocols, like HTTP/2, intentionally reduce the per-request protocol overhead (for example by multiplexing requests over a single connection), which keeps latency lower. TCP's congestion-avoidance features also help mitigate the congestion that causes both high latency and low throughput.