System Design

OK, we want to build a software product, but where should we start?

Well, let's first think about the functional and non-functional requirements of our product. This gives us a list of questions to answer.

Functional requirements (scope)

Non-functional requirements

System layout

After clarifying the functional and non-functional requirements of the system, we move on to sketching out the system layout.

Unless the non-functional requirements demand a system that supports millions or even billions of concurrent users and requests upfront, we want to start with the most basic design.

Forget about microservices, load balancing, auto-scaling, message queues, async processing, distributed storage, and big data. We simply don't need them yet.

When deciding whether to add a specific component to our system, we should keep asking ourselves: will anything break if we leave this component out? If nothing actually breaks, we don't need to include it.

This practice minimizes the system complexity and thus the operational cost.

Basic design

Start with one monolithic server and one database.

In general, CRUD applications are I/O-bound, and a single server can handle around 10 K requests per second (RPS).

If the application is CPU-bound, concurrency is limited by the number of logical cores of the server (CPU socket count × physical cores per socket × logical cores per physical core). An enterprise-grade server can have up to roughly 1 K logical cores (8 × 64 × 2 = 1,024).
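The core-count arithmetic above can be sketched in a few lines; the 8 × 64 × 2 figures are the illustrative numbers from the text, not a claim about any particular machine.

```python
import os

# Logical cores = sockets x physical cores per socket x hardware threads per core.
def logical_cores(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    return sockets * cores_per_socket * threads_per_core

# The enterprise-grade example above: 8 sockets x 64 cores x 2 threads per core.
print(logical_cores(8, 64, 2))  # → 1024

# The logical core count of the current machine (what the OS actually schedules on):
print(os.cpu_count())
```

For a CPU-bound workload, running more worker threads than `os.cpu_count()` mostly adds context-switching overhead rather than throughput.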

The basic system design layout looks like the following.

Server bottleneck

As the application gains more users, the server struggles to handle that many concurrent requests, and users start to feel the response latency increase. Concurrency becomes the bottleneck of our system, typically because of OS-imposed limits or a shortage of system memory.

OS setting limit customization

If the bottleneck is caused by OS-imposed limits, we can tune those limits, such as the maximum number of open file descriptors and the depth of the TCP accept backlog. For example, we can raise the open file descriptor limit from 1 K to 100 K.
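On a Linux box, the relevant limits can be inspected as follows; the raised values in the comments are illustrative, and the kernel parameter names are Linux-specific.

```shell
# Per-process open file descriptor limit (often 1024 by default).
ulimit -n

# Kernel-wide file handle ceiling and TCP accept backlog depth, if available.
cat /proc/sys/fs/file-max 2>/dev/null || true
cat /proc/sys/net/core/somaxconn 2>/dev/null || true

# To lift the limits (requires root; values are illustrative):
#   ulimit -n 100000
#   sysctl -w fs.file-max=2000000
#   sysctl -w net.core.somaxconn=65535
```

Raising the descriptor limit lets one process hold more concurrent sockets, since each open TCP connection consumes a file descriptor.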

Server vertical scale up

If the bottleneck is caused by a shortage of system memory, we can upgrade the server to a larger instance with more memory.

Server horizontal scale out

If OS setting limit customization and server vertical scale up don't fully resolve the concurrency bottleneck, we then need to horizontally scale out the server to a server cluster. We use a load balancer to distribute the request traffic among the servers.
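The simplest distribution strategy a load balancer can use is round-robin, sketched below; the backend addresses are hypothetical, and a production balancer would also health-check its backends.

```python
from itertools import cycle

# Hypothetical backend addresses in the server cluster.
SERVERS = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]

class RoundRobinBalancer:
    """Distribute incoming requests evenly across a server cluster."""

    def __init__(self, servers: list[str]):
        self._pool = cycle(servers)

    def pick(self) -> str:
        # Each call hands the next request to the next server in rotation.
        return next(self._pool)

lb = RoundRobinBalancer(SERVERS)
print([lb.pick() for _ in range(6)])
# → each server receives exactly two of the six requests, in rotation
```

Round-robin assumes requests are roughly uniform in cost; skewed workloads often call for least-connections or latency-aware strategies instead.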

Database bottleneck

As more data is stored in the database and more transactions happen at the same time, the database transaction latency rises to an unacceptable level and becomes the bottleneck of our system.

We can alleviate this by adding a database cache, a Content Delivery Network (CDN), and a client-side cache.

See more about read and write transaction scaling in the database post.

Availability and reliability bottleneck

As our monolithic system grows, it will face availability and reliability challenges. We need to refactor the system into a microservice architecture. We use an API gateway to filter requests and route them to different services, and each service has its own load balancer distributing traffic among many service replicas.
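The routing half of the gateway can be sketched as a path-prefix lookup table; the service names and ports below are hypothetical.

```python
# Hypothetical routing table: path prefix -> the load balancer fronting that
# service's replicas.
ROUTES = {
    "/users": "user-service-lb:8080",
    "/orders": "order-service-lb:8080",
    "/payments": "payment-service-lb:8080",
}

def route(path: str) -> str:
    """Pick the upstream whose prefix matches the request path. A real gateway
    would also authenticate, rate-limit, and log here before forwarding."""
    for prefix, upstream in ROUTES.items():
        if path.startswith(prefix):
            return upstream
    raise LookupError(f"no route for {path}")

print(route("/orders/1234"))  # → order-service-lb:8080
```

Because each prefix maps to a load balancer rather than a single host, replicas of a service can be added or replaced without touching the gateway's table.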

External API rate limit bottleneck

If a service depends on an external API and that API enforces a rate limit, we can use a message queue to control the request-sending rate in a centralized fashion, so that the whole cluster collectively stays under the limit.
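The idea can be sketched with a queue and a single paced consumer; the rate limit and the API call are stand-ins, and a production version would use a durable broker rather than an in-process queue.

```python
import queue
import time

# Hypothetical rate limit imposed by the external API.
MAX_CALLS_PER_SECOND = 5

def call_external_api(payload: str) -> str:
    # Stand-in for the real HTTP call to the external API.
    return f"sent:{payload}"

def drain(q: "queue.Queue[str]") -> list[str]:
    """Single consumer draining the queue. Because all producers enqueue here
    and only this consumer talks to the external API, pacing it paces the
    whole system."""
    interval = 1.0 / MAX_CALLS_PER_SECOND
    results = []
    while not q.empty():
        results.append(call_external_api(q.get()))
        time.sleep(interval)  # the centralized pacing point
    return results

q: "queue.Queue[str]" = queue.Queue()
for i in range(3):
    q.put(f"request-{i}")
print(drain(q))  # → ['sent:request-0', 'sent:request-1', 'sent:request-2']
```

The queue also buffers bursts: producers can enqueue faster than the limit allows, and the backlog simply drains at the permitted rate.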

Tradeoff

Next step

Once we have decided how we structure the system, we can go ahead with the development cycle.

See also
