Distributed System

A distributed system is a system consists of many services running independently on different servers (servers can be physical machines, VMs, or containers) and connected via a network. Although its services are distributed, it appears as a single entity to the end user.

Tradeoffs

In the system design post we shared that a distributed system has good:

But falls short at:

CAP theorem

CAP theorem states similar tradeoffs. It says we can only achieve two out of the three guarantees at any given time:

In a distributed system, partition tolerance is a requirement because network failures happen all the time, so we have to choose between consistency and availability.

Scaling

Here are a few things we need to know about scaling a distributed system.

Scaling example

Here is a simulated example for scaling a client-server application.

  1. Initially we launch the application with a minimal setup:
    • A client application
    • An application server
    • A database server
  2. As more and more user requests hitting the application server, our application server fails to handle them all, probably due to CPU, memory, and connection limit.
    • We can upgrade the application server to a more powerful machine (vertical scaling).
    • If that's not enough, we can set up more application servers and a load balancer to handle increasing traffic (horizontal scaling and load balancing).
    • We can also set up a CDN to cache static assets (caching).
  3. As more and more user data stored in our database, database query speed slows down.
    • We can upgrade the database server to a more powerful machine (vertical scaling).
    • Read
      • If that's not enough, we can put a cache between the application servers and the database to speed up database read operations (caching).
      • If that's not enough, we can do database replication (database replication).
    • Write
      • If that's not enough, we can shard the database into multiple smaller ones to accelerate write operations (database sharding).
  4. As we add more and more features, the application server fails to handle them all, probably due to CPU and memory limit, or dependency conflicts.
    • We can upgrade the application server to a more powerful machine (vertical scaling).
    • If that's not enough, we can split the application into multiple services and run each of them in a separate server. We also use an API gateway (or even a cluster of gateways) to route the requests to different services (routing).

The scaled client-server application has the following final layout:

Security

System security has become more and more important.

See also

←Previous Next→