System Design Compendium

Whether you are preparing for a system design interview or building production infrastructure, the same core concepts keep showing up: how data moves, where it lives, and what happens when things fail. This compendium collects those concepts in one place so you can build a solid mental model of how large-scale systems work.

Topics

  • Defining Software Architecture - the principles, patterns, and trade-offs behind structuring maintainable, scalable software systems.
  • Network Protocols - how machines communicate over networks, covering TCP, UDP, HTTP, and other foundational protocols.
  • Latency and Throughput - the two key performance metrics for any system, what they measure, and why optimizing one does not automatically improve the other.
  • Availability - designing systems that stay operational under failure, including uptime guarantees, redundancy strategies, and SLAs.
  • CAP Theorem - the fundamental trade-off between consistency, availability, and partition tolerance in distributed systems.
  • Caching - storing frequently accessed data closer to the consumer to reduce latency and backend load.
  • Proxies - intermediary servers that sit between clients and backends, enabling load distribution, security, and caching.
  • Load Balancing - distributing incoming traffic across multiple servers to maximize throughput and minimize response times.
  • Hashing - mapping data to fixed-size values for efficient lookups, and how consistent hashing enables scalable distributed systems.
  • SQL vs NoSQL Databases - comparing relational and non-relational databases, their data models, and when to choose each.
  • Specialized Storage Paradigms - purpose-built storage solutions like blob stores, time-series databases, and graph databases for domain-specific workloads.
  • Replication and Sharding - techniques for duplicating and partitioning data across machines to improve reliability and performance.
  • Leader Election - how distributed systems choose a single node to coordinate work, ensuring consistency without conflicts.
  • Peer-To-Peer Networks - decentralized architectures where nodes share resources directly without relying on a central server.
  • Polling and Streaming - two approaches for clients to receive updates from servers, each with different latency and resource trade-offs.
  • Rate Limiting - controlling the number of requests a client can make to protect systems from overload and abuse.
  • Microservices vs Monolith - comparing two architectural styles for structuring applications, with trade-offs in complexity, deployment, and scaling.
  • Request-Response vs Publish-Subscribe - two fundamental communication patterns for services, differing in coupling, scalability, and failure handling.
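As a taste of the material covered, the Hashing entry above mentions consistent hashing. Here is a minimal sketch of a consistent-hash ring with virtual nodes (the node names, replica count, and class layout are illustrative, not taken from any particular library):

```python
import hashlib
from bisect import bisect_right


class ConsistentHashRing:
    """A minimal consistent-hash ring with virtual nodes.

    Each physical node is hashed onto the ring many times ("virtual
    nodes") so that keys spread evenly and removing a node only
    remaps the keys that lived on it.
    """

    def __init__(self, nodes, vnodes=100):
        points = []
        for node in nodes:
            for i in range(vnodes):
                points.append((self._hash(f"{node}#{i}"), node))
        points.sort()
        # Parallel sorted lists: ring positions and their owners.
        self._hashes = [h for h, _ in points]
        self._nodes = [n for _, n in points]

    @staticmethod
    def _hash(key):
        # Stable hash into a large integer space (MD5 is fine here;
        # this is placement, not cryptography).
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_node(self, key):
        """Walk clockwise from the key's position to the next vnode."""
        idx = bisect_right(self._hashes, self._hash(key)) % len(self._hashes)
        return self._nodes[idx]


ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
owner = ring.get_node("user:42")  # same key always maps to the same node
```

The payoff is the remapping property: if `cache-c` is removed, only keys that were on `cache-c` move; every key owned by `cache-a` or `cache-b` keeps its owner, because their ring positions are untouched. With a naive `hash(key) % N` scheme, changing `N` would reshuffle almost every key.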

Suggested Reading Order

If you are new to system design, working through these topics in a structured order will help each concept build on the last.

1. Foundations. Start here to understand the vocabulary and core metrics.

  • Defining Software Architecture
  • Network Protocols
  • Latency and Throughput
  • Availability

2. Data and storage. Learn how data is stored, accessed, and distributed.

  • SQL vs NoSQL Databases
  • Specialized Storage Paradigms
  • Caching
  • Hashing
  • Replication and Sharding

3. Infrastructure and traffic. Understand how requests flow through a system.

  • Proxies
  • Load Balancing
  • Rate Limiting
  • Polling and Streaming

4. Distributed systems. Dig into the harder problems that come with scale.

  • CAP Theorem
  • Leader Election
  • Peer-To-Peer Networks

5. Architecture patterns. Tie it all together with high-level design decisions.

  • Microservices vs Monolith
  • Request-Response vs Publish-Subscribe

You do not need to follow this order strictly. If you already have experience with networking and databases, jumping straight to the distributed systems or architecture sections works fine too.

This post is licensed under CC BY 4.0 by the author.