Caching
Caching is used to speed up a system by reducing its latency.
Different operations or data transfers take varying amounts of time. Caching is a way to design a system so that it avoids time-consuming operations or data transfers, such as network requests, and instead relies on faster alternatives.
To put it more simply, caching involves storing data in a location different from its original source, allowing for quicker access to the data from this new location.
Caching can be implemented in various parts of a system:
Client-level caching: When data is cached at the client level, the client can skip the request to the server for that data, reducing latency.
Server-level caching: In some cases, the client may need to interact with the server, but the server doesn’t always need to query the database to retrieve data. In this scenario, caching data at the server level can help improve the system’s performance.
Intermediate caching: Caching can also be placed between two components in the system, such as between a server and a database. This approach can further optimize the system by reducing the need for direct communication between components.
Types of Caching: In-memory vs External
In-memory Caching stores data within the same operating system process as the consuming application. This approach provides low-latency access and is ideal for high-speed data access and smaller cache sizes.
External Caching manages the cache in a separate operating system process, either locally or remotely. Solutions like Redis or Memcached offer more scalability and flexibility, making this approach suitable for larger cache sizes, distributed systems, or shared caching across multiple application instances.
Content Delivery Network
When discussing caching, it’s important to mention Content Delivery Networks (CDNs). A CDN is a third-party service that functions as a cache for your web servers. If your servers are located far from users in a particular region, those users may experience slow performance. CDNs operate servers distributed around the globe, commonly referred to as PoPs (Points of Presence), so cached content can be served from a location closer to the user, which typically lowers latency compared to reaching your own servers. Some popular CDNs include Cloudflare and Google Cloud CDN.
When may caching be helpful?
Speed up network operations and avoid redundancy
Caching can be particularly helpful when you want to speed up network operations and avoid performing the same request multiple times. By caching the results of these requests, you can quickly serve the data without accessing the database repeatedly, leading to more efficient network operations.
Accelerate computationally intensive operations
In situations where a server performs a computationally expensive operation in response to a client’s request, caching can be beneficial in speeding up the process. By caching the results of these resource-intensive tasks, you can prevent the need to perform them multiple times, ultimately improving the system’s performance.
Manage operations that occur frequently
Caching is also advantageous when managing operations that are performed very frequently. For example, caching can prevent the database from being overloaded with numerous redundant requests. Even when each individual request is relatively fast, a shared cache can serve the same data to many clients, reducing the overall load on the database.
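As a minimal sketch of the cases above, the following Python snippet memoizes a slow operation in process using functools.lru_cache from the standard library. The function expensive_report and the counter call_count are hypothetical names introduced only for illustration; the loop inside stands in for a slow query or heavy computation.

```python
from functools import lru_cache

call_count = 0  # counts how often the expensive body actually runs


@lru_cache(maxsize=128)
def expensive_report(user_id: int) -> int:
    """Hypothetical stand-in for a slow query or heavy computation."""
    global call_count
    call_count += 1
    return sum(i * i for i in range(user_id * 1000))


first = expensive_report(42)    # computed on the first call
second = expensive_report(42)   # answered from the in-process cache
info = expensive_report.cache_info()  # hit/miss statistics kept by lru_cache
```

The second call returns the stored result without running the function body again, which is the effect a cache aims for with repeated or expensive requests.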
Cache Invalidation
Cache invalidation is a crucial aspect of caching, as it ensures that the data served from the cache remains accurate and up-to-date. In scenarios where data is stored in multiple locations, such as a database and a server-side cache, maintaining consistency between these two sources of truth is essential.
Consider an example where you store posts in a database and also cache them on the server-side. In this case, both the database and the cache contain the same information, creating two sources of truth. The challenge lies in keeping these sources in sync when the data changes.
When a post is updated or deleted, the cached version needs to be invalidated or refreshed to maintain consistency between the cache and the database. There are several cache invalidation strategies to tackle this issue:
- Write-through: Write data to both the cache and the primary storage on every write operation, and treat the write as complete only once both have been updated. This approach ensures that the cache is always up-to-date, but it may lead to increased latency for write operations due to waiting for both cache and primary storage updates.
  - Advantages: Fast retrieval, consistency between cache and primary storage, and no data loss if the cache fails, because every write has already reached primary storage.
  - Disadvantages: Higher latency for write operations.
- Write-back: Write data to the cache first and asynchronously update the primary data storage later. This strategy allows for faster write operations by reducing wait times associated with primary storage updates. However, it may introduce the risk of data loss or inconsistency in case of a system failure before the data is synchronized with the primary storage.
  - Advantages: Faster write operations (no waiting for primary storage updates) and reduced load on primary storage.
  - Disadvantages: Risk of data loss or inconsistency in case of a system failure before synchronization, potential stale data in primary storage, and added complexity for managing data synchronization.
- Write-around: Write data directly to the primary storage, bypassing the cache entirely. The corresponding cache entry (if one exists) is invalidated so that the next read triggers a fresh fetch from the database. This strategy avoids flooding the cache with data that may not be read again soon. The trade-off is that the first read after a write will always be a cache miss.
  - Advantages: Prevents cache pollution from infrequently read data, simpler write path.
  - Disadvantages: Higher read latency on the first access after a write (guaranteed cache miss).
- Cache-aside (Lazy Loading): The application itself manages the cache rather than relying on the cache layer to intercept reads or writes automatically. On a read, the application first checks the cache. If the data is present (cache hit), it is returned directly. If the data is absent (cache miss), the application reads from the primary storage, stores the result in the cache, and then returns it. On a write, the application updates the primary storage and either invalidates or updates the corresponding cache entry.
  - Advantages: Only requested data is cached (no unnecessary cache population), straightforward to implement, and works well with any data store.
  - Disadvantages: Initial requests always result in cache misses (cold start), and the application must handle the cache-management logic explicitly.
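The cache-aside read and write paths described in the last item can be sketched in Python as follows. This is an illustration under stated assumptions, not a production design: PostStore and CacheAsidePosts are hypothetical classes, and plain dictionaries stand in for both the database and the cache.

```python
class PostStore:
    """Hypothetical primary storage; a dict stands in for the database."""

    def __init__(self):
        self.rows = {}
        self.reads = 0  # number of reads that reached primary storage

    def get(self, post_id):
        self.reads += 1
        return self.rows.get(post_id)

    def put(self, post_id, body):
        self.rows[post_id] = body


class CacheAsidePosts:
    """The application code owns the cache logic (cache-aside)."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def read(self, post_id):
        if post_id in self.cache:        # cache hit
            return self.cache[post_id]
        body = self.store.get(post_id)   # cache miss: read primary storage
        if body is not None:
            self.cache[post_id] = body   # populate the cache for next time
        return body

    def write(self, post_id, body):
        self.store.put(post_id, body)    # update primary storage first
        self.cache.pop(post_id, None)    # then invalidate the cached copy


store = PostStore()
posts = CacheAsidePosts(store)
posts.write(1, "hello")
posts.read(1)                      # miss: goes to the store
posts.read(1)                      # hit: served from the cache
reads_before_edit = store.reads
posts.write(1, "edited")           # invalidates the cached entry
after_edit = posts.read(1)         # miss again, returns the new body
```

Invalidating on write, rather than updating the cache in place, keeps the write path simple at the cost of one extra miss on the next read.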
By implementing an appropriate cache invalidation strategy, you can maintain data consistency between the cache and the primary data source, ensuring that your system serves accurate and up-to-date information.
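To make the difference between the write-through, write-back, and write-around strategies concrete, here is a small Python sketch. WritePolicyCache is a hypothetical class; dictionaries stand in for the cache and the primary storage, and flush() is called by hand where a real system would typically synchronize asynchronously.

```python
class WritePolicyCache:
    """Toy cache in front of a dict-backed primary storage (both hypothetical)."""

    def __init__(self):
        self.cache = {}
        self.storage = {}
        self.dirty = set()  # keys written to the cache but not yet to storage

    def write_through(self, key, value):
        # Update both copies before the write is considered complete.
        self.storage[key] = value
        self.cache[key] = value
        self.dirty.discard(key)

    def write_back(self, key, value):
        # Update only the cache now; storage catches up on flush().
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        # Copy pending writes to storage (done manually in this sketch).
        for key in self.dirty:
            self.storage[key] = self.cache[key]
        self.dirty.clear()

    def write_around(self, key, value):
        # Update storage only and drop any cached copy.
        self.storage[key] = value
        self.cache.pop(key, None)
        self.dirty.discard(key)


c = WritePolicyCache()
c.write_through("a", 1)              # cache and storage both hold a=1
c.write_back("b", 2)                 # only the cache holds b=2 so far
pending = "b" not in c.storage       # storage has not seen the write yet
c.flush()                            # storage now holds b=2
c.write_around("a", 3)               # storage holds a=3, cached "a" is dropped
```

The window between write_back and flush is where the data-loss risk mentioned above lives: if the cache were lost in that window, the write to "b" would never reach storage.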
Time-to-Live (TTL)
Regardless of which caching strategy you choose, setting a Time-to-Live (TTL) on cached entries is an essential practice. A TTL defines how long a cached value remains valid before it is automatically expired and evicted.
Setting the right TTL requires balancing data freshness against performance. A short TTL (e.g., a few seconds) ensures that stale data is served only briefly, which is important for rapidly changing data like stock prices or session tokens. A longer TTL (e.g., minutes or hours) reduces the load on the primary data store and improves cache hit rates, which works well for data that changes infrequently, such as product catalog information or user profile images.
In practice, many systems use a combination of TTLs across different data types. TTLs also serve as a safety net: even if an explicit invalidation is missed due to a bug or a race condition, the stale entry will eventually expire on its own.
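A TTL can be sketched in a few lines of Python. TTLCache is a hypothetical class written for illustration; it expires entries lazily on read, and it accepts an injectable clock (an assumption made here so the example can advance time without sleeping).

```python
import time


class TTLCache:
    """Minimal cache whose entries expire ttl_seconds after being set."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (value, expires_at)

    def set(self, key, value):
        self.entries[key] = (value, self.clock() + self.ttl)

    def get(self, key, default=None):
        item = self.entries.get(key)
        if item is None:
            return default
        value, expires_at = item
        if self.clock() >= expires_at:
            del self.entries[key]  # expired: remove lazily on read
            return default
        return value


now = [0.0]  # fake clock so the example does not need to sleep
cache = TTLCache(ttl_seconds=30, clock=lambda: now[0])
cache.set("price:ACME", 101.5)
fresh = cache.get("price:ACME")    # still within the TTL
now[0] = 31.0                      # advance the fake clock past the TTL
expired = cache.get("price:ACME")  # entry has expired, default is returned
```

Because expiry happens here only when a key is read, entries that are never read again stay in memory; real caches usually combine this with background cleanup or an eviction policy.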
Cache Eviction Policy
A cache eviction policy determines how values are removed from a cache when it reaches capacity. Eviction is necessary to maintain cache efficiency, free up space for new data, and ensure that the most relevant information is stored. The most commonly used eviction policies are:
LRU (Least Recently Used): Evicts the entry that has not been accessed for the longest time. LRU works well for workloads where recently accessed data is likely to be accessed again soon. It is widely supported: Memcached evicts with an LRU-based policy by default, and Redis offers an approximated LRU as a configurable policy (for example allkeys-lru), although its default setting is noeviction.
LFU (Least Frequently Used): Evicts the entry with the fewest total accesses. LFU favors data that is accessed consistently over time, even if it was not accessed very recently. It works well when some items are “popular” and should remain cached regardless of short gaps between accesses. The downside is that entries with historically high access counts can become stale yet remain cached for a long time.
FIFO (First In, First Out): Evicts the oldest entry in the cache, regardless of how recently or frequently it was accessed. FIFO is the simplest policy to implement and has predictable behavior, but it does not consider access patterns at all. It can be a reasonable choice when all cached items have roughly equal likelihood of being accessed.
Each policy offers a different approach to prioritizing which data should be evicted. Choosing the right one depends on the access patterns of your application. In many cases, LRU provides a good default balance, but workloads with distinct hot and cold data may benefit from LFU, while simple time-bounded caches may work well with FIFO.
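The three policies can be compared side by side with a small Python sketch. BoundedCache and evicted_key are hypothetical names introduced for illustration; the implementation favors clarity over speed (the LFU victim search is linear, for instance) and is not how Redis or Memcached implement eviction.

```python
from collections import Counter, OrderedDict


class BoundedCache:
    """Toy fixed-capacity cache; policy is 'lru', 'lfu', or 'fifo'."""

    def __init__(self, capacity, policy="lru"):
        self.capacity = capacity
        self.policy = policy
        self.data = OrderedDict()  # insertion order, or recency order for LRU
        self.hits = Counter()      # access counts, used by LFU

    def get(self, key):
        if key not in self.data:
            return None
        self.hits[key] += 1
        if self.policy == "lru":
            self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data[key] = value
            if self.policy == "lru":
                self.data.move_to_end(key)
            return
        if len(self.data) >= self.capacity:
            self._evict()
        self.data[key] = value
        self.hits[key] = 0

    def _evict(self):
        if self.policy == "lfu":
            victim = min(self.data, key=lambda k: self.hits[k])
        else:
            # First key is the oldest insert (FIFO) or least recent use (LRU).
            victim = next(iter(self.data))
        del self.data[victim]
        del self.hits[victim]


def evicted_key(policy):
    cache = BoundedCache(capacity=3, policy=policy)
    for key in ("a", "b", "c"):
        cache.put(key, key.upper())
    for key in ("b", "b", "a", "a", "c"):  # "b" least recent, "c" least frequent
        cache.get(key)
    cache.put("d", "D")                     # cache is full: one entry is evicted
    return ({"a", "b", "c"} - set(cache.data)).pop()


results = {policy: evicted_key(policy) for policy in ("lru", "lfu", "fifo")}
```

With this access sequence each policy removes a different entry: LRU evicts "b" (least recently accessed), LFU evicts "c" (fewest accesses), and FIFO evicts "a" (inserted first).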