LRU Cache

The least recently used (LRU) eviction policy defines which entries are removed in order to keep the number of entries in a cache under control.

In computer science, caching is a technique for keeping frequently used items in low-latency memory rather than on high-latency disks or in databases. This allows us to retrieve those items quickly when required and thus improves the application’s overall performance.

But because memory is limited, we can keep only a finite number of elements in it at any given point in time. This requires us to decide on an eviction policy, which removes cached items when there is no more space left for new ones.

The two standard eviction policies are:

  1. Most Recently Used
  2. Least Recently Used

The most recently used eviction policy removes the most recently accessed item from the cache. The idea is that an item that has just been accessed has very little probability of being re-accessed soon.

In contrast, the least recently used eviction policy removes the item that has not been accessed for the longest time, on the assumption that the entries which have gone unused the longest are the least likely to be needed again.

Use Cases for LRU eviction policy-based cache

  1. An LRU cache is used as a standard page replacement algorithm in operating systems.
  2. It is also implemented within web browsers to maintain a local store.
  3. An LRU cache is also used in some implementations of data compression algorithms like LZW.
  4. Disk drivers and controllers also use LRU-based caching.

LinkedHashMap and Access Order

To implement the LRU cache we can use LinkedHashMap, which, at a high level, can be seen as a HashMap combined with a doubly-linked list. The default constructor of LinkedHashMap maintains the insertion order of the entries. Even though the doubly-linked list adds to the total space used by this map implementation (compared to using a singly-linked list), it provides more efficient updates and deletes.

The doubly-linked list (maintained through before and after references on each entry) means that an insertion or deletion is just a matter of updating the respective references. You can refer to this Stack Overflow link for more details.

But how does LinkedHashMap maintain access order? For that, LinkedHashMap provides a dedicated constructor with an additional boolean parameter that enables access ordering. Every time an entry is accessed, it is moved to the tail of the doubly-linked list, so the head of the list always points to the entry that has not been used for the longest time.
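For example, a minimal sketch of creating an access-ordered map (the capacity and load factor values below are just the usual defaults, not requirements):

// The third constructor argument (accessOrder = true) switches the map
// from insertion order to access order.
Map<Integer, String> accessOrdered = new LinkedHashMap<>(16, 0.75f, true);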

Consider a LinkedHashMap with 5 entries {1,2,3,4,5}:

[Illustration: least recently used entry at the head, most recently accessed entry at the tail]

As the illustration above shows, whenever an element is accessed, it is moved to the tail. Since 1 was added first and never accessed again, it is the entry that has gone unused the longest and will therefore be the first candidate for eviction.
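A small, self-contained sketch of this behaviour (the class and variable names here are mine, chosen for illustration):

import java.util.LinkedHashMap;
import java.util.Map;

public class AccessOrderDemo {

   public static void main(String[] args) {
       // accessOrder = true: iteration goes from least to most recently accessed
       Map<Integer, Integer> map = new LinkedHashMap<>(16, 0.75f, true);
       for (int i = 1; i <= 5; i++) {
           map.put(i, i);
       }
       System.out.println(map.keySet()); // [1, 2, 3, 4, 5]

       // Accessing 1 moves it to the tail; 2 now sits at the head
       map.get(1);
       System.out.println(map.keySet()); // [2, 3, 4, 5, 1]
   }
}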

LinkedHashMap as LRU Cache

LinkedHashMap provides a protected method, removeEldestEntry, which returns a boolean value. After every put and putAll invocation, this method is called to decide whether the eldest entry should be evicted.

protected boolean removeEldestEntry(Map.Entry<K,V> eldest) {
   return false;
}

This method does not remove any entry itself; it only returns a boolean indicating whether the eldest entry should be removed. The default implementation always returns false, so we need to override it to make the map behave like an LRU cache.

To do so, we create a subclass of LinkedHashMap that allows only a fixed number of entries and override removeEldestEntry to return true whenever the map size exceeds the maximum allowed size.

import java.util.LinkedHashMap;
import java.util.Map;

public class LruCache<K, V> extends LinkedHashMap<K, V> {

   private final int MAX_ALLOWED_SIZE;
   private static final float LOAD_FACTOR = 0.75f;

   public LruCache(int maxAllowedSize) {
       // accessOrder = true so that iteration (and eviction) follows access order
       super(maxAllowedSize, LOAD_FACTOR, true);
       MAX_ALLOWED_SIZE = maxAllowedSize;
   }

   @Override
   protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
       // Evict the eldest (least recently accessed) entry once the cap is exceeded
       return this.size() > this.MAX_ALLOWED_SIZE;
   }
}

The following Spock test cases can be used to test our implementation:

def "test the fixed size cache does not exceed its pre-defined size"(){
   given: "the cache is initialized with a fixed size"
   def size = 5
   LruCache<String, Integer> cache = new LruCache<>(size)

   when: "elements more than the predefined size are added"
   cache.put("key1", 1)
   cache.put("key2", 2)
   cache.put("key3", 3)
   cache.put("key4", 4)
   cache.put("key5", 5)
   cache.put("key6", 6)

   then: "the cache does not grow past the max allowed size"
   cache.size() == size
}

def "the least recently accessed is kicked out in-case of an overflow"(){
   given: "the cache is initialized with a fixed size"
   def size = 5
   LruCache<String, Integer> cache = new LruCache<>(size)
   and: "elements are added"
   cache.put("key1", 1)
   cache.put("key2", 2)
   cache.put("key3", 3)
   cache.put("key4", 4)
   cache.put("key5", 5)

   when: "two elements are accessed"
   cache.get("key1")
   cache.get("key2")
   and: "a new element is added that will cause an overflow"
   cache.put("key6", 6)

   then: "the element with key: key3 is evicted as it in the front of the queue"
   cache.get("key3") == null
}
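
Outside of Spock, a quick plain-Java sketch of using the same LruCache looks like this (keys and values are arbitrary):

LruCache<String, Integer> cache = new LruCache<>(3);
cache.put("a", 1);
cache.put("b", 2);
cache.put("c", 3);

cache.get("a");         // "a" becomes the most recently used entry
cache.put("d", 4);      // exceeds the cap, so the eldest entry ("b") is evicted

System.out.println(cache.keySet()); // [c, a, d]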

As always, the code mentioned above and the corresponding test cases are available in the GitHub repository.
