How hard could it be to implement an LRU cache in Python?

A reasonably high-performance hash table, check.

The bookkeeping to track the access order, easy.

Here is a naive implementation of an LRU cache in Python:

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.tm = 0
        self.cache = {}
        self.lru = {}

    def get(self, key):
        if key in self.cache:
            self.lru[key] = self.tm
            self.tm += 1
            return self.cache[key]
        return -1

    def set(self, key, value):
        if len(self.cache) >= self.capacity:
            # evict the key with the smallest (oldest) timestamp
            old_key = min(self.lru.keys(), key=lambda k: self.lru[k])
            self.cache.pop(old_key)
            self.lru.pop(old_key)
        self.cache[key] = value
        self.lru[key] = self.tm
        self.tm += 1

We use cache to store the (key, value) mapping, and lru together with the automatically incremented tm to track the access history. Pretty straightforward, right?
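As a quick sanity check, here is a minimal usage sketch; the capacity of 2 and the keys are arbitrary, chosen only to illustrate the eviction:

cache = LRUCache(2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")          # touches "a", so "b" is now the least recently used
cache.set("c", 3)       # capacity reached: "b" gets evicted
print(cache.get("b"))   # -1, "b" has expired
print(cache.get("a"))   # 1, "a" is still cached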

It turns out this implementation performs poorly in a more realistic test case. Here is the profiling result:

python -m cProfile lru-cache-test.py naive-lru-cache
...
lrucache took 1.478 sec
         4180120 function calls in 1.500 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.024    0.024    1.500    1.500 lru-cache-test.py:1(<module>)
        1    0.000    0.000    0.000    0.000 lru-cache-test.py:10(__exit__)
    12558    0.006    0.000    1.458    0.000 lru-cache-test.py:25(set)
     7941    0.003    0.000    0.007    0.000 lru-cache-test.py:28(get)
    20499    0.002    0.000    0.002    0.000 lru-cache-test.py:32(<lambda>)
        1    0.000    0.000    0.000    0.000 lru-cache-test.py:5(Timer)
        1    0.000    0.000    0.000    0.000 lru-cache-test.py:6(__enter__)
     7941    0.004    0.000    0.004    0.000 naive-lru-cache.py:15(get)
    12558    0.022    0.000    1.452    0.000 naive-lru-cache.py:22(set)
  4098048    0.606    0.000    0.606    0.000 naive-lru-cache.py:25(<lambda>)
        1    0.000    0.000    0.000    0.000 naive-lru-cache.py:6(<module>)
        1    0.000    0.000    0.000    0.000 naive-lru-cache.py:8(LRUCache)
        1    0.000    0.000    0.000    0.000 naive-lru-cache.py:9(__init__)
        1    0.007    0.007    0.007    0.007 {__import__}
        1    0.003    0.003    0.005    0.005 {filter}
    12559    0.001    0.000    0.001    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
     2001    0.025    0.000    0.025    0.000 {method 'keys' of 'dict' objects}
     4002    0.001    0.000    0.001    0.000 {method 'pop' of 'dict' objects}
     2001    0.797    0.000    1.403    0.001 {min}
        2    0.000    0.000    0.000    0.000 {time.clock}

It shows that most of the CPU time, 1.403 seconds out of 1.478, is spent on the min operation, or more concretely, on this statement:

old_key = min(self.lru.keys(), key=lambda k: self.lru[k])

We naively identify the least-recently-used item with a linear search, which is O(n) instead of O(1), a clear violation of what set requires. The 4,098,048 calls to the lambda key function in the profile, roughly 2,048 per min call across 2,001 evictions, are exactly this scan.
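To see the scan in isolation, here is a small sketch (not part of the original test; the 2,048 entries only mirror the lambda-to-min call ratio in the profile) that counts how often the key function runs for a single eviction:

# Count how many times the key function runs during one min() scan.
lru = {k: k for k in range(2048)}

calls = 0
def key(k):
    global calls
    calls += 1
    return lru[k]

min(lru.keys(), key=key)
print(calls)  # 2048 -- one call per tracked key, i.e. O(n) per eviction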

In contrast to a traditional hash table, both the get and set operations of an LRU cache are write operations, and the timestamp is merely the order of those operations. So an ordered hash table, a.k.a. OrderedDict, might be able to meet our needs. Here is the LRU cache implementation based on OrderedDict:

import collections

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = collections.OrderedDict()

    def get(self, key):
        try:
            value = self.cache.pop(key)
            self.cache[key] = value
            return value
        except KeyError:
            return -1

    def set(self, key, value):
        try:
            self.cache.pop(key)
        except KeyError:
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)
        self.cache[key] = value
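This leans on two properties of OrderedDict, illustrated below with arbitrary keys: re-inserting a popped key moves it to the tail, and popitem(last=False) removes the head:

import collections

d = collections.OrderedDict([("a", 1), ("b", 2), ("c", 3)])

# "Touch" key "a": pop it and insert it back, which moves it to the tail.
d["a"] = d.pop("a")
print(list(d))                # ['b', 'c', 'a']

# popitem(last=False) removes the head, the least recently touched key.
print(d.popitem(last=False))  # ('b', 2)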

The implementation is much cleaner, as all the order bookkeeping is handled by the OrderedDict now. For each get and set operation, we first pop the item, then insert it back to refresh its position in the order. The element at the head of the sequence is the least-recently-used item, and thus the candidate to expire when the maximum capacity is reached. Here is the profiling result for the sake of comparison: