Smallest Cache with Raku

[60] Published 1. March 2020.

This is my response to the Perl Weekly Challenge #49.

Challenge #49.1: Smallest Multiple

Write a script to accept a positive number as command line argument and print the smallest multiple of the given number consists of digits 0 and 1.



For example: For given number 55, the smallest multiple is 110 consisting of digits 0 and 1.

Brute Force

Let us try a brute force approach, where we start with the given number trying multiples of it until we get a number consisting of zeroes and ones only.

This is not the best way of doing this, so scroll down to the «Binary First» section if you don't want the scenic detour.

subset binary of Int where { /^<[ 0 1 ]>+$/ }; # [1] unit sub MAIN (Int $number, :$verbose); # [2] my $current = $number; # [3] loop # [4] { $current += $number; # [5] say ": Checking $current" if $verbose; if $current ~~ binary # [6] { say "Match: $current"; # [6a] exit; # [6b] } }

File: smallmull-add

[1] We need a way of deciding if the number consists of only zeroes and ones. A custom type, declared with «subset», is one way of doing it. The innermost part ( <[ 0 1 ]> ) is a Character Group with the digits 0 and 1 only. The + says that we require it 1 or more times, the inital ^ is an anchor to the start of the string, and $ is an anchor to the end of the string. So the pattern must contain these two digits only, at least once, and not anything else.

[2] I use the magical MAIN procedure to catch the arguments. Activate verbose mode with «--verbose» on the command line.

[3] The current value, starting with the user specified value.

[4] In an eternal loop,

[5] • Add the original value to the current one (in effect increasing the multiple by one, but addition should be cheaper (faster) than mutiplication).

[6] • Do we have a match; testing with the Smartmatch operator ~~ ? If yes, say so [6a] and exit [6b]. (Note that we could have used last instead of exit here.)

Running it:

$ raku smallmull-add 1 Match: 10 $ raku smallmull-add 2 Match: 10 $ raku smallmull-add 3 Match: 111 $ raku smallmull-add 4 Match: 100 $ raku smallmull-add 5 Match: 10 $ raku smallmull-add 6 Match: 1110 $ raku smallmull-add 7 Match: 1001 $ raku smallmull-add 8 Match: 1000 $ raku smallmull-add 9 Match: 111111111

The last one used about 28 minutes on my pc. That is not very impressive.

We can have a go at some additional values, with verbose mode:

$ raku /smallmull-add --verbose 10 : Checking 20 : Checking 30 : Checking 40 : Checking 50 : Checking 60 : Checking 70 : Checking 80 : Checking 90 : Checking 100 Match: 100 $ raku smallmull-add --verbose 11 : Checking 22 : Checking 33 : Checking 44 : Checking 55 : Checking 66 : Checking 77 : Checking 88 : Checking 99 : Checking 110 Match: 110

Note that most multiples of 9 (e.g. 9, 18, 27) are hard to compute.

We can speed it up considerably by skipping sequences that cannot match. E.g. if we get to "1020" or "1021" we skip to "1100" (or rather 1100, rounded down to the nearest multiple of the given number).

Feel free to scroll down to the «Binary First» section if you don't want the scary part of the scenic detour.

subset binary of Int where { /^+$/ }; unit sub MAIN (Int $number, :$verbose); my $current = $number; loop { $current += $number; say ": Checking $current" if $verbose; if $current ~~ binary { say "Match: $current"; exit; } next if $current.chars < 2; my @digits = $current.comb.reverse; # [1] for 0 .. @digits.end -> $index # [2] { if @digits[$index] > 1 # [2a] { @digits[$index +1]++; # [1a] @digits[0 .. $index] = (0 xx $index +1); # [2b] } } my $new = @digits.reverse.join; # [1] $current = $new - ( $new % $number ); # [3] }

File: smallmull-add-smart

The code is complicated, but it seems to works. (Unless we do comprehensive testing.)

[1] The double reverse are there so that we can get a new digit added to the number without problems [1a]. If the last one is at index 6, accessing a non-exising one at index 7 gives something that is coerced to zero when we add 1 to it. Adding new digits is normally done up front, so without the reverse the first digit would be at index 0. Accessing a new digit at index -1 would fail.

[2] Iterate over the digits (with the least significant one first). If the digit is higher than 1 [2a], increase the next (higer) digit with 1 [1a] and reset the remaining digits (from the current one to the end) to 0 [2b].

[3] Round the resulting number down to the nearest multiple of the original input.

$ raku smallmull-add-smart 9 Match: 111111111

This one used about 38 seconds. That is a very impressive speed up, but still not good enough.

Running it with verbose mode shows that it skips a lot of numbers:

$ raku smallmull-add-smart --verbose 9 : Checking 18 : Checking 108 : Checking 117 : Checking 1008 : Checking 1017 : Checking 1107 : Checking 1116 : Checking 10008 ...

Binary First

unit sub MAIN (Int $number, :$verbose); for 1 .. Inf -> $binary # [1] { my $current = $binary.base(2); # [2] say ": Checking $current" if $verbose; if ($current > $number && $current %% $number) # [3] { say "Match: $current"; exit; } }

A far better approach is starting with the binary numbers, checking if they (taken as a decimal value) are divisible by the given number: File: smallmull

[1] An infinite loop. We start with 1 (and not $number , as they are uncomparable).

[2] Convert the decimal number to a binary number.

[3] Check if the new number is divisible (with the %% Divisibility Operator). The first check is there to prevent the number itself from triggering a match.

Running it on the problematic number 9:

$ raku smallmull 9 Match: 111111111

This one used about 0.3 seconds on my pc. That is good.

It is tedious to run it manually for several values, so I wrote a version that can compute several values:

unit sub MAIN (Int $number, :$verbose, :$upto); # [1] $upto ?? ( show($_) for 2 .. $number ) # [2a] !! show($number); # [2b] sub show ($number) { for 1 .. Inf -> $binary { my $current = $binary.base(2); say ": Checking $current" if $verbose; if ($current > $number && $current %% $number) { say $upto # [3] ?? "$number Match: $current" # [3a] !! "Match: $current"; last; # [4] } } }

File: smallmull-upto

[1] Use the «--upto» named argument to activate «upto» mode.

[2] If «uptpo» mode, show the match for all values up to the specified number [2a]. If not, use the specified number only [2b].

[3] If «upto», show the number itself as well [3a].

[4] I had to use last instead of exit , so that the program doesn't terminate after the first iteration.

Running it:

$ raku smallmull-upto 9 Match: 111111111 $ raku smallmull-upto --upto 100 2 Match: 10 3 Match: 111 4 Match: 100 5 Match: 10 6 Match: 1110 7 Match: 1001 8 Match: 1000 9 Match: 111111111 10 Match: 100 11 Match: 110 12 Match: 11100 13 Match: 1001 14 Match: 10010 15 Match: 1110 16 Match: 10000 17 Match: 11101 18 Match: 1111111110 19 Match: 11001 20 Match: 100 21 Match: 10101 22 Match: 110 23 Match: 110101 24 Match: 111000 25 Match: 100 26 Match: 10010 27 Match: 1101111111 28 Match: 100100 29 Match: 1101101 30 Match: 1110 31 Match: 111011 32 Match: 100000 33 Match: 111111 34 Match: 111010 35 Match: 10010 36 Match: 11111111100 37 Match: 111 38 Match: 110010 39 Match: 10101 40 Match: 1000 41 Match: 11111 42 Match: 101010 43 Match: 1101101 44 Match: 1100 45 Match: 1111111110 46 Match: 1101010 47 Match: 10011 48 Match: 1110000 49 Match: 1100001 50 Match: 100 51 Match: 100011 52 Match: 100100 53 Match: 100011 54 Match: 11011111110 55 Match: 110 56 Match: 1001000 57 Match: 11001 58 Match: 11011010 59 Match: 11011111 60 Match: 11100 61 Match: 100101 62 Match: 1110110 63 Match: 1111011111 64 Match: 1000000 65 Match: 10010 66 Match: 1111110 67 Match: 1101011 68 Match: 1110100 69 Match: 10000101 70 Match: 10010 71 Match: 10011 72 Match: 111111111000 73 Match: 10001 74 Match: 1110 75 Match: 11100 76 Match: 1100100 77 Match: 1001 78 Match: 101010 79 Match: 10010011 80 Match: 10000 81 Match: 1111111101 82 Match: 111110 83 Match: 101011 84 Match: 1010100 85 Match: 111010 86 Match: 11011010 87 Match: 11010111 88 Match: 11000 89 Match: 11010101 90 Match: 1111111110 91 Match: 1001 92 Match: 11010100 93 Match: 10000011 94 Match: 100110 95 Match: 110010 96 Match: 11100000 97 Match: 11100001 98 Match: 11000010 99 Match: 111111111111111111 100 Match: 1000

This one used about 4.2 seconds. Pretty good.

Note how the multiples of 9 (9, 18, 27, 36, 54, 63, 72, 81, 90 and 99) stand out, as I stated earlier (without proof).

Why Binary is Faster

The initial program «smallmull-add» (where we added the number itself ad infinitum) has 12.345.679 (111111111/9) binary checks.

The amended program «smallmull-add-smart» comes out at only 255 checks (run the program in verbose mode and pipe it through «wc»), but has a lot of overhead.

The binary program «smallmull» has only 511 computations ( "111111111".parse-base(2) ), a reduction of about 99.996% compared to the initial one. That is impressive. Note that the program is faster (and easier on a code reviewer) than the second one, even if it has doubled the number of tests, as it doesn't have all the number testing and skipping.

Let us consider 9, which gives us 111111111 as answer.

Note that «smallmull-add-smart» works with 9, 18 and 27. But it does not work with e.g. 4. The program has half the checks than the binary program, and obviously misses the answer along the way. That is a serious problem. Also note that it doesn't work with 36, so it doesn't even get the multiples of 9 right. The binary version works, so I will not investigate this further. (Feel free to see if you can sort out the problem, though. You'll basically just end up with convoluted (or fancy) code to replicate the binary numbers.)

Challenge #49.2: LRU Cache

Write a script to demonstrate LRU Cache feature. It should support operations get and set. Accept the capacity of the LRU Cache as command line argument.



Definition of LRU: An access to an item is defined as a get or a set operati on of the item. “Least recently used” item is the one with the oldest access time.



For example: capacity = 3 set(1, 3) set(2, 5) set(3, 7) Cache at this point: [Least recently used] 1,2,3 [most recently used] get(2) # returns 5 Cache looks like now: [Least recently used] 1,3,2 [most recently used] get(1) # returns 3 Cache looks like now: [Least recently used] 3,2,1 [most recently used] get(4) # returns -1 Cache unchanged: [Least recently used] 3,2,1 [most recently used] set(4, 9) Cache is full, so pushes out key = 3: [Least recently used] 2,1,4 [most recently used] get(3) # returns -1 I found this amazing post talking about LRU Cache.

This cache is a good candidate for a class. I'll start with the program, and present the class afterwards:

use lib 'lib'; # [1] use Cache::LRU; # [1a] unit sub MAIN (Int $limit = 3, :$verbose); # [2] say ": Cache size: $limit" if $verbose; my $lru = Cache::LRU.new(:$limit, :$verbose); # [3] loop # [4] { given prompt "Command (h for help): " # [5] { when /^c[apacity]?$/ { say $lru.get-limit; } # [6] when /^c[apacity]? \s+ [\= \s+]? (\d+) $/ { $lru.set-limit($0.Int); } # [7] when /^s[et]?\((\d+) \, \s* (\d+) \) $/ { $lru.set($0.Int, $1.Int); } # [8] when /^s[et]? \s+ (\d+) \, \s* (\d+) $/ { $lru.set($0.Int, $1.Int); } # [8a] when /^g[et]?\((\d+) \)$/ { say $lru.get($0.Int); } # [9] when /^g[et]? \s+ (\d+)$/ { say $lru.get($0.Int); } # [9a] when 'd'|'dump' { $lru.dump if $lru.head; } # [10] when 'data' { $lru.dump2 if $lru.head; } # [11] when 'h'|'help' { help; } # [12] when 'quit'|'q' { exit; } # [13] default { say "- Illegal command. Use 'h' for help"; } # [14] } } sub help { ; # [15] }

File: lru-cache

[1] This line makes it possible to place the module in the specified location, without installing it. I have chosen a realistic name for the module [1a].

[2] The cache size defaults to 3, if not specified on the command line.

[3] Set up the Cache object. Note the colon syntax for passing on named arguments with the same variable name. ( :$verbose is short for verbose => $verbose .)

[4] An eternal loop. See [13] for the exit strategy.

[5] I could have used if / elsif / else , but given / when gives more compact code. prompt displays the specified text, and waits for user input.

[6] The command «capacity» (and the short form «c») shows the capacity (or size) of the buffer.

[7] Specify an integer value after «capacity» to change the capacity (or size) of the buffer. The «=» part is optional, so all of «capacity = 12», «capacity 12», «c = 12» and «c 12» work. (The newline here and in [9] and [10] are only there because of the article with.)

[8] Type «set(1, 11)» to add the value «11» to the cach entry with id «1». These forms are also accepted: «set(1,11)», «s(1, 11)», «set(1,11)», «s(1,11)», «set 1, 11», «s 1, 11», «set 1,11» and «s 1,11».

[9] Type «get(1)» to retrieve the value with id «1» in the cache. The «get 1» form is also accepted.

[10] Type «dump» or «d» to see the cache entries, with the most recently used first.

[11] Type «data» to show the internal Raku data structure, starting with the most recently used.

[12] Type «help» or «h» to see the help message. Note that it is placed in the program (a procedure) and not in the class (a method), as the commands are part of the program.

[13] Type «quit» or «q» to exit the program.

[14] If the tests above fails. default is the gather / take version of the else in if / elsif blocks.

[15] I have to confess that I haven't bothered writing this one...

The Module

I have made a module of the entire cache infrastructure, to encapsulate the supporting data structures (head and tail pointers, the key lookup hash).

The elements themselves have a pointer to the higher and lower cache entries, so require a class of their own. I have placed this second class inside the first one, to hide it - and prevent the program from beeing able to use it directly. It can only be accessed via the «Cache::LRU» module.

class Cache::LRU # [1] { has $.limit is rw = 3; # [2] has $.verbose = False; # [3] has %.lookup is rw = (); # [4] has $.head is rw = Any; # [5] has $.tail is rw = Any; # [6] class LRU-elem # [7] { has $.id; # [8] has $.payload; # [9] has $.higher is rw; # [10] has $.lower is rw; # [11] }

File: lib/Cache/LRU.rakumod (partial)

[1] The cache module, which is all the user should care about.

[2] The cache capacity (or maximum size). This value can be changed dynamically on a cache in use.

[3] We can enable verbose mode on cache creation. (The missing is rw means that it cannot be changed later on.) Note that I haven't really added verbose output.

[4] The hash lookup, where we get the cache object for the given key.

[5] The head of the cache; the most recently used item.

[6] The tail of the cache; the least recently used item.

[7] The class for the individual cache elemts.

[8] • an ID.

[9] • the data (payload).

[10] • a pointer to the higher element.

[11] • a pointer to the lower element.

Note the is rw on the $.higher and $.lower pointers, as they will change values when we access the cache. Also note the missing is rw on $.payload , as I have chosen to make the value read only. If we change the value of an existing cache entry, the module will delete the existing object and then create a new one.

The code handling the size of the cache:

method get-limit # [12] { return self.limit; # [12] } method set-limit (Int $new-limit) # [13] { return if $new-limit < 1; # [14] return if $new-limit == self.limit; # [15] self.limit = $new-limit; # [16] self.shrink while self.lookup.keys.elems > self.limit; # [17] say ": New Cache Capacity: { self.limit }" if self.verbose; } method shrink # [18] { self.lookup{self.tail.id} :delete; # [19] self.tail = self.tail.higher; # [20] self.tail.lower = Any; # [21] }

File: lib/Cache/LRU.rakumod (partial)

[12] Get the current limit of the cache. Note that I haven't added a method to get the current size (the number of elements in the cache).

[13] Set a new limit (as an Int) for the cache;

[14] • Do nothing if the new size is less than 2. This stops negative numbers, 0 and 1.

[15] • Do nothing if the new size if the same as the current one.

[16] Set the new limit.

[17] If the new limit is smaller than the actual number of items in the cache, remove the last one one at a time until we reach the limit.

[18] This method removes tha last entry in the cache;

[19] • Remove the last entry (the tail) from the hash.

[20] • Set the tail pointer to the second last entry.

[21] • Remove the pointer from the new tail element. The result is that the old tail has a link count of 0, and is lost.

Then an internal method to extract (and remove) a given item from the cache. The element is not destroyed, so it can be inserted again, e.g. as a new head. The method is available for users of the module through the «delete» wrapper, as that one takes care of the hash entry (ensuring data consistency).

method !extract ($elem) # [22] { $elem.higher # [23] ?? ( $elem.higher.lower = $elem.lower ) # [23a] !! ( self.head = $elem.lower ); # [23b] $elem.lower # [24] ?? ( $elem.lower.higher = $elem.higher ) # [24a] !! ( self.tail = $elem.higher ); # [24b] $elem.higher = $elem.lower = Any; # [25] return $elem; # [26] } method delete ($elem) # [27] { self.lookup{$elem.id}:delete; # [27a] return self!extract($elem); # [27b] }

File: lib/Cache/LRU.rakumod (partial)

[22] This is a private method (specified by the leading ! ), only available form inside the module itself. It extracts an element from the cache, and takes care of all the pointers (except the «lookup» hash).

[23] Does this element have a higher one? If yes, bypass the current one. If not, set the lower one as the new head.

[24] Does this element have a lower one? If yes, bypass the current one. If not, set the higher one as the new tail.

[25] Remove the pointers from the extracted element.

[26] Return it.

[27] Extract an element [27a], and remove it from the hash [27b].

The method that gets a value:

method get (Int $id) # [28] { say ": get $id " if self.verbose; my $current = self.lookup{$id}; # [29] return -1 unless $current; # [30] return $current.payload if $current === self.head; # [31] self!extract($current); # [32] self.head.higher = $current; # [33] $current.lower = self.head; # [34] self.head = $current; # [35] return $current.payload; # [36] }

File: lib/Cache/LRU.rakumod (partial)

[28] Note the Int restriction on the ID. That complies with the challenge, but may be too restrictive.

[29] Get the element, given the ID.

[30] Return -1 if it doesn't exist.

[31] Return the value (the payload) if the element already is the head.

[32] Extract the element.

[33-35] Insert the «$current» element as the new head, before the current values:

[33] Set the current head's head to point to this one.

[34] Set this element's lower to point to the current head.

[35] Set the head to point to the current one.

[36] Return the value (the payload).

method set (Int $id, $value) # [37] { say ": set $id @ $value" if self.verbose; my $old-head = self.head; # [38] self.head = LRU-elem.new(id => $id, payload => $value); # [39] self.head.lower = $old-head; # [40] $old-head.higher = self.head if $old-head; # [41] if self.lookup{$id} # [42] { self!extract(self.lookup{$id}); # [43] } else # [44] { self.tail = self.head unless self.tail; # [45] self.shrink if self.lookup.keys == self.limit; # [46] } self.lookup{$id} = self.head; # [47] } }

File: lib/Cache/LRU.rakumod (partial)

[37] Note the Int restriction on the ID.

[38] Take note of the current head.

[39] Create a new object, and set it as the new head.

[40] Set the new object's lower to point to the old head.

[41] Set the old head's upper to pint to the new head.

[42] If the object exist already;

[43] • remove that one.

[44] If not (i.e. it is a new ID);

[45] This one takes care of the very first element added to an empty cache, as the single element is both the head and tail.

[46] Remove the tail element if we have one element too many in the cache. (We haven't added the new one to the «lookup» hash yet, so we check for the limit.)

[47] Set up the «lookup» entry.

And finally the methods showing the data structure:

method dump ($elem = self.head) # [48] { print $elem.id ~ " -> " ~ $elem.payload; # [49] print " (head)" if $elem === self.head; # [49a] print " (tail)" if $elem === self.tail; # [49b] say ""; self.dump($elem.lower) if $elem.lower; # [50] } method dump2 ($elem = self.head) # [51] { say $elem; # [52] }

File: lib/Cache/LRU.rakumod (partial)

[48] A recursive method, that starts with the head and prints the ID and value (payload) for each cache entry all the way to the tail.

[49] The output. Prefix with a note if ut is the head or tail (or both).

[50] The recursive part.

[51] Dump the data structure as a Raku data structure.

[52] This one does the job.

Running it:

$ raku lru-cache Command (h for help): s 1,11 Command (h for help): d 1 -> 11 (head) (tail) Command (h for help): s 2,22 Command (h for help): d 2 -> 22 (head) 1 -> 11 (tail) Command (h for help): s 3,33 Command (h for help): d 3 -> 33 (head) 2 -> 22 1 -> 11 (tail) Command (h for help): s 4,44 Command (h for help): d 4 -> 44 (head) 3 -> 33 2 -> 22 (tail) Command (h for help): s 4,444 Command (h for help): d 4 -> 444 (head) 3 -> 33 2 -> 22 (tail) Command (h for help): s 3,333 Command (h for help): d 3 -> 333 (head) 4 -> 444 2 -> 22 (tail) Command (h for help): s 2,222 Command (h for help): d 2 -> 222 (head) 3 -> 333 4 -> 444 (tail) Command (h for help): c 3 Command (h for help): c 2 Command (h for help): d 2 -> 222 (head) 3 -> 333 (tail) Command (h for help): c 3 Command (h for help): s 9,999 Command (h for help): d 9 -> 999 (head) 2 -> 222 3 -> 333 (tail) Command (h for help): data (my \Cache::LRU::LRU-elem_94053912773264 = Cache::LRU::LRU-elem.new (id => 9, payload => 999, higher => Any, lower => (my \Cache::LRU::LRU-elem_94053912773200 = Cache::LRU::LRU-elem.new (id => 2, payload => 222, higher => Cache::LRU::LRU-elem_94053912773264, lower => Cache::LRU::LRU-elem.new (id => 3, payload => 333, higher => Cache::LRU::LRU-elem_94053912773200, lower => Any ) ) ) ) ) Command (h for help): q

I have inserted a liberal amount of newlines and indentation on the last one.

Running it with the input given in the challenge (using the exact verbose syntax):

$ raku lru-cache Command (h for help): capacity 3 Command (h for help): set(1, 3) Command (h for help): set(2, 5) Command (h for help): set(3, 7) Command (h for help): d 3 -> 7 (head) 2 -> 5 1 -> 3 (tail) Command (h for help): get(2) 5 Command (h for help): d 2 -> 5 (head) 3 -> 7 1 -> 3 (tail) Command (h for help): get(1) 3 Command (h for help): d 1 -> 3 (head) 2 -> 5 3 -> 7 (tail) Command (h for help): get(4) -1 Command (h for help): d 1 -> 3 (head) 2 -> 5 3 -> 7 (tail) Command (h for help): set(4,9) Command (h for help): d 4 -> 9 (head) 1 -> 3 2 -> 5 (tail) Command (h for help): get(3) -1 Command (h for help): q

And that's it.