The Raku Instance Bar

[40] Published 2. November 2019

This is my response to the Perl Weekly Challenge #32.

Challenge 32.1: Count Instances

Contributed by Neil Bowers



Create a script that either reads standard input or one or more files specified on the command-line. Count the number of times and then print a summary, sorted by the count of each entry.



So with the following input in file example.txt



    apple
    banana
    apple
    cherry
    cherry
    apple

the script would display something like:



    apple   3
    cherry  2
    banana  1

For extra credit, add a -csv option to your script, which would generate:



    apple,3
    cherry,2
    banana,1

I'll start with a file only version, and look into the standard input support later:

    unit sub MAIN ($file where $file.IO.f && $file.IO.r = "example.txt");
    # [1] «where $file.IO.f && $file.IO.r»
    # [2] «= "example.txt"»

    my %input = $file.IO.lines.Bag;                                       # [3]

    say "$_\t%input{$_}" for %input.keys.sort( { %input{$^b} <=> %input{$^a} }); # [6]
    # [4] «%input.keys»
    # [5] «.sort( { %input{$^b} <=> %input{$^a} } )»

File: line-counter-file

[1] The input is a filename, and the file must exist ( IO.f ) and be readable ( IO.r ).

[2] If we don't specify a file name, use the default value.

[3] We read in the entire file ( .IO.lines ), and turn the resulting list of lines into a Bag. A Bag is a version of a hash, where the distinct values in the list end up as the keys, and the values are the number of times they occurred in the list. So we don't have to count them ourselves.

[4] We iterate over the keys (in the Bag).

[5] A Bag is an unsorted data structure, so the default behaviour is returning the keys in random order. I have chosen to sort them on the values (the counters). I have reversed the placeholder variables ( $^a and $^b ) to get the largest value first.

[6] Print the keys and values, with a tab added for cheap tabulation. Note that this will not work out if the keys differ greatly in size.

See docs.perl6.org/type/Bag for more information about the Bag type.
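To see the counting in isolation, here is a minimal sketch that builds a Bag from an in-memory list instead of a file. The list @lines is a made-up stand-in for the content of example.txt, and the variable names are mine; the calls are the same ones the program above uses.

```raku
# Hypothetical stand-in for the lines of example.txt.
my @lines = <apple banana apple cherry cherry apple>;

# Coercing the list to a Bag counts each distinct value for us.
my $bag = @lines.Bag;

say $bag<apple>;     # 3
say $bag<durian>;    # 0 (a key that is not in the Bag has a count of zero)

# Sort the keys on their counts, largest first.
my @by-count = $bag.keys.sort({ $bag{$^b} <=> $bag{$^a} });
say @by-count.join(" ");    # apple cherry banana
```

The order is stable here only because the three counts differ; keys with equal counts can come out in any order.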

Running it, with the sample file given in the challenge (either explicitly or implicitly), gives the expected result:

    $ raku line-counter example.txt
    apple   3
    cherry  2
    banana  1

    $ raku line-counter
    apple   3
    cherry  2
    banana  1

Standard Input

This is quite straightforward with the special (dynamic) variable $*ARGFILES:

    my %input = $*ARGFILES.lines.Bag;

    say "$_:\t%input{$_}" for %input.keys.sort( { %input{$^b} <=> %input{$^a} });

File: line-counter-argfiles

Running it:

    $ raku line-counter-argfiles example.txt
    apple:  3
    cherry: 2
    banana: 1

    $ cat example.txt example.txt | raku line-counter-argfiles
    apple:  6
    cherry: 4
    banana: 2

$*ARGFILES gives us the content of the files specified on the command line, if any, and whatever arrives on standard input otherwise. If we specify neither, it waits for user input which we can enter until we press «Control-d». This isn't very user friendly.

Note that if we wrap $*ARGFILES in a MAIN procedure, it will ignore any files specified on the command line, and insist on standard input. See docs.perl6.org/type/IO::ArgFiles#$*ARGFILES for details. This was added in Raku version 6.d, so earlier versions (which you definitely shouldn't use) treat $*ARGFILES the same inside and outside of MAIN.

Besides the seemingly mute mode (when we neither specify a file nor pipe anything to standard input), we also lost the default filename when we ditched MAIN. Let's rectify both:

    multi sub MAIN ($file where $file.IO.f && $file.IO.r = "example.txt")
    {
      say "[file]";
      my %input = $file.IO.lines.Bag;
      say "$_:\t%input{$_}" for %input.keys.sort( { %input{$^b} <=> %input{$^a} });
    }

    multi sub MAIN ()
    {
      say "[argfiles]";
      my %input = $*ARGFILES.lines.Bag;
      say "$_:\t%input{$_}" for %input.keys.sort( { %input{$^b} <=> %input{$^a} });
    }

File: line-counter-main

Yes, the program is silly as it replicates way too much code. But it doesn't work as intended, and that is a greater concern. The second «multi MAIN» is never called. That is because the first one also accepts a call without arguments, and uses the default filename in that case.

The solution is probably to go back to «line-counter-argfiles», and accept the problem of no default filename. (And we can probably get away with that, as the challenge doesn't specify what to do in such a situation.)

CSV

Except that the CSV option screams out for a «MAIN» to handle the optional argument...

So let's do that, but factor out the duplicated code to a new procedure:

    multi sub MAIN ($file where $file.IO.f && $file.IO.r, :$csv = False)
    {
      line-counter($file.IO.lines.Bag, $csv);
    }

    multi sub MAIN (:$csv = False)
    {
      line-counter($*ARGFILES.lines.Bag, $csv);
    }

    sub line-counter (%input, $csv)
    {
      my $separator = $csv ?? "," !! "\t";
      say "$_$separator%input{$_}" for %input.keys.sort( { %input{$^b} <=> %input{$^a} });
    }

File: line-counter-csv

Running it:

    $ raku line-counter-csv example.txt
    apple   3
    cherry  2
    banana  1

    $ raku line-counter-csv --csv example.txt
    apple,3
    cherry,2
    banana,1

    $ raku line-counter-csv -csv example.txt
    apple,3
    cherry,2
    banana,1

    $ cat example.txt example.txt | raku line-counter-csv
    apple   6
    cherry  4
    banana  2

    $ cat example.txt example.txt | raku line-counter-csv -csv
    apple,6
    cherry,4
    banana,2

    $ raku line-counter-csv
    ^C

Note that named Boolean arguments can be specified with either one or two leading hyphens; e.g. «-csv» and «--csv».

The last example seemingly hangs as we haven't specified any files or given it any input. So I killed the program with «Control-c».

Sorted, but not sorted

Note that the nice sort by value will not know what to do if we have two or more identical values, so they will come out in random order. As shown here, with «cherry» and «junkfood» swapping places between two runs:

    apple
    banana
    apple
    cherry
    cherry
    apple
    junkfood
    junkfood

File: example2.txt

    $ raku line-counter-csv example2.txt
    apple   3
    junkfood        2
    cherry  2
    banana  1

    $ raku line-counter-csv example2.txt
    apple   3
    cherry  2
    junkfood        2
    banana  1

The cheap tabulation looks really cheap here, so I'll take a look at that as well:

    multi sub MAIN ($file where $file.IO.f && $file.IO.r, :$csv = False)
    {
      line-counter($file.IO.lines.Bag, $csv);
    }

    multi sub MAIN (:$csv = False)
    {
      line-counter($*ARGFILES.lines.Bag, $csv);
    }

    sub line-counter (%input, $csv)
    {
      my $max = %input.keys>>.chars.max;                                     # [1]
      # [1a] «%input.keys»   [1b] «>>.chars»   [1c] «.max»

      for %input.keys.sort({ %input{$^b} <=> %input{$^a} || $^a cmp $^b })   # [2]
      # [2a] «%input{$^b} <=> %input{$^a}»   [2b] «$^a cmp $^b»
      {
        say $csv                                                             # [3]
          ?? "$_,%input{$_}"                                                 # [3a]
          !! "{ $_ }{ " " x ($max - .chars) } { %input{$_} }";               # [3b]
      }
    }

File: line-counter-csv-fixed

[1] The length of the longest identifier. We start with all the keys (1a), then apply «chars» on each element in the list (1b), and collapse it into a single value with «max» (1c). Note that >>. invokes a method on each element separately, and not on the whole list as a normal . call does.

[2] If the first comparison (2a) reports that the two counts are equal, we go on to the second one (2b), which sorts the tied keys alphabetically.

[3] If csv, use a comma (3a). If not, pad with spaces (3b).

See docs.perl6.org/routine/x for information about the String repetition operator «x», used here to get the padding.
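The three notes above can be tried out without any input file. This is a minimal sketch, assuming a hand-written hash %count with the same counts as example2.txt; the names %count, @sorted and @rows are mine, and the rows are built with plain concatenation instead of the interpolated string used in the program.

```raku
# Hypothetical counts, matching what example2.txt would give.
my %count = apple => 3, banana => 1, cherry => 2, junkfood => 2;

# Length of the longest key: «>>.chars» is applied to each key, «max» to the result.
my $max = %count.keys>>.chars.max;

# Largest count first; keys with equal counts fall through to alphabetical order.
my @sorted = %count.keys.sort({ %count{$^b} <=> %count{$^a} || $^a cmp $^b });

# Pad each key with spaces up to the longest one, then add the count.
my @rows = @sorted.map({ $_ ~ " " x ($max - .chars) ~ " " ~ %count{$_} });

.say for @rows;
# apple    3
# cherry   2
# junkfood 2
# banana   1
```

The fallback works because «<=>» returns «Same» for equal counts, which is false in Boolean context, so «||» moves on to the «cmp» part.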

Running it to show that it works as intended:

    $ raku line-counter-csv-fixed example2.txt
    apple    3
    cherry   2
    junkfood 2
    banana   1

    $ raku line-counter-csv-fixed example2.txt
    apple    3
    cherry   2
    junkfood 2
    banana   1

Challenge 32.2: ASCII bar chart

Contributed by Neil Bowers



Write a function that takes a hashref where the keys are labels and the values are integer or floating point values. Generate a bar graph of the data and display it to stdout.



The input could be something like:



    $data = { apple => 3, cherry => 2, banana => 1 };
    generate_bar_graph($data);

And would then generate something like this:



     apple | ############
    cherry | ########
    banana | ####

If you fancy then please try this as well: (a) the function could let you specify whether the chart should be ordered by (1) the labels, or (2) the values.

Let's dive straight in. The challenge gives part of the code, which works equally well in Perl and Raku:

    my $data = { apple => 3, cherry => 2, banana => 1 };

    generate_bar_graph($data);

    sub generate_bar_graph ($data)
    {
      my $max = %($data).keys>>.chars.max;

      for %($data).kv -> $label, $count
      {
        say "{ " " x ($max - $label.chars) }$label | { "#" x 4 * $count }";
      }
    }

File: abc-unsorted

Running it:

    $ raku abc-unsorted
    banana | ####
    cherry | ########
     apple | ############

    $ raku abc-unsorted
    banana | ####
     apple | ############
    cherry | ########

    $ raku abc-unsorted
    cherry | ########
    banana | ####
     apple | ############

Sorted output

The data is stored in a hash, so the output order is random. I didn't add sorting (as done in challenge 32.1) as this challenge asked for it as a bonus. But I'll do that now:

    unit sub MAIN (Str :$sort where $sort eq any("", "values", "labels") = ""); # [1]

    my $data = { apple => 3, cherry => 2, banana => 1 };

    generate_bar_graph($data, $sort);                                  # [2]

    sub generate_bar_graph ($data, $sort)                              # [2]
    {
      my $max  = %($data).keys>>.chars.max;                            # [3]
      my @keys = %($data).keys;                                        # [4]

      if $sort eq "values"                                             # [5]
      {
        @keys = @keys.sort({ %($data){$^b} cmp %($data){$^a} });
      }
      elsif $sort eq "labels"                                          # [6]
      {
        @keys = @keys.sort;
      }

      for @keys -> $label                                              # [7]
      {
        say "{ " " x ($max - $label.chars) }$label | { "#" x 4 * %($data){$label} }";
      }
    }

File: abc

[1] Use the named parameter «sort» to set the sort order. The values are «values» and «labels» (in addition to an empty string for random order). I use a junction ( any ) to list the legal values, and the assignment is the default value (which is used if we don't specify one on the command line).

[2] We pass on the sort parameter.

[3] The length of the longest identifier. See note [1] of «line-counter-csv-fixed» above for a detailed explanation.

[4] Get the keys, in random order. (And note that the line above could be simplified to my $max = @keys>>.chars.max; if we move it after this line.)

[5] Sort by values, if requested.

[6] Sort by labels, if requested.

[7] We use the length of the longest identifier to prefix the labels with spaces, giving a right justified column.
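The junction in note [1] can be tried outside a signature as well. A minimal sketch, where «is-legal-sort» is a hypothetical helper of mine (it is not part of the program) that mirrors the «where» clause:

```raku
# Hypothetical helper mirroring the «where» clause on the «sort» parameter.
sub is-legal-sort (Str $sort) {
    # «eq» against an «any» junction is true if at least one alternative matches;
    # «so» collapses the resulting junction to a single Boolean.
    so $sort eq any("", "values", "labels");
}

say is-legal-sort("values");   # True
say is-legal-sort("");         # True
say is-legal-sort("foo");      # False
```

In the program the same test sits in the signature, so an illegal value never reaches the code; MAIN fails to bind and we get the usage message instead.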

Running it:

    $ raku abc --sort=values
     apple | ############
    cherry | ########
    banana | ####

    $ raku abc --sort=values
     apple | ############
    cherry | ########
    banana | ####

    $ raku abc --sort=labels
     apple | ############
    banana | ####
    cherry | ########

    $ raku abc --sort=labels
     apple | ############
    banana | ####
    cherry | ########

    $ raku abc
    cherry | ########
    banana | ####
     apple | ############

    $ raku abc
    banana | ####
     apple | ############
    cherry | ########

    $ raku abc --sort=foo
    Usage:
      abc [--sort= ]

We should test floating point numbers:

File: abc-float (changes only):

    my $data = { apple => pi, cherry => e, banana => 0.3, junkfood => 0.6 };

Running it:

    $ raku abc-float
      cherry | ##########
      banana | #
       apple | ############
    junkfood | ##

«pi» and «e» are numerical constants. Try this in REPL mode if you want to see the actual values:

    $ raku
    To exit type 'exit' or '^D'
    > pi
    3.141592653589793
    > e
    2.718281828459045
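The bar lengths in the float run follow from how the repetition operator treats a fractional count: as far as I can tell it truncates the count to an integer, so 4 * pi (about 12.57) gives 12 characters and not 13. A minimal sketch with the same four values; the variable names are mine:

```raku
# The four values used in abc-float.
my @values = pi, e, 0.3, 0.6;

# «*» binds tighter than «x», so this is "#" x (4 * $_).
my @bars = @values.map({ "#" x 4 * $_ });

# The fractional part of the count is dropped, not rounded.
say @bars>>.chars.join(" ");    # 12 10 1 2
```

Rounding the count first (e.g. with «.round») would be an alternative if the truncation bothers you, but the challenge doesn't ask for it.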

And that's it.