Posted 13 January, 2012 by opensas in Bash, Python. Tagged: Bash, Python, Scala, scripting.

I’ve received a good amount of positive feedback on my previous article on Scala.

A couple of readers preferred the Bash one-liner version, and many of them argued that for such a simple task a Bash or Python script was preferable. Luckily, all of them understood that this was just an excuse (maybe a lousy one, I admit) to give Scala a try and to talk a little bit about functional programming, type inference, interacting with Java, higher-order functions and, well, Scala itself.

Nevertheless, to do justice to Bash and Python, I took some advice from the discussion at Hacker News and, even though I’m no Bash or Python expert, with some googling around I managed to reproduce the functionality of the Scala script.

Well, here’s the bash version:

    #!/bin/bash

    # Total size in KB of all .textile files (the last line of du is the grand total)
    total_size=$(du --summarize *.textile --total | tail -n 1 | cut -f 1)

    # grep -L lists the files that do NOT contain the "not yet translated" marker
    translated_files=$(grep -L "Esta página todavía no ha sido traducida al castellano" *.textile)
    # Quote the variable so the newlines survive and xargs -0 gets one file name per item
    translated_size=$(echo "$translated_files" | tr '\n' '\0' | xargs -0 du --summarize --total | tail -n 1 | cut -f 1)
    translated_percent=$(($translated_size*100/$total_size))

    echo "translated size: ${translated_size}kb/${total_size}kb ${translated_percent}% \
    (pending $(($total_size-$translated_size))kb $((100-$translated_percent))%)"

    total_count=$(ls *.textile | wc -l)
    # Unquoted echo joins the names with spaces; tr puts them back one per line for wc
    translated_count=$(echo $translated_files | tr ' ' '\n' | wc -l)
    translated_percent=$(($translated_count*100/$total_count))

    echo "translated files: ${translated_count}/${total_count} $(($translated_count*100/$total_count))% \
    (pending $(($total_count-$translated_count)) $((100-$translated_percent))%)"

I just had to read a couple of man pages and struggle a little bit with tr, wc, xargs, tail, cut and that sort of stuff.

And here’s the Python version:

    #! /usr/bin/env python
    # -*- coding: utf-8 -*-

    import fnmatch
    import os

    total_files = [file for file in os.listdir('.') if fnmatch.fnmatch(file, '*.textile')]
    translated_files = [file for file in total_files
      if "Esta página todavía no ha sido traducida al castellano" not in open(file).read()]

    total_size = sum([os.path.getsize(file) for file in total_files]) / 1000
    translated_size = sum([os.path.getsize(file) for file in translated_files]) / 1000
    translated_percent= translated_size * 100 / total_size

    print "translated size: %dkb/%dkb %d%% (pending %dkb %d%%)" % \
      (translated_size, total_size, translated_percent, total_size-translated_size, 100-translated_percent)

    total_count=len(total_files)
    translated_count=len(translated_files)
    translated_percent= translated_count * 100 / total_count

    print "translated files: %d/%d %d%% (pending %d %d%%)" % \
      (translated_count, total_count, translated_percent, total_count-translated_count, 100-translated_percent)

What else can I say? The Python version was really easy.
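The script above targets Python 2 (print statement, integer "/" division). For readers on Python 3, here is a minimal sketch of the same logic. It is an assumed port, not the original script: the names translation_status, percent and MARKER are made up for illustration, and the explicit UTF-8 encoding is my own assumption about how the .textile files are stored.

```python
#!/usr/bin/env python3
"""Python 3 sketch of the translation-status report (an assumed port)."""
import fnmatch
import os

# Marker text that untranslated pages contain (taken from the scripts above).
MARKER = "Esta página todavía no ha sido traducida al castellano"


def translation_status(directory="."):
    """Return (translated_kb, total_kb, translated_count, total_count)."""
    total_files = [
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if fnmatch.fnmatch(name, "*.textile")
    ]
    translated_files = []
    for path in total_files:
        # Assumption: files are UTF-8, so the accented marker matches on any locale.
        with open(path, encoding="utf-8") as handle:
            if MARKER not in handle.read():
                translated_files.append(path)

    # "//" keeps the integer division that "/" performed in Python 2.
    total_kb = sum(os.path.getsize(p) for p in total_files) // 1000
    translated_kb = sum(os.path.getsize(p) for p in translated_files) // 1000
    return translated_kb, total_kb, len(translated_files), len(total_files)


def percent(part, whole):
    """Integer percentage, returning 0 when there is nothing to measure."""
    return part * 100 // whole if whole else 0


if __name__ == "__main__":
    done_kb, all_kb, done_n, all_n = translation_status()
    size_pct = percent(done_kb, all_kb)
    count_pct = percent(done_n, all_n)
    print("translated size: %dkb/%dkb %d%% (pending %dkb %d%%)"
          % (done_kb, all_kb, size_pct, all_kb - done_kb, 100 - size_pct))
    print("translated files: %d/%d %d%% (pending %d %d%%)"
          % (done_n, all_n, count_pct, all_n - done_n, 100 - count_pct))
```

Wrapping the logic in a function is only a convenience for trying it against a directory other than the current one; the report format is kept identical to the scripts above.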

Scala, Bash and Python… FIGHT!

Well, now let’s see the output of each version:

    sas@ubuntu:~/devel/apps/playdoces/documentation/1.2.4/manual$ ./status.scala
    translated size: 407kb/624kb 65% (pending 217kb 35%)
    translated files: 37/64 57% (pending 27 43%)

    sas@ubuntu:~/devel/apps/playdoces/documentation/1.2.4/manual$ ./status.sh
    translated size: 476kb/752KB 63% (pending 276kb 37%)
    translated files: 37/64 57% (pending 27 43%)

    sas@ubuntu:~/devel/apps/playdoces/documentation/1.2.4/manual$ ./status.py
    translated size: 407kb/624kb 65% (pending 217kb 35%)
    translated files: 37/64 57% (pending 27 43%)

It seems that du rounds up each file’s size (it reports the disk space allocated to a file rather than its exact length in bytes), but apart from that everything works as expected.

While the Scala version does have a startup penalty, with the savecompiled option turned on the delay is pretty bearable (without it, compiling takes a little less than two seconds). Moreover, with long-running or more complex tasks, I suspect that the benefits of a compiled script and the performance optimizations of the JVM would show up.
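To see where the difference in sizes comes from, here is a small Python 3 sketch that compares the two numbers for a single file: the byte length that os.path.getsize returns, and the space actually allocated on disk, which is what du adds up by default. The function name apparent_and_allocated is hypothetical, and the 512-byte unit for st_blocks is the one the Python documentation describes; the field is not available on every platform, hence the fallback.

```python
import os
import tempfile


def apparent_and_allocated(path):
    """Return (apparent_bytes, allocated_bytes) for one file."""
    info = os.stat(path)
    # st_size is the byte length that os.path.getsize() reports.
    apparent = info.st_size
    # st_blocks counts 512-byte units allocated on disk (Unix only);
    # fall back to the apparent size where the field is missing, e.g. on Windows.
    blocks = getattr(info, "st_blocks", None)
    allocated = blocks * 512 if blocks is not None else apparent
    return apparent, allocated


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(suffix=".textile", delete=False) as tmp:
        tmp.write(b"x" * 10)  # a 10-byte file
    try:
        apparent, allocated = apparent_and_allocated(tmp.name)
        print("apparent: %d bytes, allocated: %d bytes" % (apparent, allocated))
    finally:
        os.remove(tmp.name)
```

On most filesystems a tiny file still occupies a whole block, so the allocated figure is usually larger than the apparent one. Note also that GNU du counts in units of 1024 bytes by default, while the Python script divides by 1000, so the two totals would not match exactly even without the rounding.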

Here are some figures to compare.

    sas@ubuntu:~/devel/apps/playdoces/documentation/1.2.4/manual$ time ./status.scala
    translated size: 407kb/624kb 65% (pending 217kb 35%)
    translated files: 37/64 57% (pending 27 43%)

    real    0m0.475s
    user    0m0.388s
    sys     0m0.056s

    sas@ubuntu:~/devel/apps/playdoces/documentation/1.2.4/manual$ time ./status.sh
    translated size: 476kb/752KB 63% (pending 276kb 37%)
    translated files: 37/64 57% (pending 27 43%)

    real    0m0.045s
    user    0m0.004s
    sys     0m0.008s

    sas@ubuntu:~/devel/apps/playdoces/documentation/1.2.4/manual$ time ./status.py
    translated size: 407kb/624kb 65% (pending 217kb 35%)
    translated files: 37/64 57% (pending 27 43%)

    real    0m0.039s
    user    0m0.020s
    sys     0m0.012s
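A single run of time is fairly noisy at these scales (disk cache, whatever else the machine is doing), so the figures above are only a rough guide. Here is a minimal Python 3 sketch for repeating the measurement a few times; the helper name time_command is made up for illustration, and the command it times by default is just a stand-in so the sketch runs anywhere.

```python
import subprocess
import sys
import time


def time_command(command, runs=5):
    """Run command `runs` times and return the list of wall-clock seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output so terminal I/O does not distort the measurement.
        subprocess.run(command, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Stand-in command; replace it with ["./status.py"], ["./status.sh"]
    # or ["./status.scala"] to compare the three scripts.
    command = [sys.executable, "-c", "pass"]
    timings = time_command(command)
    print("best: %.3fs  mean: %.3fs" % (min(timings), sum(timings) / len(timings)))
```

This measures wall-clock time only, which corresponds to the "real" line in the output of time, and it includes the cost of starting each process.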

After playing a bit with all three of them, for this kind of task I’d definitely go with Python. It’s really a joy to use, it has great documentation, and there’s a lot of useful information on Stack Overflow. Moreover, like the Scala version and unlike the Bash one, it is portable across platforms; I haven’t tried it, but it should work just fine on Windows.

Nevertheless, I expect to keep playing with Scala, for learning purposes and just to have some fun…

In the next article, I give Scala another chance and, at the same time, have a look at implicit conversions, Scala’s answer to Ruby’s open classes.