Many Unix utilities, such as sed, awk and grep, provide powerful ways to manipulate text. But I always have to dig through man pages and tutorials before I can do anything with them.

This morning, I needed to remove all the empty lines from a text file. Searching for ways to do this with Unix tools turned up a few options:

awk 'NF' input.txt
sed -i '/^$/d' input.txt
grep -v '^$' input.txt

Remembering how to use these tools is always a challenge, so I decided to look at how to do this in Ruby. Ruby allows us to pass one-liner scripts from the command line, which lets us use it in the same way we would use awk.

Before we try replacing sed or awk with Ruby, let’s look at how we can run simple Ruby one-liners from the command line. For example:

$ ruby -e 'puts 42'
42

Running this prints “42” to the console, as you might have guessed. The -e flag tells Ruby to take the script from the command-line argument instead of from a file, so it executes puts 42.

Next, let’s look at the -n flag, which lets you pipe text into Ruby and run some code for each line of it.

$ echo 'foo' | ruby -n -e 'puts $_.upcase'
FOO

$_ is a special variable that holds the last line read by gets, which here means the last line read from STDIN. In this case, the script prints out ‘FOO’. This also works with multiple lines of input. Say we have a file foo.txt with the words foo, bar and baz, each on its own line:

$ touch foo.txt
$ echo 'foo' >> foo.txt
$ echo 'bar' >> foo.txt
$ echo 'baz' >> foo.txt
$ cat foo.txt
foo
bar
baz

And we want to print them in uppercase.

$ cat foo.txt | ruby -n -e 'puts $_.upcase'
FOO
BAR
BAZ

Here, the -n flag wraps the script in a loop that reads each line being piped in and puts it in $_. This is equivalent to:

while gets
  puts $_.upcase
end

There are other interesting things we could do with this. We could use BEGIN and END blocks to sort the lines in a file.

$ cat foo.txt | ruby -ne 'BEGIN{ $x=[]}; $x << $_.chomp; END { puts $x.sort }'
bar
baz
foo

The BEGIN block is executed once, before Ruby starts processing the lines, so we use it to initialize a global variable that will hold them. The $x << $_.chomp statement runs for every line and appends it, without its trailing newline, to the array. The END block is executed once, after all lines have been processed, and prints the sorted array.
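For comparison, here is a minimal sketch of the same collect-then-sort idea as an ordinary Ruby method rather than a one-liner. The method name sorted_lines is hypothetical, and a StringIO stands in for foo.txt so the snippet runs on its own.

```ruby
require 'stringio'

# Hypothetical helper: reads every line from an IO-like object,
# strips the trailing newline, and returns the lines in sorted order.
def sorted_lines(io)
  lines = []                                   # same job as BEGIN { $x = [] }
  io.each_line { |line| lines << line.chomp }  # same job as $x << $_.chomp
  lines.sort                                   # the sort done inside END { ... }
end

# StringIO stands in for the contents of foo.txt.
puts sorted_lines(StringIO.new("foo\nbar\nbaz\n"))
```

A local array replaces the global $x here, because a method can keep its state between iterations without needing BEGIN and END blocks.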

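Now, let’s look at the -a flag, which splits each input line into fields and stores them in an array, $F. By default it splits on whitespace; the -F flag sets a different field separator. If we put the following text in a file: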

$ touch matz.txt
$ echo 'matz:ruby' >> matz.txt
$ echo 'guido:python' >> matz.txt
$ echo 'brendan:js' >> matz.txt
$ echo 'jose:elixir' >> matz.txt
$ cat matz.txt
matz:ruby
guido:python
brendan:js
jose:elixir

and we need to extract the programming language names, we could do it like this:

$ cat matz.txt | ruby -a -F: -ne 'puts $F[1]'
ruby
python
js
elixir
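The field splitting that -a and -F perform can be sketched in plain Ruby with String#split. This is only an illustration: fields_at is a hypothetical helper name, and a StringIO stands in for matz.txt.

```ruby
require 'stringio'

# Hypothetical helper: returns the field at `index` from each line of `io`,
# splitting on `separator` much as -a with -F: fills $F.
# Unlike -a, it chomps each line first, so the last field has no newline.
def fields_at(io, index, separator = ':')
  io.each_line.map { |line| line.chomp.split(separator)[index] }
end

# StringIO stands in for the contents of matz.txt.
input = StringIO.new("matz:ruby\nguido:python\nbrendan:js\njose:elixir\n")
puts fields_at(input, 1)
```

Passing 0 as the index would return the names instead of the languages.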

That finally brings me to the original problem I was trying to solve: removing empty lines from a text file.

$ touch empty_lines.txt
$ echo 'lorem ipsum' >> empty_lines.txt
$ echo '' >> empty_lines.txt
$ echo 'lorem ipsum' >> empty_lines.txt
$ echo '' >> empty_lines.txt
$ echo 'lorem ipsum' >> empty_lines.txt
$ echo '' >> empty_lines.txt
$ echo 'lorem ipsum' >> empty_lines.txt
$ cat empty_lines.txt
lorem ipsum

lorem ipsum

lorem ipsum

lorem ipsum

And now, all we need to do to remove those empty lines is:

$ cat empty_lines.txt | ruby -ne 'puts $_ unless $_.chomp.empty?'
lorem ipsum
lorem ipsum
lorem ipsum
lorem ipsum

And if we want to write the result to a file, we can redirect the output. Note that >> appends to out.txt, creating it if it does not exist yet:

$ cat empty_lines.txt | ruby -ne 'puts $_ unless $_.chomp.empty?' >> out.txt
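If the filter ever needs to live in a script instead of a one-liner, the same check can be sketched as a small method. The name non_empty_lines is hypothetical, and a StringIO stands in for empty_lines.txt.

```ruby
require 'stringio'

# Hypothetical helper: returns the lines of `io` that are not empty,
# using the same chomp.empty? test as the one-liner.
# Each returned line keeps its trailing newline.
def non_empty_lines(io)
  io.each_line.reject { |line| line.chomp.empty? }
end

# StringIO stands in for the contents of empty_lines.txt.
input = StringIO.new("lorem ipsum\n\nlorem ipsum\n\nlorem ipsum\n")
print non_empty_lines(input).join
```

Like the sed and grep commands at the top, this keeps lines that contain only whitespace. Testing line.strip.empty? instead would drop those too, which is closer to what awk 'NF' does.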

Although special-purpose tools like awk are very powerful, we can still use Ruby as a Unix utility if we want to.