A Unix Shell in Ruby - Part 3: A Login Shell and the PATH

Published on March 10, 2012 by Jesse Storimer

Previously, in this series, we saw how to implement some shell built-ins that are really necessary for the shell to function. This time, we'll make sure our shell can be used as a login shell and see how to interact with environment variables.

First, A Word of Warning

In this article I'm going to set up shirt as the default login shell for my system to show what will happen when it's treated like every other shell. Be wary of doing this yourself. I definitely encourage you to follow along at home, but make sure you read through the article first so you know what to expect and don't end up with a broken shell :)

Changing the Login Shell

Each user can specify which shell they want to use when opening a new terminal or logging in to the system.

Changing the login shell can be done with the chsh(1) command. On OS X it can also be changed through the GUI. Let's change my login shell to shirt.

$ sudo chsh -s /path/to/shirt `whoami`

If I do that and launch a new terminal it dies immediately :/ So it appears that something didn't work.

With a bit of digging I found that this is due to the 'shebang' line at the beginning of the program. It tells the system to use the version of ruby specified by env(1). The env(1) command has everything to do with environment variables. Let's talk about those.

Environment Variables

Environment variables are a process-generic method of sharing data. I say process-generic because most any programming language has a way of reading and writing environment variables. I'll be referring to the collection of current environment variables as 'the environment'.
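To make that concrete, here's a minimal sketch of reading and writing the environment from Ruby. The variable names SHIRT_DEMO and SHIRT_MISSING are made up for this example; nothing in shirt depends on them.

```ruby
# ENV behaves much like a Hash with String keys and String values.
# SHIRT_DEMO and SHIRT_MISSING are hypothetical names used only here.
ENV['SHIRT_DEMO'] = 'hello'

puts ENV['SHIRT_DEMO']                        # read a value back
puts ENV.fetch('SHIRT_MISSING', 'fallback')   # supply a default for an unset variable
puts ENV.key?('SHIRT_DEMO')                   # check whether a variable is set

# Deleting a key removes it from this process's environment.
ENV.delete('SHIRT_DEMO')
puts ENV['SHIRT_DEMO'].inspect                # unset variables read as nil
```

Values are always strings, so anything numeric needs an explicit conversion on the way out.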

We'll talk more about the environment in a bit. What we care about right now is why shirt died when we booted it as a login shell.

One of the most important environment variables is called PATH. The PATH is an ordered list of directories specifying where the shell and the system calls should look for programs to execute. For example, a common PATH looks like this:

$ echo $PATH
/usr/local/bin:/usr/bin:/bin

Each directory is separated by a colon. This PATH tells the system to first look for programs in /usr/local/bin, then, if it finds nothing there, look in /usr/bin, then /bin. In this way you can have a custom install of vim in /usr/local/bin that would be favoured over the system default vim in /usr/bin.
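The lookup described above can be sketched in a few lines of Ruby. This is only an illustration of the idea, not how the system actually implements it, and find_in_path is a name I made up for the example.

```ruby
# Return the full path of the first executable named `program` found in
# the colon-separated `path`, or nil if none of the directories has one.
# find_in_path is a hypothetical helper, not part of shirt.
def find_in_path(program, path = ENV['PATH'])
  path.to_s.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty? # skip empty entries to keep the sketch simple

    candidate = File.join(dir, program)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end

  nil
end

# Directories are tried in order, so earlier entries win.
puts find_in_path('ls').inspect
```

Because the search stops at the first match, putting /usr/local/bin ahead of /usr/bin is all it takes for a custom install to shadow the system one.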

How the PATH Broke Our Login Shell

When a login shell boots it has no customization. The system default, when booting a new login shell, is to set the PATH to the following:

/usr/bin:/bin:

So when shirt boots in this way the shebang line looks in the environment to find ruby. With this PATH the only version it will find is the system default at /usr/bin/ruby. Since I'm on OS X the system default is MRI 1.8.7. So shirt is being booted with this ruby.

In my everyday shell I have the rbenv shims at the start of my PATH so that when I run the ruby command it's always handled by rbenv.

As a responsible Ruby developer I have my global rbenv version set to MRI 1.9.3. Let's simulate what's happening with the login shell by trying to boot shirt with the system ruby:

$ /usr/bin/ruby shirt
shirt:5: syntax error, unexpected '=', expecting '|'
  'cd' => lambda { |dir = ENV["HOME"]| Dir.chdir(dir) },
                        ^
shirt:5: syntax error, unexpected '}', expecting $end
  'cd' => lambda { |dir = ENV["HOME"]| Dir.chdir(dir) },

Aha. So we're using some syntax that is only supported in Ruby 1.9. I'm OK with that, but that means we'll need to have 1.9 installed as the system default in order for shirt to be a login shell. Rather than actually building Ruby and replacing the version at /usr/bin/ruby I'm just going to move that one out of the way for now and symlink in my rbenv ruby.

$ mv /usr/bin/ruby /usr/bin/ruby.orig
$ ln -s `rbenv which ruby` /usr/bin/ruby

Now if I open a new shell it works!

Back on the Path

If you actually try to use this as your login shell you'll probably be frustrated, because there are many things that still aren't implemented. Let's continue working with the PATH.

Hypothetically, the first thing I need to do in this shell session is work with redis-cli. I installed redis through homebrew, so that program was installed into /usr/local/bin. If I try running redis-cli from inside the shirt login shell I get an error saying that it couldn't be found.

I can use the env(1) command to inspect the current environment variables and I see that my PATH, as expected, doesn't include /usr/local/bin. Hence it can't find my redis-cli. So we need a way to set environment variables.

Since this will need to change the state of the shell itself, we'll implement it as a builtin.

Like Bash?

How does bash handle this? It uses a command called export(1). Here's how we'd add a custom value to the PATH in bash.

$ export PATH=/usr/local/bin:$PATH

export(1) is telling bash to set the variable PATH to the previous value of PATH (specified by $PATH) prepended with /usr/local/bin.

Personally, I think export is a little unclear as far as names go, so we'll use set to specify changes to environment variables shell-wide.

Setting Environment Variables

Here's the implementation for shirt:

BUILTINS = {
  ...
  'set' => lambda { |args|
    key, value = args.split('=')
    ENV[key] = value
  }
}

Notice how we simply set a key on the ENV constant? ENV is the Ruby interface to environment variables for the current process.

It's a pretty naive approach, but it works. When using the set command it simply splits the first argument on the '=' character. The first part is assumed to be the variable name, the second part is considered to be its value.
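One place the naive split shows its limits is a value that itself contains an '=' character. Here's a small sketch of the difference; parse_assignment is a hypothetical helper I'm using for illustration, not something that exists in shirt.

```ruby
# Splitting on every '=' silently drops part of the value.
key, value = 'OPTS=--level=3'.split('=')
puts "#{key} -> #{value}"

# Passing a limit of 2 splits only on the first '=', so the rest of the
# string stays in the value. parse_assignment is a made-up name.
def parse_assignment(arg)
  key, value = arg.split('=', 2)
  [key, value.to_s] # to_s turns a missing value (no '=' at all) into ''
end

key, value = parse_assignment('OPTS=--level=3')
puts "#{key} -> #{value}"
```

The one-line change to the builtin would be swapping `args.split('=')` for `args.split('=', 2)`.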

Now issue the following commands to change and inspect the PATH for our shell:

-> set PATH=/usr/local/bin:/usr/bin:/bin
-> env

Notice that I specified the PATH in full, rather than referring to the existing value ($PATH) as we did with bash. shirt doesn't yet have a way to substitute variables (or substitute other things a la backticks) into the command string. We'll tackle that problem all at once in another article.

Now when I try to run my redis-cli from /usr/local/bin it works just fine!

Connecting the Dots

This comes full circle when we take into account that 1) a child process inherits the environment of its parent, and 2) the environment is preserved through an exec.

That's how come shirt was able to work previously and find programs in /usr/local/bin. I had added /usr/local/bin to my PATH as part of my .bashrc file, so when I would exec or just launch an instance of shirt it would inherit that environment, along with the customized PATH.

Similarly, when shirt forks a process to exec a command, the environment (and consequently the PATH) is passed all the way down. This is how exec finds which program it should become when you pass it a program that's not at an absolute path.
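Both of those points can be seen in a short experiment. This is a sketch for a Unix-like system with sh on the PATH; SHIRT_DEMO and the method name are made up for the example.

```ruby
# Fork a child, exec a different program in it, and report what that
# program sees for the hypothetical SHIRT_DEMO variable.
def child_view_of_demo_variable
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    $stdout.reopen(writer) # send the child's stdout into the pipe
    # After exec the child is running sh, not Ruby, yet the environment
    # it inherited across the fork is still there.
    exec 'sh', '-c', 'echo "$SHIRT_DEMO"'
  end

  writer.close
  output = reader.read
  Process.wait(pid)
  output.strip
end

ENV['SHIRT_DEMO'] = 'set by the parent'
puts child_view_of_demo_variable
```

Note that the exec call gives just 'sh' rather than an absolute path, so it is also relying on the inherited PATH to find the program.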

'Local' Environment Variables

The set command we implemented can change environment variables shell-wide, but what about when you want to change an environment variable just for the invocation of one command? This is very common (as you'll see from the examples below) and we can actually do it without changing shirt at all.

# Here's how bash does it
$ RAILS_ENV=staging rails server
$ VERBOSE=1 make install

# Here's how shirt does it (also works in bash)
-> env RAILS_ENV=staging rails server
-> env VERBOSE=1 make install

The env(1) command is available on most systems and fits this purpose exactly. bash has that nice shorthand, but I don't feel the need to add that extra parsing code right now when the env(1) command does it so nicely.
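For comparison, Ruby can do the same thing from inside a program: Kernel#spawn (like exec and system) accepts a hash of environment variables as its first argument, and those apply only to the child. The sketch below is just an illustration of that idea; run_with_env and SHIRT_LOCAL_DEMO are names I made up, and shirt itself doesn't do this.

```ruby
# Run `command` with some extra environment variables, leaving the
# environment of the current process untouched.
# run_with_env is a hypothetical helper, not part of shirt.
def run_with_env(overrides, *command)
  pid = spawn(overrides, *command)
  Process.wait(pid)
  $?.exitstatus
end

# The child sees SHIRT_LOCAL_DEMO; the test(1) call exits 0 if it matches.
status = run_with_env({ 'SHIRT_LOCAL_DEMO' => '1' },
                      'sh', '-c', 'test "$SHIRT_LOCAL_DEMO" = 1')
puts status

# Back in the parent the variable was never set.
puts ENV['SHIRT_LOCAL_DEMO'].inspect
```

This is the same separation env(1) gives us on the command line: the variable exists for one invocation and nothing leaks back into the shell.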

More Environment Variables

Now that we have a facility for changing environment variables we can add some more customization to our shell. The prompt, for instance.

...
ENV['PROMPT'] = '-> '

loop do
  $stdout.print ENV['PROMPT']
  ...

Now the prompt can be customized simply by setting the PROMPT environment variable. It's still primitive in that you can't dynamically use commands in the prompt line, but that will be added later when we have a more general way of evaluating command strings.

A Refactor

Before we end this article I want to do some refactoring: I'd like to keep everything in one file for now so the source is easy to read and easy to follow, but the problem is that we now have a bunch of boilerplate code before the actual meat of the program.

We first define all of our builtins, as well as setting up a default environment variable, before we actually show the logic of the program. Let's reverse those two. Here's shirt after the refactor.

The notable changes are that the program logic moved to the top of the file into a method called main. This method gets invoked on the last line of the file.

#!/usr/bin/env ruby

require 'shellwords'

def main
  loop do
    $stdout.print ENV['PROMPT']
    line = $stdin.gets.strip
    command, *arguments = Shellwords.shellsplit(line)

    if BUILTINS[command]
      BUILTINS[command].call(*arguments)
    else
      pid = fork {
        exec line
      }

      Process.wait pid
    end
  end
end

BUILTINS = {
  'cd' => lambda { |dir = ENV["HOME"]| Dir.chdir(dir) },
  'exit' => lambda { |code = 0| exit(code.to_i) },
  'exec' => lambda { |*command| exec *command },
  'set' => lambda { |args|
    key, value = args.split('=')
    ENV[key] = value
  }
}

ENV['PROMPT'] = '-> '

main

As always, the source is on GitHub.

In part 4 we'll finish up with some search path stuff, then I'll show some pretty peculiar behaviour which will lead to implementing pipes in Ruby! Don't miss it!