Doing the workshop led me to think about minimizing shell builtins. One question that comes up a lot is why cd needs to be a builtin; what doesn't come up until one is much deeper into pipelines and job control is what a pain builtins are in how they interact with the rest of the shell's features. It would be nice to get rid of them.

There are some commands that are builtins only to make them fast, like echo, true, and false. These usually have equivalents in /bin already.

Some builtins are required because they modify the shell's own environment: cd, exit, fg, bg, jobs, exec, wait, ulimit. (This excludes really tricky, impractical things, like using shared memory, process_vm_writev, or ptrace to modify the shell from an outside process.)

To prove a point, you could take functional programming to an extreme and have an immutable shell where cd executes a new shell in the chosen directory, but some of the others are probably not possible in the presence of typical job control.
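A rough sketch of that immutable style, assuming a hypothetical external command called chdir (the name, and the use of SHELL to find the shell to start, are my own assumptions). It changes directory in its own process and then replaces itself with a fresh shell, so the calling shell is never modified:

```shell
#!/bin/sh
# Sketch only: install a hypothetical external "chdir" in a scratch directory.
bin=$(mktemp -d)
cat > "$bin/chdir" <<'EOF'
#!/bin/sh
# chdir DIR: change directory in this process, then replace this process
# with a fresh shell. The caller's shell is left untouched underneath.
cd "$1" || exit 1
exec "${SHELL:-/bin/sh}"
EOF
chmod +x "$bin/chdir"
PATH=$bin:$PATH

# Non-interactive demonstration: pretend "pwd" is the shell, so the "new
# shell" simply reports the directory it was started in.
SHELL=pwd chdir /
```

Used interactively, every chdir nests one shell deeper, and exiting that shell drops you back into the previous directory.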

If we take this line of thought further, we can try externalizing some of the shell's operators. Conditional execution is interesting. How about && and ||? Syntactically, we probably can't pull these off as external commands, but we could provide commands and and or which take commands to execute.
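Here is a minimal sketch of what such commands could look like. The calling convention is an assumption of mine: the first argument is a command string handed to sh -c, and the remaining arguments are the command to run depending on its exit status.

```shell
#!/bin/sh
# Sketch only: hypothetical external "and" and "or" commands.
bin=$(mktemp -d)

cat > "$bin/and" <<'EOF'
#!/bin/sh
# and 'CONDITION' COMMAND [ARGS...]
# Run CONDITION; if it fails, exit with its status, otherwise become COMMAND.
cond=$1; shift
sh -c "$cond" || exit
exec "$@"
EOF

cat > "$bin/or" <<'EOF'
#!/bin/sh
# or 'CONDITION' COMMAND [ARGS...]
# Run CONDITION; if it succeeds, exit successfully, otherwise become COMMAND.
cond=$1; shift
sh -c "$cond" && exit
exec "$@"
EOF

chmod +x "$bin/and" "$bin/or"
PATH=$bin:$PATH

and 'test -d /' echo "and: ran"             # like: test -d / && echo ...
or 'test -d /nonexistent' echo "or: ran"    # like: test -d /nonexistent || echo ...
```

Passing the condition as a single quoted string keeps the argument parsing trivial, at the cost of pushing the quoting burden onto the caller.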

Implementing if is an obvious next step from and and or. Now we can implement while, although we'd have to be careful about how we handle the environment if we wanted to handle many typical uses of while.
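A sketch of both, under the same assumed convention of passing command strings. The names ifelse and whilst are hypothetical, chosen because if and while are reserved words that a conventional shell would never look up on PATH. The loop shows the environment problem directly: each iteration is a new process, so any state has to live somewhere outside the environment, here in a file.

```shell
#!/bin/sh
# Sketch only: hypothetical external "ifelse" and "whilst" commands.
bin=$(mktemp -d)

cat > "$bin/ifelse" <<'EOF'
#!/bin/sh
# ifelse 'CONDITION' 'THEN' 'ELSE': all three are command strings.
# If CONDITION succeeds we become THEN and never reach the last line.
sh -c "$1" && exec sh -c "$2"
exec sh -c "$3"
EOF

cat > "$bin/whilst" <<'EOF'
#!/bin/sh
# whilst 'CONDITION' COMMAND [ARGS...]
# Repeat COMMAND for as long as CONDITION succeeds; stop if COMMAND fails.
cond=$1; shift
while sh -c "$cond"; do "$@" || exit; done
EOF

chmod +x "$bin/ifelse" "$bin/whilst"
PATH=$bin:$PATH

ifelse 'test -d /' 'echo "is a directory"' 'echo "not a directory"'

# Loop until the scratch file holds three lines.
log=$bin/log
: > "$log"
whilst "test \$(wc -l < '$log') -lt 3" sh -c "echo tick >> '$log'"
cat "$log"
```

The whilst script uses the shell's own while internally only because the sketch happens to be written in sh; the same program could be written in C with fork, exec, and wait.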

The for loop almost already exists in this form, as xargs. We would probably want to provide both a sequential version, where the environment for each iteration depends on the previous, and a parallel version where everything can run at the same time.
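For illustration, this is roughly how xargs already covers the loop-over-items case. Note that xargs runs each invocation independently, so it does not give you the environment-carrying sequential version described above; and the -P option used for the parallel form is an extension found in GNU and BSD xargs, which I'm assuming is available.

```shell
#!/bin/sh
# Sequential: one invocation of "echo item" per input item, in input order.
printf '%s\n' alpha beta gamma | xargs -n 1 echo item

# Parallel: up to three invocations at once. Completion order is not
# guaranteed, so the output is sorted here purely for display.
printf '%s\n' alpha beta gamma | xargs -n 1 -P 3 echo item | sort
```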

Note that most of these approaches require that you have mechanisms for escaping that aren't too cumbersome, for them to be practical. There seems to be a close parallel with macro facilities in languages like Lisp.

At the extreme end of cumbersome quoting would be case, which would probably need to take its input from a heredoc.

I was originally going to write a proof of concept of this (called "builtouts"), but researching this led me to the intriguing execline "shell", which has already done this, and explored this space rather nicely.

One thing that execline doesn't seem to do is implement something resembling real job control. If bg executes a command without waiting and then re-executes the shell with a suitable variable set (to the PGID of this job), the shell on each execution can check this variable to see what jobs are still alive. The jobs command can then print the contents of this variable, and the fg command just becomes tcsetpgrp and wait with the PGID of the current job. For an interactive shell, the tricky thing is probably making sure that bg's children don't end up in an orphaned process group.

A lot of these programs end up having to deal with quoting. Is there a way to take this further and handle quoting in its own program? For fixed-arity programs (like if), we can imagine an unquote helper that calls a subsidiary program with, first, the fixed remaining arguments, and then the words of the original quoted argument, expanded, as the remaining arguments.
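A minimal sketch of such an unquote helper, assuming the quoted argument comes first and follows ordinary shell quoting rules (both assumptions of mine). It leans on eval to do the re-parsing, which means anything embedded in the quoted argument gets executed, so treat it as an illustration rather than something to deploy.

```shell
#!/bin/sh
# Sketch only: a hypothetical external "unquote" helper.
bin=$(mktemp -d)

cat > "$bin/unquote" <<'EOF'
#!/bin/sh
# unquote 'WORDS' PROGRAM [FIXED-ARGS...]
# Re-parse WORDS using shell quoting rules, then become PROGRAM with the
# fixed arguments first and the resulting words after them.
# Caution: eval runs any command substitutions embedded in WORDS.
words=$1; shift
eval "exec \"\$@\" $words"
EOF

chmod +x "$bin/unquote"
PATH=$bin:$PATH

# One quoted argument becomes two separate arguments to printf.
unquote "'two words' three" printf '<%s>\n'
```

The fixed arguments are passed through as "$@" and are never re-parsed; only the single quoted argument goes through eval.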

As glob(7) notes:

Long ago, in UNIX V6, there was a program /etc/glob that would expand wildcard patterns. Soon afterward this became a shell built-in.

Luckily, the source is available in Diomidis Spinellis's unix-history-repo, and we can see that it does this same kind of chain loading, executing its first argument with the rest of its arguments expanded according to the globbing rules.
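To make the chain-loading idea concrete, here is a small sketch of a glob-like command in modern sh. It is not a port of the V6 program, just my own approximation of the same shape: it expands each pattern argument and then executes its first argument with the results. A pattern that matches nothing is passed through literally, which is ordinary sh behavior and may differ from the original.

```shell
#!/bin/sh
# Sketch only: a hypothetical chain-loading "glob" command.
bin=$(mktemp -d)

cat > "$bin/glob" <<'EOF'
#!/bin/sh
# glob COMMAND [PATTERN...]
cmd=${1:?usage: glob COMMAND [PATTERN...]}
shift
n=$#
IFS=                    # no field splitting: unquoted $pat is only globbed
for pat do
    set -- "$@" $pat    # append this pattern's matches (or the pattern itself)
done
shift "$n"              # drop the original, unexpanded patterns
exec "$cmd" "$@"
EOF

chmod +x "$bin/glob"
PATH=$bin:$PATH

# Demonstration in a scratch directory; the patterns are quoted so that the
# calling shell does not expand them first.
work=$(mktemp -d)
touch "$work/a.txt" "$work/b.txt" "$work/notes.md"
(cd "$work" && glob echo '*.txt' '*.md')
```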

I especially enjoy the extremely primitive path search and shell script support.