blog | oilshell.org

Release of OSH 0.6.pre2

This is the latest version of OSH, a bash-compatible shell:

Please try it on your shell scripts and report bugs! To build and run it, follow the instructions in INSTALL.txt.

If you're new to the project, see Why Create a New Shell?. OSH can run unmodified shell scripts that are thousands of lines long, as described in the announcement for OSH 0.4.

Although I released version 0.6.pre1 just a few days ago, I'm making another release because Unicode support is basically done, and the binary is significantly smaller. I continue to chip away at the problem of OSH being too big and too slow.

The release process involves an array of tests and benchmarks, which usually catch bugs. This release was no exception.

The following sections summarize the raw changelog.

Visible Changes

OSH now respects UTF-8 in two places:

1. The string length operator ${#s} counts code points, not bytes.

2. String slicing like ${s:1:3} also counts code points, not bytes.

In addition to implementing #2, I cleaned up related error conditions:

UTF-8 decode errors are fatal under set -o strict-word-eval, an OSH-specific mode. (TODO: Write a blog post about the bad behavior of bash and zsh in this situation, under #shell-the-bad-parts)

String slices with negative arguments like ${s: -1 : -3} are disallowed. Another candidate for #shell-the-bad-parts: the second argument to slicing is a length if 0 or positive, but a position if negative!

set -o strict-arith was re-enabled. In this mode, an expression like $((s+5)) will fail if s is an invalid string like abc, rather than coercing to 0 and returning 5. (OSH is also the only shell that has this!)
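The two quirks described above (a slice argument that changes meaning with its sign, and silent coercion of bad strings in arithmetic) can be modeled in a few lines of Python. This is only a sketch of the behavior as described in this post, not code from OSH or bash. The function names `slice_like_bash` and `to_int` are hypothetical, the slice model assumes a non-negative offset, and the arithmetic model ignores details such as bash looking up `abc` as a variable name.

```python
def slice_like_bash(s, offset, arg):
    """Model ${s:offset:arg}, assuming offset >= 0 to keep the sketch short."""
    if arg >= 0:
        end = offset + arg      # zero or positive: arg is a length
    else:
        end = len(s) + arg      # negative: arg is a position counted from the end
    if end < offset:
        raise ValueError('slice end is before its start')
    return s[offset:end]


def to_int(s, strict):
    """Convert a shell string to an integer for arithmetic (hypothetical helper)."""
    digits = s[1:] if s.startswith('-') else s
    if digits and all(c in '0123456789' for c in digits):
        return int(s)
    if strict:
        # Roughly what a strict mode would do: refuse to guess.
        raise ValueError('invalid integer: %r' % s)
    return 0  # lenient behavior: the bad string silently becomes 0


print(slice_like_bash('abcdef', 1, 3))    # arg used as a length
print(slice_like_bash('abcdef', 1, -1))   # arg used as an end position
print(to_int('abc', strict=False) + 5)    # lenient coercion
```

The same third argument selects three characters in the first call and "everything up to one before the end" in the second, which is the inconsistency the post complains about.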

NOTE: The motivation for implementing UTF-8 ourselves was to remove Python's unicodeobject.c from the build, which is over 10K lines of code. This was done in OSH 0.5.
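As a rough illustration of what counting and slicing by code point involves when working directly on bytes, here is a minimal Python sketch. It is not OSH's actual code. The function names are made up for this example, and the validation is deliberately incomplete: it checks lead and continuation bytes but does not reject overlong three- and four-byte forms or surrogates.

```python
def utf8_char_len(first_byte):
    """Return the sequence length implied by a UTF-8 lead byte."""
    if first_byte < 0x80:
        return 1
    if 0xC2 <= first_byte <= 0xDF:
        return 2
    if 0xE0 <= first_byte <= 0xEF:
        return 3
    if 0xF0 <= first_byte <= 0xF4:
        return 4
    raise ValueError('invalid UTF-8 start byte 0x%02x' % first_byte)


def code_point_offsets(data):
    """Return the byte offset where each code point starts, plus len(data)."""
    offsets = []
    i = 0
    n = len(data)
    while i < n:
        offsets.append(i)
        width = utf8_char_len(data[i])
        if i + width > n:
            raise ValueError('truncated UTF-8 sequence at byte %d' % i)
        for j in range(i + 1, i + width):
            if data[j] & 0xC0 != 0x80:
                raise ValueError('invalid continuation byte at %d' % j)
        i += width
    offsets.append(n)
    return offsets


def count_code_points(data):
    """Model ${#s}: number of code points, not bytes."""
    return len(code_point_offsets(data)) - 1


def slice_code_points(data, begin, length):
    """Model ${s:begin:length} with non-negative arguments only."""
    if begin < 0 or length < 0:
        raise ValueError('negative slice arguments are not supported')
    offsets = code_point_offsets(data)
    n = len(offsets) - 1
    begin = min(begin, n)
    end = min(begin + length, n)
    return data[offsets[begin]:offsets[end]]


s = 'h\u00e9llo'.encode('utf-8')   # 5 code points, 6 bytes
print(len(s), count_code_points(s))
print(slice_code_points(s, 1, 3))
```

Raising an error on a bad byte, rather than guessing, corresponds to the "decode errors are fatal" behavior described for strict-word-eval.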

Under the Hood

ASDL schemas are no longer included in the binary. We compile them offline into Python code and reflection metadata (now stored with pickle, a tiny VM that can serialize graphs with sharing).

Removed the dependency on the platform module, which was just a wrapper around os.uname() and sys.version.

As a result of those two changes, we no longer need the re module. So files like _sre.c and sre_parse.py from CPython have been removed. We now depend on ~10K fewer lines of code! Detailed measurements are below.

Removed bytearrayobject.c. (Late in the release process, I discovered that this broke oheap encoding, which was an experiment not visible to users.)

Removed unicodeobject.h (a straggler) and future.c from the build.
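The "graphs with sharing" property of pickle mentioned above can be seen in a small Python sketch. The data here is invented for illustration and has nothing to do with the real ASDL metadata.

```python
import pickle

# Two keys point at the same list object, and the dict also refers to itself.
shared = ['leaf']
graph = {'left': shared, 'right': shared}
graph['self'] = graph

restored = pickle.loads(pickle.dumps(graph))

# Sharing survives the round trip: both keys still name one object.
sharing_preserved = restored['left'] is restored['right']
# So does the cycle.
cycle_preserved = restored['self'] is restored

print(sharing_preserved, cycle_preserved)
```

A format that only describes trees would have to duplicate the shared list and could not represent the cycle at all, which is presumably why a graph-aware serializer is convenient for reflection metadata.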

Selected Metrics

The tests and metrics published with each release quantify the changes above:

Lines of native code went down because I removed more parts of CPython:

Lines of Python code also went down because I removed parts of CPython as described above:

This is reflected in the smaller binary, which is faster to build:

The bytecode size went down as well: