blog | oilshell.org

Metrics for Oil 0.8.pre2

I wrote about the 0.8.pre2 release in this month's recap, and here are some metrics and benchmarks for it.

This post is mainly for me to keep track of the project's progress. When the codebase is fully translated to C++, I may write a retrospective like this one on parsing speed.

New Metrics

The oil-native build has now existed for 3 months, since Oil 0.7.pre9, so we can review its metrics.

Translated Code

oil-cpp for 0.7.pre9: 48,906 lines of C++ in the tarball

oil-cpp for 0.8.pre2: 62,889 lines of C++ in the tarball

Two files accounted for most of the increase:

osh-lex.h contains string matching code, and it now recognizes the names of shell builtins and options. There might be a more compact way to do this, but using re2c is convenient for now.

More importantly, we're translating more of the Oil interpreter with mycpp:

osh_parse.cc as of 0.7.pre9: 9,687 lines of C++

osh_eval.cc as of 0.8.pre2: 16,491 lines of C++. (The name changed because we're translating the word evaluator, the arithmetic evaluator, and more.)

I haven't yet measured the relationship between lines of Python and lines of C++, but it feels like we're translating over half of the ~28K line interpreter.

This is good progress, but it's a significant effort. The code will take several more months to fully translate.

oil-native Parsing Speed

Let's measure against a faster release:

This variation feels like it's within the benchmark noise because the measurements for bash and other shells also dipped. But I'll keep an eye on it.

oil-native Size and Compilation Speed

We're translating and compiling more code, so this increase makes sense.

Note that I expect oil-native to be significantly smaller than the OVM build (measured below).

ovm-build 0.7.0: 28.9 and 10.1 seconds under GCC (two machines)

ovm-build 0.8.pre2: 49.4 and 16.9 seconds under GCC

For comparison, bash compiles in 63 to 70 and 25 to 27 seconds under GCC (two machines, and two different measurements)

The compile time seems to be increasing linearly with the lines of C++ code.
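As a rough sanity check on that claim, here is a minimal sketch that compares growth factors. It pairs the two compile-time measurements with the line counts from the Translated Code section. That pairing is an assumption: the earlier timing is labeled 0.7.0 while the line counts are for 0.7.pre9, and the names `growth`, `tarball_lines`, `translated_lines`, and `compile_secs` are only illustrative.

```python
# Figures quoted in this post. Pairing the timings with these line counts
# is an assumption, not something the benchmark itself states.
tarball_lines = (48_906, 62_889)     # oil-cpp tarball: 0.7.pre9 -> 0.8.pre2
translated_lines = (9_687, 16_491)   # osh_parse.cc -> osh_eval.cc
compile_secs = {
    "machine 1": (28.9, 49.4),       # seconds under GCC, older -> newer
    "machine 2": (10.1, 16.9),
}


def growth(before, after):
    """Return the growth factor after / before."""
    return after / before


print(f"tarball lines:    {growth(*tarball_lines):.2f}x")
print(f"translated lines: {growth(*translated_lines):.2f}x")
for machine, (old, new) in compile_secs.items():
    print(f"compile time, {machine}: {growth(old, new):.2f}x")
```

Under these assumptions, the compile time grew by about 1.7x on both machines, which is close to the growth of the mycpp-translated file (about 1.7x) and larger than the growth of the whole tarball (about 1.3x). Two data points can't establish linearity, but they are consistent with it.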

Established Metrics

Spec Tests

There are 47 new spec tests for OSH:

And almost 29 new for Oil:

Lines of Source Code

There are over 1000 new lines of significant source code:

cloc for 0.7.pre9: 14,156 lines of Python and C, 308 lines of ASDL.

cloc for 0.8.pre2: 15,272 lines of Python and C, 293 lines of ASDL.
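The difference between the two cloc reports can be worked out with a few lines of Python. This is just arithmetic on the numbers quoted above; the `cloc` dictionary and the `delta` helper are hypothetical names for illustration, not part of Oil's metrics scripts.

```python
# cloc figures quoted above, keyed by release.
cloc = {
    "0.7.pre9": {"python_and_c": 14_156, "asdl": 308},
    "0.8.pre2": {"python_and_c": 15_272, "asdl": 293},
}


def delta(old, new):
    """Return the per-category change in line count from old to new."""
    return {kind: cloc[new][kind] - cloc[old][kind] for kind in cloc[old]}


changes = delta("0.7.pre9", "0.8.pre2")
for kind, change in changes.items():
    print(f"{kind}: {change:+d}")
print(f"net: {sum(changes.values()):+d}")
```

The Python and C count grew by 1,116 lines while the ASDL schemas shrank by 15, for a net increase of 1,101 significant lines.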

And over 2000 lines of physical source code:

Runtime Speed (OVM only)

Important: this is OVM, the slice of the CPython interpreter, not oil-native.

Both of these numbers are bad. This is why we're translating Oil to C++!

It may have gotten slower: As mentioned in the parser benchmarks retrospective, a side effect of translation is that Oil gets slightly slower when it's run under CPython. But we care about the speed in C++, not in Python.

Soon-to-Be Obsolete

OVM Native Code

Again, we have more lines of native code because of the re2c "matchers" for shell builtin names and option names.

The compiled code size increased by a corresponding amount:

OVM Bytecode

This will also be obsolete, but it increased proportionally with the source code:

Next

I still want to write: