blog | oilshell.org
I wrote about the 0.8.pre2 release in this month's recap, and here are some metrics and benchmarks for it.
This post is mainly for me to keep track of the project's progress. When the codebase is fully translated to C++, I may write a retrospective like this one on parsing speed.
The oil-native build has now existed for 3 months, since Oil 0.7.pre9, so we can review its metrics.
Two files accounted for most of the increase:
osh-lex.h contains string matching code, and it now recognizes the names of shell builtins and options. There might be a more compact way to do this, but using re2c is convenient for now.
osh_parse.cc as of 0.7.pre9: 9,687 lines of C++
osh_eval.cc as of 0.8.pre2: 16,491 lines of C++. (The name changed because we're translating the word evaluator, the arithmetic evaluator, and more.)
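To put those two line counts in perspective, here is a quick back-of-the-envelope calculation. It uses only the two figures quoted above; the variable names are just for illustration.

```python
# Line counts of generated C++ quoted in this post.
osh_parse_lines = 9_687    # osh_parse.cc as of 0.7.pre9
osh_eval_lines = 16_491    # osh_eval.cc as of 0.8.pre2

# Absolute and relative growth between the two releases.
added = osh_eval_lines - osh_parse_lines
growth = osh_eval_lines / osh_parse_lines

print(f"added lines: {added}")
print(f"growth factor: {growth:.2f}x")
```

So the translated code grew by roughly 70% over these three months.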
I haven't yet measured the relationship between lines of Python and lines of C++, but it feels like we're translating over half of the ~28K line interpreter.
This is good progress, but it's a significant effort. The code will take several more months to fully translate.
Let's measure against a faster release:
This variation feels like it's within the benchmark noise because the measurements for bash and other shells also dipped. But I'll keep an eye on it.
oil-native Size and Compilation Speed
We're translating and compiling more code, so this increase makes sense.
Note that I expect oil-native to be significantly smaller than the OVM build (measured below).
The compile time seems to be increasing linearly with the lines of C++ code.
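One rough way to check a claim like this is to compare compile seconds per thousand lines across releases: if the rate stays about the same, compile time is roughly proportional to code size. This is only a sketch. The function names, the 15% tolerance, and the sample numbers are all hypothetical placeholders, not measurements from this post, and a constant rate is a slightly stronger condition than "linear" since it assumes no fixed overhead.

```python
def seconds_per_kloc(samples):
    """Compile seconds per 1,000 lines of C++ for each (lines, seconds) sample."""
    return [seconds / (lines / 1000.0) for lines, seconds in samples]


def looks_linear(samples, tolerance=0.15):
    """True if every per-kloc rate is within `tolerance` of the mean rate."""
    rates = seconds_per_kloc(samples)
    mean = sum(rates) / len(rates)
    return all(abs(rate - mean) / mean <= tolerance for rate in rates)


# Hypothetical placeholder measurements: (lines of C++, compile seconds).
# These are NOT figures from the post.
samples = [(10_000, 5.0), (16_000, 8.2)]

print(seconds_per_kloc(samples))
print(looks_linear(samples))
```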
There are 47 new spec tests for OSH:
And almost 29 new for Oil:
There are over 1000 new lines of significant source code:
And over 2000 lines of physical source code:
Important: this is OVM, the slice of the CPython interpreter, not oil-native.
configure is 5.6x to 7.3x slower.
configure is 6.2x to 7.5x slower.
Both of these numbers are bad. This is why we're translating Oil to C++!
It may have gotten slower: As mentioned in the parser benchmarks retrospective, a side effect of translation is that Oil gets slightly slower when it's run under CPython. But we care about the speed in C++, not in Python.
Again, we have more lines of native code because of the re2c "matchers" for shell builtin names and option names.
The compiled code size increased by a corresponding amount:
This will also be obsolete, but it increased proportionally with the source code:
I still want to write: