I just released Oil 0.8.4, another huge one filled with OSH and Oil language changes! I kept track of the changes on Zulip, and you can view the changelog, but I'll summarize them in the next post.
On the other hand, this post is a short note to review release #metrics.
In particular, I want to keep track of the binary size and build speed issues mentioned in the June announcement of the 0.8.pre6 release.
I won't circulate this post widely — it's mainly for those close to the project. But it's important to keep track of our progress along multiple dimensions.
Previous posts with #metrics:
That brings to mind another motivation for this post: preparing for garbage collection. In September, I wrote a small copying collector with unit tests, and partially integrated it into oil-native. I even showed it to a few people, and wasn't embarrassed. But it's not fully running yet.
Garbage collection is the top priority for the coming months (in addition to translating shell I/O). As mentioned, it will necessarily slow the code down.
It's not clear how much, but this post marks a rough baseline for performance.
I should note that essentially all of my effort is spent implementing features, fixing bugs, and getting Oil to translate and compile as C++. I've spent essentially zero time on optimization since December 2019.
I say that because we are experiencing classic "C++ bloat" problems due to templates and exceptions. However, just like bash is no longer "too big and too slow", neither is C++! Modern software has made these old technologies look efficient by comparison.
For example, you'll see below that the Oil binary is about 20-30% bigger than bash right now, e.g. 1.3 MB vs. 1.0 MB. It will get bigger, but it won't reach 5, 10, or 15 MB like similar programs written in Go or Rust.
These are the things we care most about, and they're looking good.
Almost 300 new tests pass in oil-native:
I also reviewed these spec-cpp metrics in August, in A Plan for Oil 0.8 and 0.9. Again, the goal is for the 917 osh_eval.cc number to reach the 1672 osh (in Python) number.
OSH spec tests indicate many new features:
And so do Oil spec tests:
I described some of these in the previous post, and I'll talk more about them in the next post. Both the OSH and Oil languages are taking a nice shape!
The parsing benchmarks are still noisy. I'm not sure if this change is significant, but oil-native is still faster than bash at parsing. I should probably switch to something more stable, like instruction counts.
The runtime benchmark measures the old Python build, not oil-native, so we don't care about this. What's critically important is to simply run configure scripts under oil-native!
This is the reason we're spending so much effort translating Oil to C++! I haven't made a big deal about it on the blog, but it's an obvious problem.
I wrote some synthetic benchmarks to test shell "computation":
And here are some rough measurements of mycpp's translation:
Summary: We get a huge speedup on most code, but there are still performance bugs where the translated code is slower than Python! At least one of these is a computational complexity bug.
Again, we compare this release with June's 0.8.pre6 release.
These are the lines we edit, not those generated. It's still pretty small!
Let's add in Oil language (also counted in
Note that OSH and Oil share a lot of common libraries, which are counted under OSH.
Nevertheless, I'm surprised by the small increase, and that's a good thing! I think it's because most of the recent changes happened in the grammar, which is small.
I also included the new Tea language! Many thanks to Batuhan Taskaya for recent help on that. I hope to write more about it soon.
Almost all lines in the oil-native tarball are generated, and we continue to count them. I've also counted osh_eval.cc, the translation of the core interpreter, by itself.
This is expected progress, which reflects three things:
The binary is getting bigger along with the lines of translated code. Refer to the June announcement of 0.8.pre6 for the reasons behind this.
(And I still have to figure out why the size of
so much between GCC and Clang. Guesses: templates, exception tables, or both.)
This is bad! Compile time basically doubled.
I believe this is due to template bloat. We introduced gc_heap::Alloc<T>(...) instead of new T(...), which uses
This shows up in the report from Bloaty:
     7  gc_heap::Alloc<>()::__PRETTY_FUNCTION__   48821  107976
     8  _GLOBAL__sub_I_str0                       61208   61252
     9  [section .debug_abbrev]                       0   54882
    10  gc_heap::Alloc<>()                        23494   41924
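To make the `__PRETTY_FUNCTION__` rows concrete, here is a minimal sketch of how a templated allocation wrapper can produce one function body and one name string per instantiation. It assumes GCC or Clang, since `__PRETTY_FUNCTION__` is a compiler extension. `Alloc`, `Point`, and `g_last_instantiation` are hypothetical stand-ins for illustration, not the real `gc_heap::Alloc` code. The sketch references the string directly; an `assert()` inside the template is a common way for it to end up in a binary, though whether that applies to Oil is an assumption.

```cpp
#include <cassert>   // assert() is a common way __PRETTY_FUNCTION__ gets referenced
#include <cstdlib>
#include <cstring>
#include <new>
#include <utility>

// Hypothetical type, for illustration only.
struct Point {
  int x, y;
  Point(int x, int y) : x(x), y(y) {}
};

// Records the name of the most recently executed Alloc instantiation.
static const char* g_last_instantiation = nullptr;

// Hypothetical stand-in for a templated allocator. Every distinct
// combination of T and Args... is a separate function, and each one
// carries its own __PRETTY_FUNCTION__ string.
template <typename T, typename... Args>
T* Alloc(Args&&... args) {
  g_last_instantiation = __PRETTY_FUNCTION__;
  void* place = std::malloc(sizeof(T));
  if (place == nullptr) std::abort();
  return new (place) T(std::forward<Args>(args)...);
}

// True when two recorded names refer to the same instantiation.
inline bool SameInstantiation(const char* a, const char* b) {
  return std::strcmp(a, b) == 0;
}
```

Because `Args&&` is a forwarding reference, `Alloc<Point>(1, 2)` and `Alloc<Point>(a, a)` with an `int a` deduce different `Args` (`int` versus `int&`), so even a single type can yield several instantiations. Comparing the recorded names after each call shows that they differ.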
I plan to look into this further, but again, I think we'll have to live with it for a while. I'm focused on making Oil usable and featureful.
(It's also interesting that Clang was faster on the old code, but is now slower. This pattern held up in 0.8.3 too, so it's not benchmark noise.)
Overall, the build speed is the thing I'm most annoyed by. I expect it to get worse once we fully integrate the garbage collector. For example, I need to generate field masks for every type in the program, and that involves some compile-time computation, e.g. with
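As a rough illustration of what a compile-time field mask could look like, here is a minimal sketch using `constexpr` and `offsetof`. The post doesn't say which mechanism is intended, so this choice is an assumption. So are the `Token` layout, the 16-bit mask width, and the names `MaskBit`, `kTokenMask`, and `IsPointerWord`; none of them come from the Oil source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Str;  // opaque here; only pointers to it are stored

// Hypothetical object layout, for illustration only.
struct Token {
  int id;       // not a pointer
  Str* val;     // pointer: a collector would need to trace it
  int span_id;  // not a pointer
  Token* next;  // pointer: a collector would need to trace it
};

// Bit i of a mask is set when the i-th pointer-sized word of the
// object holds a pointer.
constexpr std::uint16_t MaskBit(std::size_t offset) {
  return static_cast<std::uint16_t>(1u << (offset / sizeof(void*)));
}

// Computed entirely at compile time from the field offsets.
constexpr std::uint16_t kTokenMask =
    MaskBit(offsetof(Token, val)) | MaskBit(offsetof(Token, next));

static_assert(sizeof(Token) <= 16 * sizeof(void*),
              "object too large for a 16-bit field mask");

// Runtime query a collector could use while scanning an object.
inline bool IsPointerWord(std::uint16_t mask, int word_index) {
  assert(word_index >= 0 && word_index < 16);
  return ((mask >> word_index) & 1u) != 0;
}
```

A collector scanning a `Token` would walk its words and follow only those for which `IsPointerWord(kTokenMask, i)` is true. One such constant per type is cheap on its own, but a generator emitting them for every type in the program adds work for the compiler.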
If you're experienced with these issues, I'd love some help! Let me know in the comments.
Again, I think if you compare Oil to a Go or Rust binary, none of this is a big deal. But I want there to be "no reason to use bash rather than Oil", and these issues matter for embedded systems, which occasionally use bash.
But it's much more important to solidify the OSH language and the Oil language. The next post will talk about that work, which includes:
###, and the
set -e. Fixing
errexit is one of the Four Features that Justify a New Unix Shell. But we'll also help users of existing shells!