(We're in the middle of a series of posts to bring readers up-to-date on the project.)
I've said many times that Oil is too big and too slow. So I'm happy to explain today that performance is no longer a project risk.
As of the 0.7.pre9 release last month, the parser is translated to C++, and the plan is to optimize the rest of the codebase in the same way.
This post walks you through large improvements in the parsing benchmarks. (The translator is called mycpp, and I'll write more about it later.)
Many have rightly wondered why Oil is written in Python. To put it another way, why did I implement the interpreter in an abstract style?
Because I prioritized correctness over speed. Considered alone, Oil's parser is a complicated program with many design issues:
Despite its abstract style, Oil's parser is now faster than bash's parser. As foreshadowed in the very first blog post, Oil uses Python as a "metaprogramming" language for C++.
The parsing benchmark runs $sh -n on 10 shell scripts found "in the wild". The -n flag parses the script but doesn't execute it. I take measurements on a slow machine and a fast machine (with a Core i3 and a Core i7 CPU, respectively).
These benchmarks have been run on every release in the past two years, long before I knew how to optimize Oil! And there were some significant false starts. Summary:
GNU coreutils includes one of the biggest shell scripts I've seen: a configure script which is 1.7 MB and 69,779 lines, generated by autoconf.
On the slow machine, it used to take over 20 seconds to parse with Oil.
It now takes under 200 milliseconds with oil-native, compared with over 200 ms on bash and over 2,000 ms on zsh.
I've rounded off these numbers because of benchmark noise, but you can see exact measurements in the links below.
Excerpt from the 0.7.pre10 benchmarks.
|Implementation||Slow machine (lines/ms)||Fast machine (lines/ms)||Notes|
|Oil||310||869||Oil isn't a complete shell yet. Its parser has hooks for autocompletion and history. It does a "deep parse" and detects 3 out of 3 syntax errors.|
|bash||230||614||The bash parser doesn't know anything about autocompletion or history. It detects 0 of 3 syntax errors.|
|zsh||28||98||The zsh parser is aware of autocompletion. It detects 1 of 3 syntax errors.|
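As a rough illustration of how a rate in lines/ms can be derived, here is a minimal sketch. It assumes the rate is total lines divided by total wall-clock time of `shell -n script`. The function names are hypothetical, and the real benchmark scripts may measure time differently (for example, by excluding process startup).

```python
import subprocess
import time


def count_lines(path):
    """Count the lines in a script, reading it as bytes."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def lines_per_ms(num_lines, elapsed_seconds):
    """Convert a line count and a time in seconds into lines per millisecond."""
    return num_lines / (elapsed_seconds * 1000.0)


def time_parse(shell, script_path):
    """Run `shell -n script_path` once and return elapsed wall-clock seconds.

    The -n flag asks the shell to parse the script without executing it.
    """
    start = time.perf_counter()
    subprocess.run(
        [shell, "-n", script_path],
        check=False,  # a syntax error is a valid outcome, not a crash
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


def parsing_rate(shell, script_paths):
    """Total lines parsed divided by total time, in lines/ms."""
    total_lines = sum(count_lines(p) for p in script_paths)
    total_seconds = sum(time_parse(shell, p) for p in script_paths)
    return lines_per_ms(total_lines, total_seconds)
```

For example, `parsing_rate("bash", ["configure"])` would time a single parse of one file. A single run is noisy, so a real measurement would repeat it.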
In December 2017, I translated the regex-based lexer to native code with re2c. I was curious how much faster it would be, so I created this parsing benchmark.
Here's a summary of notable changes over 2 years. I went to great lengths to prove that you don't have to trade correctness for speed :-)
|Release||Implementation||Slow machine (lines/ms)||Fast machine (lines/ms)||Notes|
|0.3.alpha1|| ||1.9||4.3||A principled parser written for correctness, ignoring speed.|
| || ||6.3||13.4||Translated the lexer to native code via re2c. The parser is still in Python.|
The parser was slow for 2 years! Instead, I worked on running
thousands of lines of unmodified shell scripts,
implemented the interactive shell, and prototyped the
In other words, the strategy was to "make it right" before making it fast. In 2017, Oil could only run simple shell scripts, but it's now very capable.
| || ||3.8||13.3||This release was all about translation: improving mycpp and making small changes to Oil itself. Surprise: the code got slower under CPython because some idioms were changed. But it doesn't matter because we care about the speed of native code.|
| || ||3.6||12.0||More work on translation.|
|0.7.pre10|| ||310||869||I optimized the parser with performance tools for C++, not performance tools for Python! They were automated with shell scripts. A separate blog post will describe this.|
|0.3.alpha1 vs. 0.7.pre10||cumulative speedup||163 x||202 x||Oil is fast!|
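The cumulative speedup is the ratio of the newest parsing rate to the oldest one. A quick check of the arithmetic, using the rates from the table (the variable names are only for illustration):

```python
# Parsing rates in lines/ms, copied from the first and last table rows.
slow_before, fast_before = 1.9, 4.3  # oldest measurement
slow_after, fast_after = 310, 869    # newest measurement


def speedup(before, after):
    """Return how many times faster `after` is than `before`."""
    return after / before


print(round(speedup(slow_before, slow_after)))  # 163
print(round(speedup(fast_before, fast_after)))  # 202
```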
Oil's parser doesn't deallocate any memory now. I also expect to use a local arena allocator, which may slow it down.
However, there are also several ways to speed it up, like sharing the bytes behind string slices rather than copying them.
I expect performance to go up and down in future releases, but in the long term it should be faster. The mycpp translation is rough and there's a lot of low-hanging fruit.
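To make the "sharing the bytes behind string slices" idea concrete, here is a small Python-level illustration of the difference between copying a slice and viewing it. This is not how Oil's native code does it, only a sketch of the general technique using Python's built-in `memoryview`.

```python
source = b"echo hello; echo world"

# Slicing a bytes object allocates a new object and copies the bytes.
copied = source[5:10]

# Slicing a memoryview creates a view over the same underlying buffer,
# so no bytes are copied.
view = memoryview(source)[5:10]

assert copied == b"hello"
assert bytes(view) == b"hello"  # same content...
assert view.obj is source       # ...but backed by the original buffer
```

The tradeoff is that a view keeps the whole original buffer alive for as long as any slice of it is in use.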
Parsing isn't the most important aspect of shell performance, but it is important. Shells have to run large auto-generated shell scripts quickly.
The more important takeaway is that I'd like the rest of the shell interpreter to be translated using this same process. There are a few technical differences, which I'll discuss in a future post.
If translating statically-typed Python to C++ sounds interesting to you, I'm looking for help! Adding type annotations is a prerequisite. Leave a comment or chat with us on Zulip.
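For readers who haven't seen statically-typed Python, here is a tiny example of what type annotations look like. The function is hypothetical and not taken from the Oil codebase, and it uses Python 3 annotation syntax, which may differ from the style Oil uses.

```python
from typing import List


def split_words(line: str) -> List[str]:
    """Split a line on whitespace. Hypothetical helper, not from Oil."""
    words: List[str] = []  # the element type is declared up front
    for part in line.split():
        words.append(part)
    return words


print(split_words("ls -l  /tmp"))  # ['ls', '-l', '/tmp']
```

With annotations like these, every variable has a known static type that a checker such as MyPy can verify, and a translator can in principle map each one to a C++ type.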
I'm following the time-based blogging strategy, so I cut a few things out of this post.
I'd like to discuss:
pre10. These had nothing to do with Python, and sped up the parser over 2x! Perf tools for C++ are much better than those for Python.
oil-native metrics: I'm measuring build time, binary size, and publishing results from Bloaty. See the OVM Build benchmarks and oil-native/overview.txt.