Oil's Parser is 160x to 200x Faster Than It Was 2 Years Ago

2020-01-16

(We're in the middle of a series of posts to bring readers up-to-date on the project.)

I've said many times that Oil is too big and too slow. So I'm happy to explain today that performance is no longer a project risk.

As of the 0.7.pre9 release last month, the parser is translated to C++, and the plan is to optimize the rest of the codebase in the same way.

This post walks you through large improvements in the parsing benchmarks. (The translator is called mycpp, and I'll write more about it later.)

Table of Contents

Background

Summary of Benchmark Results

A Concrete Data Point

Comparing Shells

Performance Improvements in Oil

Caveats

Conclusion

Appendix: Performance Details

Background

Many have rightly wondered why Oil is written in Python. To put it another way, why did I implement the interpreter in an abstract style?

Because I prioritized correctness over speed. Considered alone, Oil's parser is a complicated program with many design issues:

Figuring out corner cases of shell syntax, like those mentioned at the end of the last post.
Figuring out multiple parsing algorithms and how to interleave them.
- Note that Oil is at a disadvantage in these benchmarks, because it does more work than other shell parsers. This post shows that OSH statically detects 3 out of 3 syntax errors in ${}, $(), and $(()), whereas bash detects zero out of 3.
Interactive features like autocompletion and history require support from the parser. The next post will discuss this.

Despite its abstract style, Oil's parser is now faster than bash's parser. As foreshadowed in the very first blog post, Oil uses Python as a "metaprogramming" language for C++.

Summary of Benchmark Results

The parsing benchmark runs $sh -n on 10 shell scripts found "in the wild". The POSIX -n flag parses the script but doesn't execute it. I take measurements on a slow machine and a fast machine (with a Core i3 and i7 CPU, respectively.)

These benchmarks have been run on every release in the past two years, long before I knew had to optimize Oil! And there were some significant false starts. Summary:

On the fast machine, Oil parses at ~870 lines per millisecond, while bash parses at ~620 lines/ms. zsh parses at ~100 lines/ms.
Oil parsed at 4 lines/ms two years ago when it ran under CPython. So it's now 160x - 200x faster!

A Concrete Data Point

GNU coreutils includes one of the biggest shell scripts I've encountered: a configure script which is 1.7 MB and 69,779 lines, generated by autoconf.

On the slow machine, it used to take over ~20 seconds to parse with Oil. It now takes under 200 milliseconds with oil-native, compared with over 200 ms on bash and over 2,000 ms on zsh.

I've rounded off these numbers because of benchmark noise, but you can see exact measurements in the links below.

Comparing Shells

Excerpt from the 0.7.pre10 benchmarks.

Implementation	Parsing Rate (lines/ms)		Notes
	slow machine	fast machine
`osh-native` 0.7.pre10	310	869	Oil isn't a complete shell yet. Its parser has hooks for autocompletion and history. It does a "deep parse" and detects 3 out of 3 syntax errors.
bash 4.3.48(1)-release (x86_64-pc-linux-gnu)	230	614	The bash parser doesn't know anything about autocompletion or history. It detects 0 of 3 syntax errors.
zsh 5.1.1 (x86_64-ubuntu-linux-gnu)	28	98	The zsh parser is aware of autocompletion. It detects 1 of 3 syntax errors.

Performance Improvements in Oil

In December 2017, I translated the regex-based lexer to native code with re2c. I was curious how much faster it would be, so I created this parsing benchmark.

Here's a summary of notable changes over 2 years. I went to great lengths to prove that you don't have to trade correctness for speed :-)

Release	Implementation	Parsing Rate (lines/ms)		Notes
		slow machine	fast machine
0.3.alpha1 2017-12-09	`osh-cpython` benchmarks	1.9	4.3	A principled parser written for correctness, ignoring speed.
0.3.0 2017-12-21	`osh-cpython` benchmarks	6.3	13.4	Translated the lexer to native code via re2c. The parser is still in Python.
0.7.pre5 2019-10-04	`osh-cpython` benchmarks	5.6	14.8	The parser was slow for 2 years! Instead, I worked on running thousands of lines of unmodified shell scripts, implemented the interactive shell, and prototyped the Oil language. In other words, the strategy was to "make it right" before making it fast. In 2017, Oil could only run simple shell scripts, but it's now very capable.
0.7.pre6 2019-11-11	`osh-cpython` benchmarks	3.8	13.3	This release was all about translation: improving mycpp and making small changes to Oil itself. Surprise: the code got slower under CPython because some idioms were changed. But it doesn't matter because we care about the speed of native code.
0.7.pre8 2019-12-06	`osh-cpython` benchmarks	3.6	12.0	More work on translation.
0.7.pre9 2019-12-08	`osh-native` benchmarks	102	351	The first `oil-native` release. A ~30x improvement made it faster than zsh, but it's still slower than bash.
0.7.pre10 2019-12-19	`osh-native` benchmarks	310	869	I optimized the parser with performance tools for C++, not performance tools for Python! They were automated with shell scripts. A separate blog post will describe this.
0.3.alpha1 vs. 0.7.pre10	cumulative speedup	163 x	202 x	Oil is fast!

Caveats

Oil's parser doesn't deallocate any memory now. I also expect to use a local arena allocator, which may slow it down.

However, there are also several ways to speed it up, like sharing the bytes behind string slices rather than copying them.

I expect performance to go up and down in future releases, but in the long term it should be faster. The mycpp translation is rough and there's a lot of low hanging fruit.

Conclusion

Parsing isn't the most important aspect of shell performance, but it is important. Shells have to run large auto-generated shell scripts quickly.

The more important takeaway is that I'd like rest of the shell interpreter to be translated using this same process. There are a few technical differences, which I'll discuss in a future post.

If translating statically-typed Python to C++ sounds interesting to you, I'm looking for help! Adding type annotations is a prerequisite. Leave a comment or chat with us on Zulip.

Appendix: Performance Details

I'm following the time-based blogging strategy, so I cut a few things out of this post.

I'd lke to discuss:

The C++ optimizations between Oil 0.7.pre9 and pre10. These had nothing to do with Python, and sped up the parser over 2x! Perf tools for C++ are much better than those for Python.
Naively generated C++ is fast. Making Oil's parser faster than bash's parser surprised me, because the C++ code looks inefficient. For example, everything is a pointer and there are no value types, because Python can't express value types.
New oil-native metrics: I'm measuring build time, binary size, and publishing results from Bloaty. See the OVM Build benchmarks and oil-native/overview.txt.