In the last post, I said I would use Alpine Linux's abuild script as a motivating use case for OSH.

The good news is that OSH can now run abuild -h. The bad news is that, before running it, OSH takes nearly two seconds to parse abuild. That's more than 100 times slower than shells like bash!
The fundamental reason for this is that I wrote the parser in a high-level, abstract style, so I wouldn't end up with a mess of ad hoc if statements. I focused on correctness over performance.
Of course, performance is important too. This post meanders over topics that are relevant to addressing this problem. It's a bit like the post on build system observations: a laundry list of things I may write in detail about later.
I also make a digression into project infrastructure. There are many simultaneous threads running in the project, and writing things down helps me organize my thoughts and prioritize.
(1) I researched the space of Python performance tools and tried a handful of them, including a tool to make flame graphs and what could be called "flame traces" (pytracing).
I noticed that tracing tools written in Python have significant instrumentation error. That is, the sys.settrace() and sys.setprofile() hooks allow you to insert Python callbacks into the main interpreter loop, but this significantly slows down your program.
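To make that concrete, here's a minimal sketch (mine, not from the original measurements) that times the same function with and without a sys.setprofile() hook installed; the exact slowdown depends on the workload:

```python
import sys
import time

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def hook(frame, event, arg):
    # Called by the interpreter on every call/return event.
    # Even this trivial body adds significant overhead.
    pass

def timed():
    start = time.time()
    fib(22)
    return time.time() - start

plain = timed()
sys.setprofile(hook)
traced = timed()
sys.setprofile(None)

print('slowdown: %.1fx' % (traced / plain))
```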
This is something to remember for OVM — can I do better?
(2) I now understand the relationship between flame graphs, tree maps, and the Graphviz output of Google's pprof tool. Brendan Gregg has a good post comparing Flame Graphs and TreeMaps.
(3) I ended up just using cProfile in the Python standard library. I should have tried the simple approach first! Flame graphs seem more appropriate for programs that are already somewhat optimized.

cProfile found three (!) performance bottlenecks, which I mentioned in a status update on lobste.rs.
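For reference, the simple approach looks something like this (a sketch; parse() is a hypothetical stand-in for the OSH parser's entry point):

```python
import cProfile
import pstats

def parse(path):
    # Hypothetical stand-in for the real parser entry point.
    with open(path) as f:
        return f.read().split()

profiler = cProfile.Profile()
profiler.enable()
parse('abuild')  # assumes a local copy of the script
profiler.disable()

# Print the 10 functions with the highest cumulative time.
pstats.Stats(profiler).sort_stats('cumulative').print_stats(10)
```

The no-code-changes version, python -m cProfile -s cumtime yourscript.py, is often enough by itself.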
(4) Using these tools has also made me appreciate the utility of unifying stack traces between C and languages like Python and JavaScript. Python and C have fairly good integration, but when you start using performance and code coverage tools, the gap between them appears wider.
Joyent has done some interesting work on observability that helps bridge this gap.
However, the OSH parser has to be correct before it can be fast, so I reviewed and rewrote what I call the wild tests.
These tests torture the parser with shell scripts found in the wild. For example, see Parsing 183,000 Lines of Git's Shell Source Code.
The wild test harness now runs in parallel and produces nicer HTML.

I also expanded the corpus of test data. The results showed that I should implement extended globs.

So I explored the extended glob syntax with spec test cases, and added them to the OSH parser. I'm happy that I can still make quick progress on the parser after a few months away from it. This indicates to me that its architecture is solid.
To support my testing and performance tools, I need more tools. It's tools all the way down!
Notice that the URL above has a .wwz path component. This is simply a .zip file whose contents are served by a CGI script I wrote.
Why did I bother writing it, instead of just unzipping the files on Dreamhost? Well, there are tens of thousands of shell scripts and HTML files in the archive, and I want to publish the wild tests on every release. All this file system metadata is a sysadmin headache, not just for my web host but for me. For example, it makes rsync take a long time. The text files also compress by 10-20x.
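The real script isn't shown here, but the core idea is small. This sketch (the URL layout and error handling are my assumptions, not the actual wwz code) maps a request like /wild.wwz/report/index.html to a member of the wild.wwz archive:

```python
#!/usr/bin/env python3
import os
import sys
import zipfile

# e.g. PATH_INFO = '/wild.wwz/report/index.html' (hypothetical layout)
path_info = os.environ.get('PATH_INFO', '')
archive, _, member = path_info.lstrip('/').partition('/')

out = sys.stdout.buffer
try:
    with zipfile.ZipFile(archive) as z:
        data = z.read(member)  # raises KeyError if the member is missing
except (OSError, KeyError, zipfile.BadZipFile):
    out.write(b'Status: 404 Not Found\r\n')
    out.write(b'Content-Type: text/plain\r\n\r\nnot found\n')
else:
    out.write(b'Content-Type: text/html\r\n\r\n')
    out.write(data)
```

The payoff is that a whole report is one file on disk, no matter how many pages are inside it.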
I also have a side project in mind that will require linking to potentially millions of source files in projects like CPython, Clang, and various Unix shells.
The benchmarks/osh-parser.R script analyzes performance data. R is a language for statistics, data manipulation, and visualization.
I have a Wiki page called Oil and the R Language where I've been keeping notes on the relationship between R and shell. For example, both languages are used interactively!
Unfortunately, this work is far in the future.
In 2009 I designed a template language called JSON Template, inspired by a few earlier template systems.
It was designed to be "polyglot" in the sense that I implemented it in Python and JavaScript simultaneously. I felt that templates should be portable between the client and server, which wasn't common at the time.
I'm embarrassed that I haven't maintained a website or documentation since Google Code was shut down. There's an ancient fork of it here.
I used JSON Template in the Python script that produces the wild test report above.
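For flavor, usage from Python looks roughly like this (the API is recalled from memory, so treat it as a sketch rather than a reference):

```python
import jsontemplate  # the original Python implementation

# {.repeated section ...} iterates over a list; {name} expands a key.
t = jsontemplate.Template("""
{.repeated section tests}
  {name}: {status}
{.end}
""")

print(t.expand({'tests': [
    {'name': 'git/git-gui.sh', 'status': 'PASS'},
    {'name': 'alpine/abuild',  'status': 'PASS'},
]}))
```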
After more than 8 years, I should write An Evaluation of JSON Template. The language has flaws, but I still like using it for certain tasks. It will influence the design of double-quoted strings in the Oil language.
In addition to measuring performance and digressions into tools, I'm thinking about how to improve performance.
To prepare for generating a lexer in C using re2c, I went back and studied Russ Cox's authoritative and exhaustive articles on regular expressions.
One thing that wasn't clear on skimming is that the "virtual machine approach" is the NFA approach. It has abstract "threads" to represent nondeterminism.
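A toy illustration of that idea (mine, not from the articles): match the hand-compiled pattern a*b by advancing a set of "threads" in lockstep, one input character at a time, which keeps matching linear in the input length:

```python
def match_a_star_b(s):
    # Each "thread" is just an NFA state; nondeterminism is a *set* of them.
    #   state 0: inside the a* loop, or about to read 'b'
    #   state 1: accepted
    threads = {0}
    for ch in s:
        next_threads = set()
        for st in threads:
            if st == 0 and ch == 'a':
                next_threads.add(0)  # stay in the a* loop
            if st == 0 and ch == 'b':
                next_threads.add(1)  # consume the final 'b'
        threads = next_threads
    return 1 in threads

print(match_a_star_b('aaab'))  # True
print(match_a_star_b('b'))     # True (a* matches the empty string)
print(match_a_star_b('aac'))   # False
```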
His re2 engine — not related to re2c! — also implements the faster DFA approach. That is, re2 has a multi-engine execution strategy.
On the other hand, re2c just uses the DFA approach. It can do this because it doesn't need to implement all the bells and whistles of Perl-style regexps. It translates regular expressions into a series of switch and goto statements in C. Each jump location is a DFA state.
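re2c emits C, but the shape of the generated code can be sketched in Python (an illustration of the idea, not re2c's actual output). Each branch below plays the role of one labeled DFA state; the real C code jumps between such states with goto:

```python
def lex_number(s, i):
    # A hand-written DFA for the regex [0-9]+, in the style of
    # generated lexer code: one block per state.
    start = i
    state = 0
    while True:
        ch = s[i] if i < len(s) else ''
        if state == 0:            # start state: require one digit
            if ch.isdigit():
                state, i = 1, i + 1
            else:
                return None
        elif state == 1:          # accepting state: consume more digits
            if ch.isdigit():
                i += 1
            else:
                return ('NUMBER', s[start:i], i)

print(lex_number('123+x', 0))  # ('NUMBER', '123', 3)
```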
This definitely deserves a future blog post.
I have a clear path to making the lexer faster, but the parser is a more complex problem.
I believe I can make it 10 times faster by addressing the three problems listed above (here docs, lexer, and ASDL), but it might still be too slow after that.
I may have underestimated the fact that the OSH parser fundamentally does more work than other shell parsers: code inside (echo subshell), $(echo "command" "sub"), ${var:-default}, etc. is not skipped over, which means it finds more errors in your scripts. It also means the parser is more consistent and less complex overall.

To recap the problem: shell is a big language, and existing tools for describing its syntax aren't expressive enough.
(These tools are usually called parser generators, but I prefer the term metalanguage. They're languages for describing languages.)
The state of the art is to write a parser by hand in C++.
But I'm not excited about translating 5,000+ lines of Python to C++ by hand. I would rather generate code.
So I want to create a better metalanguage. After all, I'm not just writing a bash parser, I'm writing Oil, which is supposed to subsume shell, awk, make, and maybe more.
So there are many different threads going now. However, I want to pop the stack and focus on delivering something for the next release.
Right now, this looks like fixing the three parser bottlenecks above (here docs, the lexer, and ASDL).
That will be enough for a 0.2 release in early November. The 0.3 release can then include more extensive optimizations.
I diagnosed parser performance problems with profiling tools, and enhanced the parser's test suite.
I described project infrastructure that I started using: wwz, R, and JSON Template.
I also speculated on future plans for the lexer and parser.
If you'd like more details on any of these topics, please leave a comment.