In the last post, I said I would use Alpine Linux's
abuild script as a motivating use case for OSH.
The good news is that OSH can now run abuild -h to show help. But the bad
news is that, even before running it, OSH takes nearly two seconds to parse
abuild. That's more than 100 times slower than shells like bash!
So in the last few weeks, I've been making a plan to fix this. It resulted in
a lot of problems being "pushed on the stack", taking priority over running
abuild.
I think I can pop the stack by early November, and then make a useful OSH 0.2 release.
This post is as much for me as it is for readers. There are many simultaneous threads running in the project, and I need to organize my thoughts. It's a bit like the post on build system observations: a laundry list of things I may write about later.
These topics deserve a more thorough treatment, but that will have to wait until well after the OSH release.
(1) I researched the space of Python performance tools and tried a handful of them, including a tool to make flame graphs and what could be called "flame traces" (pytracing).
I noticed that tracing tools written in Python have significant
instrumentation error. That is, the sys.settrace() and
sys.setprofile() hooks allow you to insert Python callbacks into the main
interpreter loop, but this significantly slows down your program.
This is something to remember for OVM — can I do better?
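To make that overhead concrete, here's a small, self-contained experiment. It's my own toy benchmark, not code from the project, and timings will vary by machine, but it shows that even a callback that does nothing slows a pure-Python function down noticeably:

```python
import sys
import time

def fib(n):
    # A deliberately call-heavy function to exaggerate the per-event cost.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def trace(frame, event, arg):
    # A no-op trace callback; returning it keeps line-level tracing enabled.
    return trace

start = time.time()
fib(25)
print('untraced: %.3fs' % (time.time() - start))

sys.settrace(trace)
start = time.time()
fib(25)
print('traced:   %.3fs' % (time.time() - start))
sys.settrace(None)
```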
(2) I now understand the relationship between flame graphs, tree maps, and the Graphviz output of Google's pprof tool. Brendan Gregg has a good post comparing Flame Graphs and TreeMaps.
(3) I ended up just using cProfile in the Python standard library. I should have tried the simple approach first!
It found three (!) performance bottlenecks (here docs, the lexer, and ASDL), which I mentioned in this status update on lobste.rs.
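I won't reproduce the exact invocation I used on the OSH parser, but the approach is just the standard library API. Here's a minimal sketch, where work() is a stand-in for the code being profiled:

```python
import cProfile
import pstats

def work():
    # Stand-in for the real workload (e.g. parsing a shell script).
    total = 0
    for i in range(100000):
        total += i * i
    return total

cProfile.run('work()', 'work.prof')             # profile and save stats to a file
stats = pstats.Stats('work.prof')
stats.sort_stats('cumulative').print_stats(10)  # top 10 entries by cumulative time
```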
The OSH parser has to be correct before it can be fast, so I reviewed and rewrote what I call the wild tests.
These tests torture the parser with shell scripts found in the wild. For example, see Parsing 183,000 Lines of Git's Shell Source Code.
The wild test harness now runs in parallel and produces nicer HTML.
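The harness itself is more involved than this, but the parallel part amounts to mapping the parser over the corpus. Here's a rough sketch, where the file names and the osh -n invocation are assumptions for illustration:

```python
# Hypothetical sketch: run the parser over many shell scripts in parallel
# and collect pass/fail results (the real harness also builds an HTML report).
import subprocess
from concurrent.futures import ProcessPoolExecutor

def parse_one(path):
    # '-n' is assumed here to mean "parse only, don't execute", as in bash.
    proc = subprocess.run(['osh', '-n', path], capture_output=True)
    return path, proc.returncode == 0

def run_wild(paths):
    with ProcessPoolExecutor() as pool:
        return dict(pool.map(parse_one, paths))

if __name__ == '__main__':
    results = run_wild(['git/git-submodule.sh', 'alpine/abuild'])
    for path, ok in sorted(results.items()):
        print('%-30s %s' % (path, 'OK' if ok else 'FAIL'))
```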
I also expanded the corpus of testdata. The results showed that I should implement extended globs.
So I explored the extended glob syntax with spec test cases, and added them to the OSH parser. I'm happy that I can still make quick progress on the parser after a few months away from it. This indicates to me that its architecture is solid.
To support my testing and performance tools, I need more tools. It's tools all the way down!
Notice that the URL above has a .wwz path component. This is simply a .zip
file whose contents are served by a CGI script I wrote.
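The script isn't reproduced here, but the core idea fits in a few lines of Python. This is only a sketch: the document root, error handling, and content-type logic are simplified assumptions.

```python
#!/usr/bin/env python3
"""Hypothetical CGI sketch: serve files out of a .wwz (zip) archive.
A request for /wild.wwz/git/report.html reads member git/report.html
from the archive wild.wwz instead of from the file system."""
import os
import sys
import zipfile

DOC_ROOT = '/home/user/www'  # assumed document root

def main():
    path = os.environ.get('PATH_INFO', '')   # e.g. /wild.wwz/git/report.html
    archive, _, member = path.partition('.wwz/')
    archive = archive.lstrip('/') + '.wwz'
    with zipfile.ZipFile(os.path.join(DOC_ROOT, archive)) as z:
        data = z.read(member)
    # A real script would choose the content type based on the file extension.
    sys.stdout.write('Content-Type: text/html\r\n\r\n')
    sys.stdout.flush()
    sys.stdout.buffer.write(data)

if __name__ == '__main__':
    main()
```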
Why did I bother writing it, instead of just unzipping the files on
Dreamhost? Well, there are tens of thousands of
shell scripts and HTML files in the archive, and I want to publish the wild
tests on every release. All this file system metadata is a sysadmin
headache not just for my web host, but for me. For example, it makes
rsync take a long time.
The text files also compress by 10-20x.
I also have a side project in mind that will require linking to potentially millions of source files in projects like CPython, Clang, and various Unix shells.
The benchmarks/osh-parser.R script analyzes performance data. R is a language for statistics, data manipulation, and visualization.
I have a Wiki page called Oil and the R Language where I've been keeping notes on the relationship between R and shell. For example, they're both languages used interactively.
Unfortunately, this work is far in the future.
In 2009 I designed a template language called JSON Template. It was inspired by Google's goog-ctemplate, which was used on the home page of Google, and EZT, which I used on Google Code.
I'm embarrassed that I haven't maintained a website or documentation since Google Code was shut down. There's an ancient fork of it here.
The Python script that produces the wild test report above uses JSON Template.
I should write An Evaluation of JSON Template. The language has flaws, but after 8 years I still like using it for certain tasks.
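For readers who haven't seen it, the flavor is roughly this. I'm recalling the API from memory, so treat the exact names and data as illustrative rather than authoritative:

```python
import jsontemplate  # the 'jsontemplate' Python package

TEMPLATE = """\
<h2>{title}</h2>
<ul>
{.repeated section files}
  <li>{name}: {num_lines} lines</li>
{.end}
</ul>
"""

# Hypothetical data in the shape a report like the wild tests might use.
print(jsontemplate.expand(TEMPLATE, {
    'title': 'Wild Test Report',
    'files': [
        {'name': 'git/git-submodule.sh', 'num_lines': 1000},
        {'name': 'alpine/abuild', 'num_lines': 2000},
    ],
}))
```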
In addition to measuring performance, I'm thinking about how to improve performance!
To prepare for generating C code from the OSH lexer using re2c, I went back and studied Russ Cox's authoritative and exhaustive articles on regular expressions.
One thing that wasn't clear on skimming is that the virtual machine approach is the slower NFA approach. It has abstract "threads" to represent nondeterminism.
His RE2 engine — not related to re2c! — also implements the faster DFA approach. That is, RE2 has a multi-engine execution strategy.
On the other hand, re2c uses the DFA approach. It can do this because
it doesn't need to implement all the bells and whistles of Perl-style regexps.
It translates regular expressions into a series of switch and goto
statements in C. Each jump location is a DFA state.
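To make that concrete, here's a tiny hand-written DFA in Python for the regular expression [a-z]+[0-9]+. re2c emits the same kind of logic as C switch and goto statements, so this is only an illustration of the idea, not actual re2c output:

```python
# A hand-coded DFA for [a-z]+[0-9]+.  Each (state, character) pair
# determines exactly one next state, so matching is a single
# left-to-right pass with no backtracking.

def match(s):
    state = 0
    for ch in s:
        if state == 0:                        # need at least one letter
            state = 1 if 'a' <= ch <= 'z' else -1
        elif state == 1:                      # more letters, or the first digit
            if 'a' <= ch <= 'z':
                state = 1
            elif '0' <= ch <= '9':
                state = 2
            else:
                state = -1
        elif state == 2:                      # digits until the end
            state = 2 if '0' <= ch <= '9' else -1
        if state == -1:                       # dead state: no match possible
            return False
    return state == 2                         # state 2 is the only accepting state

print(match('abc123'))   # True
print(match('abc'))      # False
print(match('1abc'))     # False
```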
This definitely deserves a future blog post.
I have a clear path to making the lexer faster, but the parser is a more complex problem.
I believe I can make it 10 times faster by addressing the three problems described above (here docs, lexer, and ASDL), but it might still be too slow after that.
I may have underestimated the fact that the OSH parser fundamentally does more work than other shell parsers: code like (echo subshell), $(echo "command" "sub"), ${var:-default}, etc. is parsed up front, which means it finds errors earlier.

To recap the problem:
Yacc and ANTLR aren't appropriate for describing the OSH language.
(These tools are usually called parser generators, but I prefer the term metalanguages. They're languages for describing languages.)
The state of the art is to write a parser by hand in C++.
But I'm not excited about translating 5000 lines of Python to C++ by hand. I would rather generate code.
So I want to create a better metalanguage. After all, I'm not just writing a bash parser; I'm writing Oil, which is supposed to subsume shell, awk, make, and maybe more.
So there are many different threads going now. However, I want to pop the stack and focus on delivering something for the next release.
Right now, this looks like:
That will be enough for the 0.2 release in early November. The 0.3 release can then include more extensive optimizations.
I diagnosed the parser's performance problems with profiling tools, and enhanced its test suite.
I described some pieces of project infrastructure that I started using, and which I think will be important in the future.
And I also speculated on future plans for the lexer and parser.
If you'd like more details on any of these topics, please leave a comment.