OSH 0.2 - Parsing One Million Lines of Shell

2017-11-10

I've released version 0.2 of OSH, a bash-compatible shell:

https://www.oilshell.org/download/oil-0.2.0.tar.xz (gzip version)

To build and run it, follow the instructions in INSTALL.txt.

Please try it on your shell scripts and report bugs!

Caveat: it's still too big and too slow. However, this release takes the first step toward rectifying that. Read on for details.

Table of Contents

Contributors

Parser Correctness: An Automated Torture Test

The OSH Language is Nearly Defined

Parser Performance: Here Doc Optimization

Other Changes

Summary, and What's Next?

Appendix A: Project Metrics

Spec Tests

Line Counts

Contributors

This is also the first release where someone besides me has changed either the OSH runtime or the test suite! (Several others have sent build patches in the past, but I view these changes as a milestone.)

Looking at the changelog:

Isaac Carroll fixed OSH behavior when argv[0] is -osh, which is how login shells are invoked.
He also implemented -e and -h for the test / [ builtin.
Linus Heckemann contributed a failing spec test case involving an expression like ${x:-}.

Other contributions:

timetoplatypus on lobste.rs contributed an Arch Linux package for OSH. I started a Wiki page listing Oil deployments.

A note on contributing:

OSH is well-tested, and test cases are easy to add. In most cases, writing a good test case solves 50% of the problem. So, if you don't have time to dive into the code, I still appreciate test-only patches.

Parser Correctness: An Automated Torture Test

In the last status update, I mentioned that OSH should run the shell scripts that build Alpine Linux. However, I immediately hit two problems:

The parser is too slow.
It's a bad idea to optimize code before it's correct.

The good news is that this release largely fixes the second problem. I rewrote the wild test harness and fixed all unhandled exceptions uncovered by this "fuzzing". Here are the results:

/release/0.2.0/test/wild.wwz/

The page above shows OSH parsing over a million lines of shell scripts found in the wild. This includes every APKBUILD file in Alpine Linux, which is 5,106 shell scripts totalling 254,565 lines of code.

It also includes:

Tens of thousands of lines of shell scripts from cloud projects like Kubernetes and Docker.
Esoteric projects, including multiple Lisp interpreters written in bash.
The FreeBSD and Minix source trees
Various shell frameworks and libraries

The OSH Language is Nearly Defined

Given the volume and diversity of the scripts that OSH now handles, I would say that the OSH language is almost completely defined.

(The architecture of the parser is strict and principled, but the details must be determined empirically. That is, we parse many scripts and see what fails.)

The summary at the top of the wild test results says that there are 229 scripts that OSH can't parse. However, most of these aren't problems with OSH. Here is a sampling of the underlying reasons:

Actual shell syntax errors that OSH should flag.
Scripts with an .sh extension that aren't shell scripts (e.g. ld-version.sh in the Linux kernel is an Awk script.)
Shell scripts with invalid code after an exit statement.
ksh constructs in Illumos and FreeBSD. For example, $(( x > 3.5 )) is a syntax error because we don't support floating point numbers, but ksh does.

There are also constructs that I've intentionally defined out of OSH. I started documenting these in a nascent OSH manual.

For example, the problem I wrote about with $(( appears in perf-with-kcore.sh in the Linux kernel.

There are also some OSH bugs, like incorrect parsing of bash's regex literals, but this is a small number of cases. Subsequent releases will fix these problems.

Important: This is a good time to suggest other complex shell scripts that OSH should parse. I'll be more likely to include an odd/rare shell construct in the OSH language definition now, as opposed to a year from now.

Parser Performance: Here Doc Optimization

In the last status update, I also mentioned that I uncovered three performance bottlenecks by profiling OSH.

I addressed the first bottleneck in this release by optimizing the parsing of here documents. Surprisingly, I also found correctness problems, which led me to change the algorithm. In the next post, I'll issue a correction to How To Parse Here Documents.

OSH now has benchmarks that show this change. They will be published with every release.

Before optimization: /release/0.2.alpha1/benchmarks/osh-parser.wwz/
After optimization: /release/0.2.0/benchmarks/osh-parser.wwz/

While OSH is still more than a hundred times slower than other shells, it's improved significantly. Parsing a configure script went from 14.8 seconds to 8.5 seconds on a fast machine, and from 32.0 to 23.0 seconds on a slow machine.

My goal for OSH is to parse scripts as fast as other shells, while statically parsing the whole program and producing a more detailed representation of it.

Other Changes

Other notable changes from the changelog:

I addressed issue 47, which was causing problems for packagers. In summary, many distro build system strip binaries, but this breaks the OSH app bundle. Thanks to timetoplatypus for bringing this to my attention.
Miscellaneous features driven by user usage:
- $UID and $EUID
- test -r and -w
Parse the extended glob syntax. (Only parsed, not executed.)
Statically parse assignments more accurately. readonly x=1 is an assignment, while readonly -p is a command that prints the shell's state.
Usability: Add location information to a lot of error messages.
Extensive test suite enhancements and reorganization.

Summary, and What's Next?

In this post, I summarized contributions from three people. Then I wrote about parser correctness and performance. There is new infrastructure for wild tests and new parser benchmarks.

For the next release, I plan to keep my eye on the ball and continue optimizing the parser:

The lexer should use native code generated by re2c.
The lossless syntax tree nodes should generated from the ASDL schema, rather than using metaprogramming at runtime.

The lexer and the LST are like the "bookends" of the parser. Although the parser will still be written in thousands of lines of Python, I hope that optimizing its input and output will make it at least five times faster.

Leave a comment if you have feedback on the direction of the project!

Appendix A: Project Metrics

In this section, I compare this release vs. the previous one. I put significant work into the oilshell.org/release/$VERSION/ trees so that this analysis is possible.

Spec Tests

OSH 0.1: 836 total tests, 662 passing, 123 failing.
OSH 0.2: 863 total tests, 685 passing, 124 failing.

What improves the spec test metrics is implementing new features, like the getopts builtin. In this release, I did more work on the parser, so the numbers haven't moved as much.

Line Counts

OSH 0.1: 13,984 lines of code in the shell itself
OSH 0.2: 14,298 lines of code in the shell itself

Most of the work in this release went toward the supporting infrastructure, which I summarize here (OSH 0.2 only):

# Lines  Category
  2,380  Build Automation
  5,242  Test Automation
    970  Benchmarks
    974  Web (used by tests and benchmarks)
  -----  -----
  9,566  TOTAL

  7,116  Spec Tests
  4,253  Oil Unit Tests
    543  Other Unit Tests
  -----  -----
 11,912  TOTAL

The supporting code adds up to 21,478 lines of code, dwarfing OSH itself! I view this is as a good thing. It keeps the project on track, but adds no weight to what we ship.

This gives me confidence that OVM can be made smaller and lighter, which will make OSH smaller and lighter.