
Oil 0.8.pre4 - The Biggest Shell Programs in the World

2020-04-19

This is the latest version of Oil, a Unix shell that's our upgrade path from bash:

Oil version 0.8.pre4 - Source tarballs and documentation.

To build and run it, follow the instructions in INSTALL.txt. The wiki has tips on How To Test OSH.

Table of Contents
The Highlights
Comment on Version Numbering
Closed Issues
Selected Open Issues
Under the Hood: The Code Is in Good Shape After 4 Years
Dependency Inversion Leads to Pure Interpreters
Lexer Modes
Conclusion
Appendix A: More Bad Parts of Shell
Appendix B: Metrics for the 0.8.pre4 Release
Test Results
Benchmarks
Native Code Metrics

The Highlights

This release has four user-facing themes:

  1. Fixes to Oil so that it can run neofetch, a bash program that's more than 10,000 lines long. Thanks to Crestwave for the tough job of debugging neofetch under Oil, and for patching it upstream!
  2. Fixes and features toward running ble.sh, another one of the biggest shell programs in the world. Thanks to Koichi Murase for the phenomenal testing and bug reports!
  3. Optimizations that reduce the number of processes started. The last post with comics gives background knowledge on this, and the next blog post will explain further.
  4. QSN: Quoted String Notation. A new interchange format that formalizes string literals like 'foo \x00 bar\n'.
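To make the string literal in item 4 concrete, here is a small bash sketch. It leans on bash's own printf escapes, which happen to cover \x00 and \n; it is not a QSN encoder or decoder, and the names encoded and decoded_size are made up for illustration.

```shell
#!/usr/bin/env bash
# The literal from the text, stored as plain printable characters on one line.
encoded='foo \x00 bar\n'

# Number of bytes the literal denotes once the escapes are interpreted.
# Assumption: bash's %b escape handling stands in for a real QSN decoder.
decoded_size() {
  printf '%b' "$encoded" | wc -c | tr -d ' '
}

echo "encoded characters: ${#encoded}"
echo "decoded bytes: $(decoded_size)"
```

The encoded form is printable text that fits on a single line, while the bytes it denotes include a NUL and a newline.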

Comment on Version Numbering

Despite the pre4 version qualifier, this is by far the best Oil release ever. I use Oil interactively while doing the release, running thousands of lines of its own shell scripts in the process.

I may change the version numbering scheme in the near future to reflect this. Note that this release includes a new $OIL_VERSION variable (issue #683 below).

Closed Issues

Here are some issues addressed in this release. It's an underestimate because I also fixed many bugs under issue 653 to run ble.sh.

You can also view the full changelog.

#712 ternary operator ? should be right associative
#706 unset should unshadow variables higher on the stack (at least for nonlocals)
#705 read fails on empty lines
#702 Can't escape closing brace with backslash or single quotes in parameter expansion
#700 xtrace output doubles backslashes and single-quotes
#698 Error with backslashes in unquoted variables with globbing off
#695 "${#:+\e}" should evaluate to \e, not e
#694 read only reads a single line even with a different delimiter
#690 ${var@a} to get flags, etc.
#688 ${@:0:1} evaluates to ${@:0}
#683 provide a way to query the version
#679 Run neofetch
#660 ${arr[0]=1} change variable to string rather than assigning cell
#651 cell sublanguage: unset -v 'a[0]' (ble.sh)
#648 Recursive arithmetic evaluation (ble.sh)
#640 arith assignment where var name is dynamic doesn't work
#291 single quotes within double quoted brace sub treated differently for the # ## % %% / operators
#273 Implement $(< file)
#254 Test the number of processes started by various shell snippets
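
A few of these issues are easy to illustrate with the bash behavior they refer to. The sketch below assumes bash and is based only on the issue titles (#712, #694, #273). The function names are hypothetical, and these are not Oil's own test cases.

```shell
#!/usr/bin/env bash
# #712: ?: groups to the right, so this parses as 1 ? 2 : (0 ? 3 : 4).
# A left-associative parse, (1 ? 2 : 0) ? 3 : 4, would give 3 instead.
ternary_demo() {
  echo $(( 1 ? 2 : 0 ? 3 : 4 ))
}

# #694: with -d, read stops at the given delimiter rather than at a newline,
# so the value can span lines. The newline is shown as '+' for readability.
read_delim_demo() {
  printf 'a\nb:c' | {
    IFS= read -r -d : record
    printf '%s' "$record" | tr '\n' '+'
  }
}

# #273: $(< file) expands to the contents of the file.
read_file_demo() {
  local tmp
  tmp=$(mktemp)
  echo 'hello' > "$tmp"
  echo "$(< "$tmp")"
  rm -f "$tmp"
}

ternary_demo
read_delim_demo; echo
read_file_demo
```

Snippets like these are a quick way to compare a shell against bash when trying Oil on an existing script.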

Selected Open Issues

I'm still looking for more help with Oil. Related links:

Under the Hood: The Code Is in Good Shape After 4 Years

I'm still working on translating Oil to C++, which I mentioned in the March recap. One nice side effect is that it forces me to revisit and clean up the code.

For example, the optimizations to start fewer processes were a result of "pulling on a thread": a pesky fork_external parameter that I wanted to get rid of.
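
To make "processes started" concrete, here is a bash-only sketch of constructs that run in a forked child, as revealed by $BASHPID. It only illustrates what is being counted. It is not how Oil measures process counts (issue #254), and the helper name in_main_process is hypothetical.

```shell
#!/usr/bin/env bash
# Remember the process ID of the shell that runs this script.
main_pid=$BASHPID

# Succeeds only when called from that same process.
in_main_process() {
  [ "$BASHPID" = "$main_pid" ]
}

in_main_process && echo 'function call: same process'
( in_main_process ) || echo 'subshell: new process'
echo "$(in_main_process || echo 'command substitution: new process')"
```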

Dependency Inversion Leads to Pure Interpreters

Translation also encourages refactoring to dependency inversion, especially of I/O interfaces. This is because I/O is harder to translate than pure computation.

(I mentioned "dependency injection" in both the March Recap and the February Recap, but I now call it inversion. This is to avoid confusion with "DI frameworks", which aren't related to Oil.)

This refactoring will make "pure" subinterpreters possible, which relates to ble.sh (mentioned above), as well as to evaluating untrusted config files (more on this later).
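
As a rough illustration of the idea (Oil itself is written in Python, so this is not Oil's code), here is a bash sketch in which the core logic receives its I/O as a parameter instead of performing it directly. All function names are hypothetical.

```shell
#!/usr/bin/env bash
# Core logic: count the settings in a config, skipping comments and blank lines.
# It never opens a file itself; the caller passes in the name of a function
# that produces the text (the inverted dependency).
count_settings() {
  local read_file=$1 path=$2
  "$read_file" "$path" | grep -c -v -e '^#' -e '^$'
}

# Real I/O: reads from the file system.
real_read() { cat "$1"; }

# In-memory stand-in: no file system access, handy for tests.
fake_read() { printf '# comment\nname=oil\n\nversion=0.8\n'; }

count_settings fake_read ignored-path
```

Because count_settings only sees what the caller hands it, the same logic can run against real files or against a restricted stand-in, which is the property that "pure" evaluation relies on.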

Here are two comments I wrote about dependency inversion. They might help contributors understand Oil's code.

Note, though, that all contributors have implicitly followed the style, so there's nothing unusual about it. Pull requests don't need to follow the style at first, as long as they have tests to ensure that later refactoring doesn't break anything.

Lexer Modes

I like the small size of these diffs, because it's evidence that the lexer mode technique is expressive enough to make subtle fixes to the OSH language:

(Related: How To Parse Shell Like a Programming Language summarizes our strict but powerful parsing model.)
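
For a sense of why context-sensitive lexing matters, consider issue #291 from the list of closed issues. In bash, single quotes inside a double-quoted ${...} behave differently depending on the operator. The sketch below assumes bash and uses hypothetical function names. It shows the shell behavior only, not Oil's lexer code.

```shell
#!/usr/bin/env bash
# With a pattern operator like #, the single quotes quote the pattern,
# so the prefix a is removed.
strip_prefix() {
  local x=abc
  echo "${x#'a'}"
}

# With a default-value operator like :-, the single quotes are literal
# characters and appear in the result.
default_value() {
  local empty=''
  echo "${empty:-'a'}"
}

strip_prefix
default_value
```

The same two characters are either quoting syntax or ordinary text depending on where the lexer is, which is the kind of distinction a lexer mode can express.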

Conclusion

I'm encouraged by our ability to make quick fixes to run the biggest shell programs in the world!

Please try Oil on your shell scripts and let us know what doesn't work. And let us know if you have questions about how to get started with the code.

The next post is: Oil Starts Fewer Processes Than Other Shells.

Appendix A: More Bad Parts of Shell

I stopped keeping track of #shell-the-bad-parts a while ago, but this release brought to mind several more.

Appendix B: Metrics for the 0.8.pre4 Release

Let's compare this release with the previous one, version 0.8.pre3.

Test Results

Running big shell scripts led to a big increase in the number of OSH spec tests:

Not much work was done on the Oil language. I added failing tests to expose a few issues:

We have ~600 new lines of significant code, e.g. due to QSN, which I still need to write about.

And ~1200 new lines of physical code:

Benchmarks

These benchmarks didn't change, which is good. (They're noisy, which I'd like to eventually fix.)

Native Code Metrics

Let's concentrate on the in-progress oil-native translation, rather than the soon-to-be-obsolete OVM.

This release mainly refactored code, so the number of translated lines hasn't increased that much: