blog | oilshell.org

A Plan for Oil 0.8 and 0.9

2020-08-12

I published the Summer Blog Roadmap a few weeks ago, and it repeated what I wrote in January: I want to cut features to get the project "done".

Done really means "self-sustaining". In the best case, Oil won't ever be done, but it will be compelling enough that I'll have help from users.

This post reviews our goals and recent progress. Then I make comments on project milestones and technical risks. It may fall a bit short of "a plan", as you'll see.

Table of Contents
Recap of Our Goals
I'm Excited About the Oil Language
Recent Progress
Oil 0.8.pre8 - oil-native Passes More Spec Tests
Oil 0.8.pre9 - New Benchmarks Are Looking Good
Version 0.8.0 - A Pure Interpreter
Version 0.9.0 - Shell I/O
Technical Issue: Memory Management
Signal Handling
Summary

Recap of Our Goals

There are 5 major parts of the project:

  1. The compatible OSH language. This took years of effort, but there's no "risk" left. Oil is the most bash-compatible shell, by a mile. It has new features, and helps you with many of shell's sharp edges (wiki).
  2. The new Oil language. I just drafted a new doc on Oil language idioms, and it's given me new energy for the project. The language is looking good, although it's probably closer to 40% of the way to a solid 1.0 than 80%.
  3. The interactive shell, and the ability to set OSH as the system (root) shell.
  4. Improving performance. This is its own subproject because Oil is implemented in an unusual style, with DSLs and metaprogramming.
  5. The documentation for everything above. This is a big effort, and it deserves its own category.

So the goal for 2020 was to "finish" OSH and performance (1 and 4), and polish a minimal portion of Oil (2).

We would cut the interactive shell, and large parts of the Oil language. Then we'd have a solid and improved shell for automation and programming, but a conservative one. After all, Oil is your upgrade path from bash (which I've emphasized on the home page).

Then we would work on docs in 2021, hopefully with some help.

I'm Excited About the Oil Language

But after going on vacation, and writing the idioms doc, I feel like working on the Oil language now. I find this list of benefits compelling, coherent, and within reach (and everything is still up for discussion):

So Oil is taking a nice shape. However, polishing it will push the concrete milestone further out, in favor of something more open-ended. I don't like that. But on the other hand, the new benchmarks I discuss below are looking good, which makes me feel better. And the Oil project is supposed to be fun.

I don't know exactly what will happen, but this post and the next one will help me figure it out. Describing the technical risks might change my mind.

Recent Progress

Oil 0.8.pre8 - oil-native Passes More Spec Tests

I worked on C++ translation, as well as features to run the brainfuck interpreter, like mapfile -t and printf -v a[i].
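For readers who haven't used those two builtins, here is a minimal bash sketch of what they do. The variable names and input are made up for illustration and are not taken from the brainfuck interpreter.

```shell
# Assumes bash 4.1 or later (mapfile arrived in 4.0; printf -v into an
# array element needs 4.1).

# mapfile -t reads lines from stdin into an indexed array and strips the
# trailing newline from each line.
mapfile -t lines < <(printf '%s\n' 'alpha' 'beta' 'gamma')
echo "read ${#lines[@]} lines; second is ${lines[1]}"

# printf -v writes the formatted result into a variable instead of stdout.
# The target can be an array element; the subscript i is evaluated
# arithmetically, so this assigns a[2].
declare -a a
i=2
printf -v 'a[i]' '%03d' 7
echo "a[2] = ${a[2]}"
```

The target `'a[i]'` is quoted so the shell doesn't try to glob-expand the brackets.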

Writing new features in statically-typed Python and automatically translating to C++ is working well: the code is both short and fast. Memory management is still an open problem, though, which I discuss below.

The progress on translation can be seen in the spec-cpp results:

Our goal is to pass around 1641 cases, as the Python version of Oil does. This is a good trajectory, but progress isn't linear, and there's significant work to do after all test cases pass.

This commit gave a hint about performance. I was surprised that with the rough translation, and no optimization, Oil runs the brainfuck programs as fast as or faster than bash does. For comparison, Oil's parser needed to be optimized in the "C++ domain" to surpass the speed of the bash parser.

Oil 0.8.pre9 - New Benchmarks Are Looking Good

I was curious about performance, so I wrote benchmarks, not only comparing the same program across shells, but with Python equivalents. They're published under /release/$VERSION/benchmarks.wwz/compute/, and I'll continue to improve them. I tested out loops, conditionals, integers, strings, arrays, hash tables, shell code in the wild, and more.

The summary is that Oil is already faster than bash, with the big caveat that it doesn't reclaim any memory. The benchmarks clearly show this! We're running long loops that allocate at every iteration, so Oil easily uses 10x more memory than bash. It's easy to make it crash with more iterations of the loop.

Deallocating memory is future work, and it will necessarily slow the interpreter down.
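To make "allocate at every iteration" concrete, a loop of roughly this shape creates fresh values on every pass. It's a hypothetical illustration with made-up names, not one of the published benchmarks.

```shell
# Hypothetical loop: each pass builds new string values. An interpreter
# that never reclaims memory keeps all of the earlier values around, so
# memory use grows with the iteration count n.
n=1000
i=0
s=''
while [ "$i" -lt "$n" ]; do
  s="item-$i"      # a new string is created on each pass
  i=$((i + 1))     # the updated counter is stored as a new value as well
done
echo "last value: $s"
```

Raising `n` by a few orders of magnitude is the kind of change that would expose the missing deallocation.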

Even with that caveat, I still think this result is a bit surprising. Semantically, Oil is still completely unoptimized Python, though it's automatically translated line-for-line to C++. For example, there are no objects on the stack, because Python can't express that!

I think this could be because Oil's more static nature means it avoids parsing at runtime. I speculated on this in Problems With Multi-Stage Parsing (the 4th post on the blog, nearly 4 years ago!).

Version 0.8.0 - A Pure Interpreter

Now let's talk about upcoming releases.

Oil 0.8.0 is almost there. The spec-cpp results above show that the translation is working, and we're running real programs as well as benchmarks.

Here are a few things I want to do for this milestone, which may or may not happen:

Version 0.9.0 - Shell I/O

In addition to starting simple commands, Oil 0.9.0 should run pipelines, subshells, async processes with &, and do redirects.

This should let it run:

Note: I/O needs error handling with exceptions, which needs testing. APIs will be refactored.
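For reference, the four constructs named for this milestone (pipelines, subshells, async processes with &, and redirects) look like this in a small script. It's a generic sketch with made-up variable names, not a test case from the Oil repository.

```shell
tmp=$(mktemp)

# Pipeline: stdout of each process feeds stdin of the next.
first=$(printf '%s\n' b a c | sort | head -n 1)

# Subshell: the cd happens in a child process, so the parent's working
# directory is unchanged afterwards.
before=$PWD
(cd / && pwd > /dev/null)
after=$PWD

# Async process with &: start it in the background, then wait for it and
# collect its exit status.
sleep 1 &
bg_pid=$!
wait "$bg_pid"
bg_status=$?

# Redirects: send stdout to a file, then read the file back on stdin.
echo 'hello' > "$tmp"
read -r line < "$tmp"

echo "first=$first bg_status=$bg_status line=$line"
rm -f "$tmp"
```

Each of these needs the interpreter to fork, wire up file descriptors, or wait on children, which is why they fall outside a "pure" interpreter.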

Technical Issue: Memory Management

This is the biggest open problem with oil-native. I had hoped to solve it before the end of the year, for Oil 0.9.0. But I haven't started, so that seems unlikely.

The current thinking is that I'll implement a dirt simple garbage collector, which can always be replaced later. I did way too much research on memory management, and I've come to realize I shouldn't spend any of our novelty budget here.

Notes:

Discarded ideas:

Signal Handling

This is another technical issue that falls outside the "pure" interpreter. I think it's OK to do after 0.9.0, since we're cutting the interactive shell, and many shell programs don't use trap.
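As a reminder of what trap does, here is a small bash sketch in which a child shell installs a handler and then signals itself. The handler name and messages are made up, and this shows bash's behavior, not anything oil-native does yet.

```shell
# Run a child shell so the signal never reaches the calling script.
out=$(bash -c '
  on_term() {
    echo "caught TERM, cleaning up"
    exit 0
  }
  trap on_term TERM
  kill -TERM "$$"      # $$ is the PID of this child shell
  echo "not reached"   # the handler runs and exits before this line
')
echo "$out"
```

Supporting this means the interpreter has to notice pending signals between commands and run shell code in response.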

Summary

So 0.9.0 will be compelling, but it isn't "done" or self-sustaining. We will likely need 0.10.0 and 0.11.0. The next post will speculate on those future releases, again describing milestones and risks.