Why Sponsor Oils? | blog | oilshell.org
I published the Summer Blog Roadmap a few weeks ago, and it repeated what I wrote in January: I want to cut features to get the project "done".
Done really means "self-sustaining". In the best case, Oil won't ever be done, but it will be compelling enough that I'll have help from users.
This post reviews our goals and recent progress. Then I make comments on project milestones and technical risks. It may fall a bit short of "a plan", as you'll see.
There are 5 major parts of the project:
So the goal for 2020 was to "finish" OSH and performance (1 and 4), and polish a minimal portion of Oil (2).
We would cut the interactive shell, and large parts of the Oil language. Then we'd have a solid and improved shell for automation and programming, but a conservative one. After all, Oil is your upgrade path from bash (which I've emphasized on the home page).
Then we would work on docs in 2021, hopefully with some help.
But after going on vacation, and writing the idioms doc, I feel like working on the Oil language now. I find this list of benefits compelling, coherent, and within reach (and everything is still up for discussion):
read and write builtins. Mostly designed, but not implemented.
errexit. Mostly designed, but not implemented.
So Oil is taking a nice shape. However, polishing it will push the concrete milestone further out, in favor of something more open-ended. I don't like that. But on the other hand, the new benchmarks I discuss below are looking good, which makes me feel better. And the Oil project is supposed to be fun.
I don't know exactly what will happen, but this post and the next one will help me figure it out. Describing the technical risks might change my mind.
I worked on C++ translation, as well as features to run the brainfuck interpreter, like mapfile -t and printf -v a[i].
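For readers who haven't used these two bash features, here is a minimal sketch of what they do. The input data and variable names are made up for illustration and are not taken from the brainfuck interpreter. It requires bash, not plain sh.

```shell
# Requires bash. mapfile -t reads lines from stdin into an array,
# dropping the trailing newline of each line.
mapfile -t lines < <(printf '%s\n' alpha beta gamma)

# printf -v assigns the formatted result to a variable instead of
# printing it. The target may be an array element, as in printf -v a[i];
# the subscript i is evaluated as an arithmetic expression.
a=()
i=2
printf -v 'a[i]' '%s-%d' "${lines[1]}" 42

echo "${#lines[@]}"   # number of lines read
echo "${a[2]}"        # element written by printf -v
```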
Writing new features in statically-typed Python and automatically translating to C++ is working well: the code is both short and fast. Memory management is still an open problem, though, which I discuss below.
The progress on translation can be seen in the spec-cpp results:
Our goal is to pass around 1641 cases, as the Python version of Oil does. This is a good trajectory, but progress isn't linear, and there's significant work to do after all test cases pass.
This commit gave a hint about performance. I was surprised that with the rough translation, and no optimization, Oil runs the brainfuck programs as fast as or faster than bash does. For comparison, Oil's parser needed to be optimized in the "C++ domain" to surpass the speed of the bash parser.
I was curious about performance, so I wrote benchmarks, not only comparing the same program across shells, but with Python equivalents. They're published under /release/$VERSION/benchmarks.wwz/compute/, and I'll continue to improve them. I tested out loops, conditionals, integers, strings, arrays, hash tables, shell code in the wild, and more.
The summary is that Oil is already faster than bash, with the big caveat that it doesn't reclaim any memory. The benchmarks clearly show this! We're running long loops that allocate at every iteration, so Oil easily uses 10x more memory than bash. It's easy to make it crash with more iterations of the loop.
Deallocating memory is future work, and it will necessarily slow the interpreter down.
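To make the memory caveat concrete, here is a sketch of the kind of loop that stresses an interpreter that never frees memory. It is not one of the published benchmarks; the function name and iteration count are invented for illustration.

```shell
# Hypothetical workload: every iteration builds a fresh string, so an
# interpreter that never reclaims memory grows with the iteration count.
count_lengths() {
  n=$1
  i=0
  total=0
  while [ "$i" -lt "$n" ]; do
    s="item-$i"                  # a new string value on each iteration
    total=$((total + ${#s}))     # use it so the work is not dead code
    i=$((i + 1))
  done
  echo "$total"
}

count_lengths 1000
```

A shell that reclaims each string can run this in roughly constant memory. Without reclamation, raising the argument far enough is what would eventually exhaust memory.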
Even with that caveat, I still think this result is a bit surprising. Semantically, Oil is still completely unoptimized Python, though it's automatically translated line-for-line to C++. For example, there are no objects on the stack, because Python can't express that!
I think this could be because Oil's more static nature means it avoids parsing at runtime. I speculated on this in Problems With Multi-Stage Parsing (the 4th post on the blog, nearly 4 years ago!).
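As one illustration of what parsing at runtime means in bash (an example chosen here, not necessarily the cause of the benchmark numbers): a variable used inside arithmetic holds a string, and bash parses that string as an expression when the arithmetic is evaluated. The variable names are hypothetical.

```shell
# Requires bash. The variable holds text, not a number.
expr_text='1 + 2'

# Bare name: bash parses the value of expr_text as an expression at
# evaluation time, so this behaves like (1 + 2) * 3.
by_name=$(( expr_text * 3 ))

# $-expansion: the text is substituted first and the result is parsed,
# so this behaves like 1 + 2 * 3.
by_text=$(( $expr_text * 3 ))

echo "$by_name $by_text"
```

The two results differ, which shows that the string is being parsed when the expression runs rather than when the script is read.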
Now let's talk about upcoming releases.
Oil 0.8.0 is almost there. The spec-cpp results above show that the translation is working, and we're running real programs as well as benchmarks.
Here are a few things I want to do for this milestone, which may or may not happen:
source builtin. (Trivia: source is impure because it has to use process.FdState to open files, which avoids conflict with user descriptors like echo 1>&3.)
ls. We pass 863 tests without starting any processes, and this would bring the total to well over 1000 out of ~1650.
In addition to starting simple commands, Oil 0.9.0 should run pipelines, subshells, async processes with &, and do redirects.
This should let it run:
configure scripts from Python, OCaml, and TCC; and Alpine's abuild.
neofetch, one of the biggest shell programs in the world.
Note: I/O needs error handling with exceptions, which needs testing. APIs will be refactored.
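For reference, the four constructs named for Oil 0.9.0 (pipelines, subshells, async processes with &, and redirects) look like this in a script. This is a small sketch with made-up data and variable names, not a test case from the Oil repository.

```shell
# Each construct below needs the shell to start or coordinate processes.
tmp=$(mktemp)

# Pipeline: several processes connected by pipes.
upper=$(printf 'a\nb\n' | tr a-z A-Z | tr -d '\n')

# Subshell: the directory change does not leak into the parent shell.
before=$PWD
(cd / && pwd > /dev/null)
after=$PWD

# Async process with &, then wait for it and collect its exit status.
status=0
(exit 3) &
wait "$!" || status=$?

# Redirect: send stdout to a file, then read it back from that file.
echo hello > "$tmp"
read -r line < "$tmp"
rm -f "$tmp"

echo "$upper $status $line"
```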
Memory management is the biggest open problem with oil-native. I had hoped to solve it before the end of the year, for Oil 0.9.0. But I haven't started, so that seems unlikely.
The current thinking is that I'll implement a dirt simple garbage collector, which can always be replaced later. I did way too much research on memory management, and I've come to realize I shouldn't spend any of our novelty budget here.
Notes:
Discarded ideas:
Supporting trap is another technical issue that falls outside the "pure" interpreter. I think it's OK to do after 0.9.0, since we're cutting the interactive shell, and many shell programs don't use it.
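For the programs that do use trap, the most common pattern is probably cleanup on exit. The sketch below is illustrative only; the file handling and messages are invented. The same builtin also registers handlers for signals such as INT and TERM.

```shell
# Run the work in a command substitution so the EXIT trap fires when
# that subshell finishes, and capture everything it prints.
out=$(
  tmp=$(mktemp)
  # The handler runs when this subshell exits, however it exits.
  trap 'rm -f "$tmp"; echo "removed"' EXIT
  echo "working" > "$tmp"
  cat "$tmp"
)

echo "$out"
```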
So 0.9.0 will be compelling, but it isn't "done" or self-sustaining. We will likely need 0.10.0 and 0.11.0. The next post will speculate on those future releases, again describing milestones and risks.