Why Sponsor Oil? | blog | oilshell.org
This is the latest version of Oil, a Unix shell that's our upgrade path from bash:
Oil version 0.14.2 - Source tarballs and documentation.
We're moving toward the fast C++ shell (formerly oil-native), so there are two tarballs:
The C++ version doesn't exactly match Python, but it's getting close. We're also starting to use the "Oils for Unix" name, which I'll explain.
The wiki has tips on How To Test OSH. If you're new to the project, see Why Create a New Shell? and posts tagged #FAQ.
Readers have been asking about Oil, so let's start with the important info.
The compatible OSH shell is making great progress. It will be done, one way or another!
Reaching the garbage collector milestone opened up many parts of the project.
Last May, Oil Is Being Implemented "Middle Out" said that we have 1487 out of 1774 spec tests passing in C++.
As of this release, we have 1801 out of 1817 passing (C++ results). Most of the recent increase is due to Melvin Walls' great work translating the interactive shell to C++.
We can still use more help on OSH, and on the entire project.
The codebase and tools are improving and stabilizing, so please check out our Contributing page and list of issues.
Melvin had to suffer through a few "smells", but many of them are now fixed. His work all over the repo gives me confidence that more people can contribute.
The Oil language is on the table, but there's a lot left to do.
Unlike OSH, it's still tied to the Python interpreter. It runs, and you should try it, but I consider it a prototype.
But I'm excited about recent design breakthroughs we made on Zulip: on Python-like functions (hat tip to Kel), and on languages for data (QSN, tables, and records).
If you appreciate this work, please sponsor us:
We're using the donations to "on board" new contributors, before they're added to our NLnet grant.
This release has two highlights: the interactive shell in C++, and OSH changes for autoconf.
But let's review the project first, since I've only written 2 posts in the last 6 months. This is mainly because I've been working with contributors under the grant. I'm talking to them, rather than "talking" on the blog!
So despite few release announcements, there have been steady releases this whole time, with hundreds of changes.
It's hard to remember everything that happened. The short story is that we've been working to fulfill the promise of the OSH part of the project, described in 2020's Four Features That Justify a New Unix Shell. To recap, those are:
In 2021, I explained in several posts how the scope has always been a problem, and it's been changing. There are 7 parts to the project, each large:
My comment on How do Nix builds work? (jvns.ca via lobste.rs)
44 points, 6 comments on 2023-03-03
Let's move on to release highlights. The thing that most users will care about is that the interactive shell is working in C++! I'm using it on my machine now, running:
This is due almost entirely to Melvin, which is good news for people who have been wondering about Oil!
My comment on Ask HN: Are alternative (oil, nu, etc.) shells usable as daily drivers? (self)
121 points, 141 comments - 27 days ago
In addition to crediting his great work in that reply, I clear up a couple of misconceptions. One is that OSH is in fact a POSIX- and bash-compatible shell. The commenter was confused about OSH vs. Oil, which isn't uncommon.
So I plan to slightly rename the "Oil shell" project to "Oils for Unix", and the Oil language to YSH. OSH remains the same. I'll officially announce this in the next post, and elaborate on the motivation.
For more background on the interactive shell, see the FAQ, in particular:
It would have been a shame to drop this part of the project, so I'm very glad that Melvin revived it. A great thing about shell is that the user interface and the language are intertwined, and support each other!
(Related: Unix Shell: Philosophy, Design, and FAQs).
To make this more concrete, see the informative README in the
rtx: Runtime Executor (asdf rust clone) (github.com via lobste.rs)
42 points, 44 comments on 2023-02-25
In particular, it links to a good article on ASDF performance.
What I take away is that shells are powerful and universally-used interfaces for managing project dependencies, and the shell language itself should support this. Right now, these tools are slow, and have composition problems due to ordering, and can step on each other. They rely on bash hacks like mutating $PROMPT_COMMAND and messing with your startup files.
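To illustrate the ordering problem, here's a minimal sketch of how a tool might install a prompt hook. The `add_hook` helper and the hook names are hypothetical, not taken from asdf or rtx. The point is only that every tool edits the same shared string, so the result depends on the order in which your startup files run.

```shell
# Hypothetical helper: prepend a hook to PROMPT_COMMAND unless it's already there.
add_hook() {
  case ";$PROMPT_COMMAND;" in
    *";$1;"*) ;;   # already installed, do nothing
    *) PROMPT_COMMAND="$1${PROMPT_COMMAND:+;$PROMPT_COMMAND}" ;;
  esac
}

PROMPT_COMMAND=''
add_hook tool_a_hook   # hypothetical hook from one tool
add_hook tool_b_hook   # a second tool initializes later, so its hook runs first
add_hook tool_a_hook   # repeated initialization is a no-op
echo "$PROMPT_COMMAND"
```

This prints `tool_b_hook;tool_a_hook`. Swap the first two `add_hook` calls and the hooks run in the opposite order, which is the kind of fragility a shell could handle with a real hook mechanism.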
Just like Nix, asdf and rtx are pushing the boundaries of what our current shells are capable of.
If you have any concrete suggestions for OSH — or, even better, want to work on them — please get in touch.
The next release highlight is hard to explain, so let's take a break and credit more contributors. There have been hundreds of changes in the last few months, and it's easier for me to remember specific people than all the changes.
sig_handler_t compile error on OS X, which we could still use help with. Others hit it on OpenBSD.
./configure, which I unfortunately dropped on the floor for awhile.
mylib::BufWriter, part of the GC runtime.
More people who tried Oil and reported bugs:
lukaswrz) reported a parsing bug with Oil expressions within command subs in issue 1387, now fixed.
urandom2) reported incorrect shell arithmetic parsing in issue 1446, now fixed.
alganet) reported that multi-level continue weren't implemented in issue 1459. This is an obscure feature, but it wasn't too hard to add!
kseistrup) reported that $_ was missing in issue 1504, now implemented.
The $_ variable contains the last word of the last command. I had never used it before working on Oil, but it's very handy with Ninja:
$ ninja _bin/cxx-dbg/osh && $_ -c 'echo hi'
ninja: no work to do.
hi
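If you don't have Ninja handy, here's a small sketch of the same idea. It assumes bash is on your PATH, since $_ isn't part of POSIX, and the `last_arg_demo` name is made up for illustration.

```shell
# Assumption: bash is installed. $_ is a bash/ksh/zsh feature, not POSIX.
last_arg_demo() {
  # ':' is a no-op builtin; its last argument becomes $_ for the next command.
  bash -c ': build _bin/cxx-dbg/osh && echo "$_"'
}

last_arg_demo
```

This prints `_bin/cxx-dbg/osh`, the last word of the previous command, which is why `ninja TARGET && $_ ...` builds a binary and then runs it.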
List<T> are mutable, while Str instances are immutable.
Some notes on performance: We're still allocating too much, which is a well-known peril of writing software like mathematics! I've fixed some low-hanging fruit, and my experience confirms that the two container optimizations will be important.
I also spent a lot of time measuring the parser and interpreter with uftrace. Surprisingly, lists/vectors are more common than strings.
The shell arithmetic issue below also reminded me that Koichi Murase, author of ble.sh, originally implemented much of shopt --set eval_unsafe_arith! We're still using that code, but we've relaxed it slightly. Thank you!
I probably omitted some contributions, so please feel free to ping me with yours, and I'll update this section. And let me know if you'd like to be credited in a different way.
The other highlight in this release is that shell arithmetic is more compatible with POSIX, due to autoconf's usage.
Thanks to Zack Weinberg for testing autoconf with OSH. Also see his great article:
This arithmetic issue goes back to 2019, and is hard to explain. Bear with me, or feel free to skip to the next section.
But, as of this release, we allow dynamic parsing in arithmetic. For example:
$ x='1 + 2'      # var that looks like math
$ echo $(( x ))  # shells parse and evaluate strings as code
3                # there's no explicit 'eval'!
POSIX requires this in theory, and autoconf requires it in practice.
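Here's a sketch you can paste to see the dynamic parsing for yourself. It assumes bash is on your PATH, and `eval_in_arith` is a hypothetical helper name. It runs a child bash so the result doesn't depend on your login shell.

```shell
# Assumption: bash is installed. Swap in another shell by hand to compare.
eval_in_arith() {
  # Pass $1 to a child bash as the variable x, then expand $(( x )).
  x="$1" bash -c 'echo "$(( x ))"'
}

eval_in_arith '1 + 2'         # the string is parsed as an expression
eval_in_arith '2 * (3 + 4)'   # parentheses and precedence work too
```

This prints 3 and then 14. The variable holds a string, but the arithmetic context parses and evaluates it as code.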
I resisted this type of behavior for a long time — not just for usability, but also because OSH ended up being more secure than other shells due to its parsing philosophy.
eval / Arbitrary Shell Execution
In particular, in 2019, I rediscovered a vulnerability in shells that have arrays. To be concrete, bash and zsh have arrays, but dash doesn't.
Even dash will evaluate your data as code, as in the example above. However, as long as it's confined to arithmetic, this is merely confusing, not dangerous. (Imagine if print('1 + 2') in Python showed 3, rather than the string 1 + 2.)
In contrast, if you use, say, bash, an attacker who controls x can execute arbitrary shell commands on your machine:
$ a=(1 2 3)                         # shell array
$ x='a[$(echo 42 | tee PWNED)]=5'   # variable with code in it
                                    # looks like an array index
                                    # with a command sub
$ echo $(( x ))                     # arbitrary shell execution in bash, zsh, mksh!
                                    # not dash
$ cat PWNED                         # 'echo 42' can also be 'rm -rf /' !
42
Details at https://github.com/oilshell/blog-code/tree/master/crazy-old-bug. Stephane Chazelas, who discovered ShellShock, and the Fedora security team also warn about this issue.
So OSH disallowed all dynamic parsing unless shopt --set eval_unsafe_arith. But that caused problems for autoconf. I believe ./configure scripts would fall back to the external expr command with "stock" OSH.
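That kind of fallback looks roughly like the sketch below. This is a hypothetical illustration loosely modeled on what configure-style scripts do, not autoconf's actual code, and the `arith` name is made up.

```shell
# Hypothetical sketch: probe for working $(( )) and fall back to expr otherwise.
if (eval 'test "$(( 1 + 1 ))" = 2') 2>/dev/null; then
  # Define via eval so a shell without $(( )) never has to parse this line.
  eval 'arith() { echo "$(( $* ))"; }'
else
  # Slower path: fork the external expr command for every calculation.
  arith() { expr "$@"; }
fi

arith 6 '*' 7
arith 1 + 2
```

Both branches print 42 and then 3. The difference is that the fallback starts a new process each time, so a shell that fails the probe still works, just more slowly.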
We've now relaxed that option so autoconf can run. But it still disallows arbitrary code execution:
osh$ echo $(( x ))
  a[$(echo 42 | tee PWNED)]=5
    ^~
[ var ? at line 7 of [ interactive ] ]:1: fatal: Command subs not allowed here because eval_unsafe_arith is off
Does that mean we're compromising on the design of the Oil language? No, I also added shopt --unset parse_sh_arith, which disallows shell arithmetic and thus dynamic parsing in Oil. So OSH now has dynamic parsing, but Oil still does not.
Instead of shell arithmetic, you can use Oil's expressions over typed data, which includes integers.
$ x=$(( 1 + 2 ))  # shell style, invalid in Oil
$ var x = 1 + 2   # Oil style
You might ask why I'm blogging about this hidden eval, rather than reporting it. Well, I reported it years ago to bash, OpenBSD ksh, and other shells. (OpenBSD was the only one that fixed it at the time. Others may have fixed it since then.)
Some people already knew about it, and some people had a hard time understanding the report. A common response was:
Well that's how shell is. It allows you to execute shell commands.
— not an exact quote :)
In response, I say that POSIX shell is not like that. Shells like dash don't have the bug, because they don't have arrays. Try it.
There's a huge difference between code and data, both in computer science and in practical network security. A good shell should respect this difference. Again, this is one of Four Features that Justify a New Unix Shell.
When there were 10 Unix machines in the world, it was OK to be loose about code versus data. Even in the 1980's, every file on a Unix machine may have been provided by the manufacturer, or created by your coworkers. You could reasonably treat filenames as trusted data.
But today, you may download hundreds of megabytes of git repos and package manager dependencies, written by thousands of people. So a shell should treat filenames and other external data as untrusted.
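Until your shell does this for you, one defensive habit is to check that external data really is an integer before it reaches an arithmetic context. Here's a minimal sketch in POSIX shell. The `is_int` and `add_one` names are hypothetical, and this is one possible guard, not something OSH or bash provides.

```shell
# Hypothetical guard: succeed only for a plain decimal integer like 42 or -7.
is_int() {
  case "$1" in
    ''|-) return 1 ;;           # empty string or a lone minus sign
    *[!0-9-]*) return 1 ;;      # any character other than a digit or '-'
    ?*-*) return 1 ;;           # '-' anywhere except the first position
    0?*|-0?*) return 1 ;;       # leading zeros, which shells read as octal
    *) return 0 ;;
  esac
}

add_one() {
  is_int "$1" || { echo "refusing non-integer input" >&2; return 1; }
  echo "$(( $1 + 1 ))"
}

add_one 41
add_one 'a[$(echo 42 | tee PWNED)]=5' || echo "rejected"
```

The first call prints 42. The second prints "rejected" (plus a message on stderr), and no PWNED file is created, because the string never reaches the arithmetic evaluator.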
I'm now itching to work on the Oil language, but I also want the compatible OSH to be polished and "done".
So here's the call to action: please test Oil 0.14.2, and report bugs. Both the Python and C++ versions are ready to test.
Generally speaking, "batch" shell scripts should run under OSH, but interactive plugins may be more difficult. They are more tightly coupled to a specific shell.
The C++ version still fails 16 spec tests that the Python version passes (out of ~1800), but otherwise it's in pretty good shape.
Now that we have a pure C++ tarball, it would be great for someone to revive the work on running Nix shell scripts.
I expect more "conceding to reality", as with the shell arithmetic issue. But not too much, because we've fixed bugs like this for years. The latest bug reports have been great, and I'd like to see more testing, and get more help.
I've gotten feedback that it's hard to get started on the code. (Our Contributing wiki page describes how.)
Part of the problem is inherent in our metaprogramming approach. Again, Oil Is Being Implemented "Middle Out".
Another problem is that the codebase was something of an experiment for many years. In particular, the garbage collector was an "unknown unknown". (I didn't know what I didn't know about GC.)
But now that the shell works, the project feels "opened up" again. We are stabilizing and improving the tools. It didn't seem worth it to polish tools that didn't yet produce a working shell.
In particular, mycpp, ASDL, the build system, the test harnesses, and the CI are rapidly improving. I've collected Zulip threads that support this, like:
This long-running thread keeps track of problems:
I may elaborate later, but in the meantime, try building Oil, and ask me questions about the dev process!
I'll also repeat that recent contributions give me confidence that the codebase can have many hands in it, and will last a long time. In particular, Melvin has made large changes across Python and C++ code, wrapped native libraries like GNU readline, and fixed issues and design problems related to Unix signals and job control.
Last year, the C++ translation and the interactive shell were two big unknowns, but they no longer are.
Are there any more fundamental issues blocking the project? In the last 2 months, I've been "kicking up dust" all over the repo to figure this out. Here are some of the bigger ones:
./configure, but not entirely correctly. The log output is different under OSH vs. other shells, and this is hard to debug.
help builtin, and the location of startup files.
What's next? I've kept a backlog here:
At the very least, I want to publish a post about renaming the project:
I'm not looking forward to the extra work and churn, but I think these names will reduce confusion, and are better in other ways.
Again, we're using the money to bring in new contributors.
On the flip side, if you can get through Contributing, run bin/osh -c 'echo hi', and test OSH, you might be a good person to work on Oil!
We last reviewed metrics in Oil 0.12.7 in October, so let's use that as our baseline.
The Python reference implementation is improving:
And the C++ translation is catching up:
Again, the majority of this was due to Melvin's work on the interactive shell.
On the other hand, work on the Oil language has stalled:
The parsing metric had a bug as of release 0.12.7, so let's use 0.12.9 as a baseline.
What's notable is that we turned on the garbage collection in this time! I have more plans to optimize the parser. It's representative of user workloads, and it's also a good stress test for the GC.
The C++ shell got much faster, and it's approaching the speed of bash on this difficult workload:
The executable spec remains small! Significant lines:
Code in the oils-for-unix C++ tarball, much of which is generated:
Compiled binary size: