Why Sponsor Oils? | blog | oilshell.org
This post has #comments and #zulip-links about the Oil and OSH languages. It's the fourth post in the Winter Backlog series, which I hope will maintain continuity while I focus on expanding the project with outside help.
(The first three were Recent Progress, Explaining the Project, and Rough Progress Assessments.)
I'd like to answer these questions more authoritatively, but the "narrative style" with links has proven effective. And most topics came from reader questions anyway. Leave a comment if anything is unclear!
This is a common question, which I was asked on Zulip:
and in the long Nix RFC thread. So I created this wiki page:
Wiki: OSH versus Oil
A short answer is that OSH is compatible shell-like stuff and Oil is new, Python- and Ruby-like stuff, and there isn't a sharp line between them.
Let me know if anything is unclear, and I'll update the wiki page. Also see other posts tagged #FAQ.
I was asked what option groups are recommended for a new OSH script:
Even though this doc has a pink warning for "under construction", it has a good answer:
But I gave an even more concise answer on Zulip:
strict:all. Use this if you want to run the same script under multiple shells, like bash and OSH.
oil:basic. Use this if you're upgrading an existing script, and dropping compatibility with other shells. You can use new Oil features, but you won't have to change your existing code too much.
oil:all. Use this for a brand new program.
bin/oil rather than bin/osh.
No! You don't need Python to build or use Oil.
All the source code you need is in the release tarball, which builds with a C++ compiler and make. (Remember that OSH and Oil are in the same tarball and executable.)
Each page like /release/0.9.5/ has two different releases:
oil-$VERSION.tar.gz
oil-native-$VERSION.tar.gz
The first one is the slow "executable spec" -- it reuses parts of the Python interpreter, and contains Python code. But I took great pains to make this invisible. You just run ./configure and make, without Python.
The second is the fast interpreter in pure C++, but it's not ready for use yet.
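As a rough sketch of what building the first tarball looks like: the function below assumes oil-$version.tar.gz has already been downloaded into the current directory and that it unpacks into oil-$version/. Both are assumptions, and build_oil is a hypothetical helper name, not something shipped with Oil. Only tar, a compiler, and make are involved.

```shell
# Hypothetical helper: unpack and build a release tarball without Python.
# Runs in a subshell ( ... ) so the caller's working directory is unchanged.
build_oil() (
  version=$1
  # Assumption: the tarball sits in the current directory and
  # extracts to a directory named oil-$version.
  tar -x -z -f "oil-$version.tar.gz" &&
    cd "oil-$version" &&
    ./configure &&
    make
)

# Usage (not run here): build_oil 0.9.5
```

Each step only runs if the previous one succeeded, so a missing tarball or a failed configure stops the build with a non-zero status.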
Unfortunately, we will need to rename these. That is, oil-native should be oil or oilshell. The first tarball could be oil-python, oil-reference, or oil-experiments.
The three questions below aren't FAQs, but they may help people understand the relationship between OSH and Oil.
This relationship has evolved -- they used to be more like two different "worlds", but now they're more unified. The upgrade path is gradual, not sudden!
A user on lobste.rs was confused by Oil's syntax, and my explanation may be worthwhile.
Let's start with these "shell axioms":
# variable substitution with $ and ${}
echo $mystr ${mystr}
# command sub with $()
touch "$(my-command)"
# command block with { }
{ echo hi; my-command; } > out.txt
(Notice that shell already has some inconsistency between $() and { }.)
I claim that Oil's new "sigil pairs" are natural extensions:
echo $[42 + x] # expression sub with $[]
ls @myarray # splice array into command with @
# Inspired by "${myarray[@]}" and Perl.
ls @(my-command) # split command sub with @()
# An array of "word" literals with %()
# Note that % doesn't mean hash as it does in Perl.
const x = %(foo bar *.py)
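For comparison, here is a small bash-only sketch of the existing spellings that two of these sigil pairs sit next to: arithmetic with $(( )) and splicing an array with "${myarray[@]}". The helper count_args and the sample values are illustrative, and this does not claim that the Oil forms behave identically in every case.

```shell
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

x=8
sum=$((42 + x))                       # bash arithmetic; Oil spells this $[42 + x]

myarray=('my file.txt' other.txt)
# Quoted [@] passes each element as exactly one argument, spaces included.
# This is the bash idiom that Oil's @myarray is inspired by.
nargs=$(count_args "${myarray[@]}")

echo "$sum $nargs"                    # prints: 50 2
```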
More notes:
^(echo hi) is a rare syntax for an unevaluated block.
$(echo hi) and the unevaluated expression syntax ^[42 + x].
{ echo hi }, but $(echo hi) already has that problem too!
@{x|html} and @[split(x)]. These would be rare, but they're consistent.
Related:
There is some "legacy" shell syntax that I've decided to keep.
Remember that I'm trying to cut the scope of the project! And I also noted that the combined OSH + Oil language size should be minimized. That's a principle that's become more important since the early days of the project, when we had two separate worlds.
C-Style strings. The $ prefix annoys me because $myvar also uses it, but they mean different things. But I tried to add c'\n', and it was too complicated and inconsistent.
echo $'\n'
Redirects. I don't like shell's redirect syntax, but the ugly cases aren't common.
echo 'error message' >&2
This weird bash syntax for assigning FDs to variables is occasionally useful, and we're also keeping it:
myproc {left}< left.txt {right}< right.txt
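A minimal sketch of that FD-variable syntax in bash (4.1 or later), using exec so the descriptor stays open across several commands. The temp file and variable names are made up for illustration.

```shell
tmp=$(mktemp)
printf 'first\nsecond\n' > "$tmp"

# bash picks a free descriptor (10 or higher) and stores its number in $fd.
exec {fd}< "$tmp"

read -r line1 <&"$fd"     # each read consumes one line from the same descriptor
read -r line2 <&"$fd"

exec {fd}<&-              # close the descriptor when done
rm -f "$tmp"

echo "$line1 $line2"      # prints: first second
```

The point is that the script never hard-codes a descriptor number, so it can't collide with one that's already in use.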
Process Sub. These unfortunately look like redirects, but they're actually "sigil pairs".
diff <(sort left.txt) <(sort right.txt)
That is, they're analogous to $(sort left), @(sort left), and ^(sort left)!
I came across an insightful Hacker News comment that recommends reading shell redirect syntax as assignments.
This indeed matches what the dup2() system call does! It's like an assignment statement for "pointers" to file structs in the kernel. The programming model is imperative.
But I think it's better to just memorize a few canned patterns.
These patterns cover 99% of cases:
echo 'my message' 1>&2 # message to stderr
ls > out.txt # stdout to a file
sort < in.txt # stdin from a file
sort < in.txt > out.txt # both
I also use this idiom:
mycmd 2>&1 | wc -l # stdout and stderr to pipe
And this one, which is the annoying case where order matters:
mycmd >file.txt 2>&1 # stdout and stderr to file
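To see why order matters, here's a small sketch that reads the redirects as assignments, evaluated left to right. The function emit and the temp file are illustrative names.

```shell
# Writes one line to stdout and one to stderr.
emit() { echo out; echo err >&2; }
tmp=$(mktemp)

# fd1 = file, then fd2 = fd1: both streams end up in the file.
emit > "$tmp" 2>&1
both=$(cat "$tmp")

# fd2 = fd1 (still the original stdout), then fd1 = file:
# only stdout goes to the file, and stderr leaks to the old stdout,
# which here is the command substitution.
leaked=$(emit 2>&1 > "$tmp")
only_out=$(cat "$tmp")

rm -f "$tmp"
```

After this runs, both holds "out" and "err", only_out holds just "out", and leaked holds "err".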
That's about it. Remember, Avoid Directly Manipulating File Descriptors in Shell. If you find yourself saving and restoring descriptors, you should be using shell functions instead:
myfunc > output
That does the same thing. So those patterns are all you need -- really! If you have a counterexample, let me know.
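As an illustration, here are the two styles side by side: saving and restoring stdout by hand, and wrapping the same commands in a function that is redirected once. The names report and out are hypothetical.

```shell
out=$(mktemp)

# Manual style: save stdout in fd 3, point fd 1 at the file, then restore.
exec 3>&1 > "$out"
echo one
echo two
exec 1>&3 3>&-
manual=$(cat "$out")

# Function style: the redirect applies to the whole function body,
# and is undone automatically when the call returns.
report() { echo one; echo two; }
report > "$out"
with_func=$(cat "$out")

rm -f "$out"
```

Both variables end up with the same two lines, but the function version has no descriptor numbers to keep track of.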
I answered 4 common questions about OSH and Oil, and then summarized 3 comments on the language design.
Let me know if you have questions!
Here are short answers to some questions that came up on the Nix RFC thread. I think oil-native is the main blocking issue for Nix, so these answers are not particularly important. But some readers may be curious.
... in the sense that every Oil script can be executed by bash?
We don't have a mode for that, although it's possible in theory. OSH and Oil are "stricter" than bash, but they also have new functionality that won't run under bash, like Simple Word Evaluation.
Posts tagged #real-problems explain some of these new features.
I think the last question could be better answered by diagrams to explain the relationships. For now, here are some notes.
sh versus bash:
sh with constructs like arrays, [[ for logical tests (including regexes), and ${x//pattern/replace}.
set -o posix. It's a myth that bash's additional features make it non-compliant! If you want to write a portable script, you should test your script under two shells, like bash and OSH.
bash versus osh:
osh versus oil:
test --dir, and options like shopt --set simple_word_eval.
const myint = min(3, 4) and Ruby-like blocks cd /tmp { echo $PWD }.
Again, the terms "OSH" and "Oil" are fuzzy because they've evolved over time. I used to think of simple word evaluation as an OSH feature, but now it seems to logically belong in Oil.
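The bash constructs named in the sh-versus-bash notes above can be sketched in a few lines. None of this is POSIX sh; the sample data is made up.

```shell
files=(a.py 'b c.py' notes.txt)        # array

count=0
for f in "${files[@]}"; do
  if [[ $f =~ \.py$ ]]; then           # [[ with a regex match
    count=$((count + 1))
  fi
done

path=/usr/local/bin
dashed=${path//\//-}                   # replace every / with -

echo "$count $dashed"                  # prints: 2 -usr-local-bin
```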
I would call it production-ready when we have the faster oil-native build. However, many people tell me they already use Oil and like it.
Practically speaking, migrating to OSH is the first step. For many bash programs that are thousands of lines, the migration is trivial -- just run it with OSH instead of bash. Try it and let me know what happens! Is it too slow?
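A minimal sketch of a script meant to run under both bash and OSH. The guarded shopt line is one way to opt into strict:all only where the shell understands it: under bash the unknown option name makes shopt fail, and the error is discarded. The variable names are illustrative.

```shell
# Turn on OSH's stricter checks if this shell has them.
# Other shells reject the option name, so silence the error and carry on.
shopt -s strict:all 2>/dev/null || true

names=(alice bob)
greeting="hello ${names[0]}"
echo "$greeting"                       # prints: hello alice
```

Run it as bash script.sh or osh script.sh (script.sh being whatever you name the file); the rest of the script sticks to the common subset of both shells.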
Even if you migrate to OSH and not Oil, there are benefits. This post last year mentions:
This is all implemented and done! It just needs to be faster.
Remember that Oil doesn't require Python 2 or 3 to build or use, in any form. It's packaged in several distros and none of them require Python to build.
That said,
But this doesn't mean you need Python to build or use Oil. More analogies:
The generated C++ code is readable and I debug it directly with GDB, and use normal profiling tools on it. That is all by design. It's more readable than the output of yacc (which is a bunch of parsing tables).
No, it contains both hand-written and generated source code.
oil-0.9.3.tar.gz has generated Python code and a slice of the Python interpreter.
oil-native-0.9.3.tar.gz has no Python code at all, only C++.
Note: The argument below is mostly academic, since it was discovered that in Nix, bash is built from its tarball, not its repo. But it was a long conversation, and this issue came up with Guix as well, so I've copied the answer here.
I can see the appeal of having packages consistently use the Nix build system from the git repo -- for patching, and for Nix maintainers to understand.
That is OK, and I've accepted patches to make this a reality. But a Nix build of Oil will always have to be maintained in parallel with our own shell script build. Because a shell is at a lower level than a package manager!
That is, a shell can build and boot an entire Unix system without a package manager. But a package manager can't do that without a shell. (This is related to my interest in the now-defunct Aboriginal Linux early in the project.)
So a shell having a build dependency on a package manager is inverted, which is why I don't take that dependency. How do you build the package manager itself?
Also note that Oil has the same goals as Nix with respect to reproducibility -- the build is very deterministic and automated, but it doesn't use Nix.
Another point: To make it easier to bootstrap Nix, I think you should avoid "bootstrapping" Oil. At the bottom levels of a Unix system, there will always be circular build dependencies. It's just a matter of where you want to "cut it off".
If you port an existing program to the large common subset of OSH and bash, this won't happen. You can always run it with bash.
If you start a new program in Oil, this won't happen.
But it is possible if you start porting to Oil, but don't finish! I can imagine this happening if not enough people understand shell and Oil. I've found that there are often few knowledgeable maintainers of shell in Linux distros.
Oil is a simpler, cleaner language, but it still takes work to use the features and improve the code.
Someone got the idea that Oil is 2 million lines of code! This is false.
It's smaller than bash, and well under 100K lines of code any way you count. Search for "metrics" on any release page and look at various line counts:
It's designed to be small and comprehensible (as much as a bash compatible shell can be). The core is less than 20K significant lines of code, which is 5-7x smaller than bash: