
You Can Now Try the Oil Language

2019-10-04 (Last updated 2020-08-12)

In August, I published early design notes for the Oil language.

Since then, I've continued prototyping it, and I started at least 30 design discussions on Zulip. I appreciate all the feedback, and I'm looking for more!

I just released Oil 0.7.pre5, which contains all this work. Here's a summary:

  1. The syntax of Oil is mostly done. I added a Python-like expression language to a compatible shell. Unlike shell, Oil has powerful data types like dictionaries, lists, tuples, ints, and floats. Functions operate on any of these types.
  2. The semantics are in a pretty good state, but scope and error handling aren't worked out yet. I reused the spec test framework for Oil, and almost 200 Oil spec tests now pass.
  3. Everything is still up for discussion. This is an early release.
  4. The language is ready to try, but not ready to use :-) On the other hand, you can use this release to run existing shell scripts. I dogfood Oil, and its stricter semantics have caught several problems in my own shell programs.

There are some new docs like the Eggex manual, but overall, the code is currently way ahead of the documentation.

This post gives an outline of the docs I want to write. Feel free to ask questions on Reddit or on Zulip! They'll help me decide what to write about.

Table of Contents
Docs I Want to Write
Oil From One Million Feet
Small vs. Big Languages
Oil from 10,000 feet
OSH vs. Oil
Command vs. Expression Mode
What's Next?
Appendix A: Source Code Files
Appendix B: Metrics for Release 0.7.pre5
Native Code and Bytecode Metrics

Docs I Want to Write

There are drafts of many of these docs on Zulip.

Oil From One Million Feet

One way to explain Oil is by comparison to other languages.


Small vs. Big Languages

Oil is a big language because it's meant to "subsume" other languages. For example, it contains all of shell, and much of Python, and they're both big languages.

On the other hand, Oil's implementation is smaller than bash or Python.

I still believe that Shell, Awk, and Make Should Be Combined, although the strategy for getting there has changed.


Oil from 10,000 feet

I drafted a blog post which covered each language feature and the rationale for its design, and outlined future work. Here's the table of contents:

It's clear I need to split this into many docs!

OSH vs. Oil

In past releases, OSH has been concrete — you can run your shell scripts with it — but Oil has been vague.

That's now changed! This release has an oil executable, which is a busybox-like symlink to the "app bundle".

Here's how it works: bin/oil is just bin/osh with the addition of shopt -s oil:all. The option group oil:all is a shortcut for around 10 parsing and execution options which gradually upgrade OSH to Oil.
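The busybox-like arrangement can be pictured with a small POSIX sh sketch. This is not Oil's actual launcher: the function name `applet_setup` and the printed strings are made up for illustration, and only the names `osh`, `oil`, and `shopt -s oil:all` come from this post.

```shell
#!/bin/sh
# Hypothetical sketch of busybox-style dispatch; NOT Oil's real code.
# One program picks its behavior from the name it was invoked under.

applet_setup() {
  # Strip any leading directory, so bin/oil and ./oil both become "oil".
  name=${1##*/}
  case "$name" in
    osh) printf '%s\n' 'osh: no extra options' ;;
    oil) printf '%s\n' 'oil: shopt -s oil:all' ;;
    *)   printf 'unknown applet: %s\n' "$name" >&2; return 1 ;;
  esac
}

# A real launcher would pass "$0"; explicit paths are used here for clarity.
applet_setup bin/osh
applet_setup bin/oil
```

The point of the pattern is that both names resolve to the same program, so there is one thing to build and install, and the "Oil" behavior is only a matter of which options get switched on at startup.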

In the last post, I explained why Oil is now a dialect within OSH. Essentially, I realized that the strategy of creating two different "worlds" makes the shell both harder to implement and harder to use.

The goal of Oil is unchanged: it's your upgrade path out of bash. That path is more seamless if there's a single binary and a single language with a few options.

I'll also describe the oil:basic option group, which lets you use Oil features, but minimizes the breakage in existing shell scripts.

Command vs. Expression Mode

This is an essential syntactic concept. An Oil program starts in command mode, and commands are composed of words:

echo "hello $name"
ls | wc -l

However, there are several keywords and sigils that put you in expression mode, e.g. so that * means multiplication rather than glob:

var x = 1 + 2*3 + f(x)  # After =, you're in expression mode
= myfunc(42, 'foo')     # Pretty-prints the result, without assigning

Inline calls also put you in expression mode:

echo $strfunc(1 + 2*3)  # Between (), you're in expression mode
echo @arrayfunc(x, y) 

There are also expression substitutions with $[expr]:

echo "attr = $[obj.attr]"
echo "key = $[d->key]"
echo "item = $[array[1 + 2*3]]"
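Plain POSIX shell already has a narrow version of this split, which may help as an analogy. The sketch below is ordinary sh rather than Oil syntax, and the temporary directory and file names are assumptions chosen for the demo. It shows the same `*` character read as a glob in a command word and as multiplication inside `$(( ))`.

```shell
#!/bin/sh
# Analogy in plain POSIX sh, not Oil syntax.
# The temp directory and file names are made up for this demo.
tmpdir=$(mktemp -d)
touch "$tmpdir/a.txt" "$tmpdir/b.txt"

# Command (word) context: an unquoted * is a glob and expands to file names.
words=$(cd "$tmpdir" && echo *)
echo "$words"    # a.txt b.txt

# Arithmetic context: between $(( and )), the same character multiplies.
sum=$(( 1 + 2*3 ))
echo "$sum"      # 7

rm -r "$tmpdir"
```

The difference in Oil, as described above, is that expression mode is reached through several keywords and sigils and covers a full expression language, not only integer arithmetic.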


If you want a peek at what I'll be writing about, I maintain a Blog TODO thread on Zulip.

Feel free to start new topics with questions. They'll help me decide what to write about. (The "New Topic" button is at the bottom of the screen. Click a message body to reply under the same topic.)

This comment also lists some interesting threads, and things I'm looking for feedback on. At some point I'll post a summary to Zulip, since I know it's a lot to read.

Also see blog posts tagged #oil-language!

What's Next?

This is the general feature set I want for "V1" of Oil. The details will change based on your feedback, but I think the "foundation" of the language will converge pretty soon.

After that, I plan to tackle the riskiest part of the project: Oil is still too slow!

To fix this, I plan to resume the mycpp work I started in April. I think it will yield a reasonable speedup with a reasonable amount of engineering effort, but there's no guarantee.

If it doesn't, I explained on Zulip that Oil Is Made of Ideas. It's Not a Pile of Code in a Particular Language.

So you can reimplement it in another language. Its source code is significantly smaller than bash — i.e. I "compressed" and cleaned up bash for you. And I also added a new and powerful expression language on top!

Appendix A: Source Code Files

Toward that end, I started publishing key source files at the bottom of each release page. Summary:

More on this later. (Or ask me about it on Zulip.)

Appendix B: Metrics for Release 0.7.pre5

Let's compare the current release with version 0.6.0, released three months ago on July 1st.

There are 115 new tests passing:

In addition, we now have almost 200 Oil spec tests passing:

There are ~1500 new significant lines of code in OSH:

And ~2500 new lines of physical code in OSH:

In the oil_lang/ directory, there were 1,175 physical lines, and now there are 3,655. This number isn't that meaningful because some of Oil is in the frontend/ directory.

Nevertheless, it's a good sign that Oil is still a small program, even with the addition of the large Oil language. It lets me make aggressive, global changes to the codebase.

The small size is largely due to the use of the domain-specific languages mentioned above.

Native Code and Bytecode Metrics

I restored Python's floating point support to the Oil build, so the amount of native code increased:

The binary size also increased:

As well as the bytecode size:

These are minor differences compared to the optimizations and reductions I hope to make in the coming 6 to 12 months.


Thanks to Ilya Sher and Kartik Agaram for great discussions on the Oil language. I'm still looking for more feedback!