blog | oilshell.org

The Simplest Explanation of Oil

2020-01-14

I'm writing a series of blog posts about Oil, and I want new readers to understand what it is. I've tagged past explanations #why-a-new-shell, but they require background knowledge.

This tweet by Vanessa McHale is more relatable. It shows the frustration I felt working with shell for the first time 15 years ago, which many users still feel.

not wrong pic.twitter.com/h1DTpeXiWJ

— Vanessa McHale (@vamchale) November 2, 2019


Table of Contents
Productive Programmers Secretly Use Shell
Oil Might Look Familiar
Assignment: Oil has keywords and expressions
Oil uses curly braces rather than if, fi
Prefer Oil's function calls to ${var%arg}
Words aren't split by default, so they don't need quotes
Oil has operators rather than -lt and -gt
Oil Is For Programmers Who Avoid Shell
And For Non-Programmers
Summary
Appendix A: Problems with ${}
Appendix B: It's Not Just Syntax

Productive Programmers Secretly Use Shell

For context, my second job was at Google, and when I joined I had never used Unix! I grew up with Windows, and my first job was in console game development. In college, I liked theory more than programming.

So I had never seen a shell script. But in the next decade I noticed that many programmers at Google leaned heavily on Unix and shell. Learning those dark arts helped me "get things done".

Oil Might Look Familiar

But shell is unnecessarily hostile, and Oil aims to fix that.

The glossary says that the Oil language is a new dialect of shell that's parsed and evaluated like Python or JavaScript, as opposed to being a "macro processor".

Responding to the frustrations in the tweet will make that more concrete.

Assignment: Oil has keywords and expressions

Whitespace matters in shell, so these two lines have totally different meanings:

sh$ a=b    # assign the string value 'b' to the variable 'a'

sh$ a = b  # run the command 'a' with two arguments: '=' and 'b'
a: command not found

I thought this was unbelievably ugly 15 years ago. And it's still annoying, if only because it's inconsistent with every other language I use.
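One way to see how the shell reads the second form is to define a function named a and watch which words it receives. This is a minimal sketch in POSIX shell; the function a is hypothetical and exists only for this illustration.

```shell
# Hypothetical function named 'a', defined only to make the parsing visible.
a() {
  echo "command 'a' received $# arguments: $*"
}

a=b        # no spaces: an assignment, the function is not called
echo "variable a is '$a'"

a = b      # spaces: runs 'a' with the two arguments '=' and 'b'
```

Functions and variables live in separate namespaces, so both meanings of a can coexist in the same script.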

Oil looks more familiar:

oil$ var a = 'b'         # declare and initialize a variable
oil$ set a = "hello $a"  # mutate it

The keywords var and set change the lexer mode and invoke an expression parser. (There's also setvar which avoids a name conflict and is easier to use interactively.)

Oil uses curly braces rather than if, fi

To delimit blocks, shell uses do and done, if and fi, etc.

for path in /bin /tmp; do
  if test -d "$path"; then
    echo "$path is a directory"
  fi
done
# Output:
/bin is a directory
/tmp is a directory

Oil uses curly braces everywhere, but otherwise it's still shell-like:

for path in /bin /tmp {
  if test -d $path {
    echo "$path is a directory"
  }
}

Prefer Oil's function calls to ${var%arg}

In shell, cryptic punctuation within ${} is used for simple string manipulation:

sh$ path=/bin/foo.py

sh$ echo ${path#/bin/}  # strip a prefix with #
foo.py

sh$ echo ${path%.py}    # strip a suffix with %
/bin/foo

Oil still uses $path or ${path} to refer to a string variable, but operations are expressed with function call syntax:

oil$ var path = '/bin/foo.py'

oil$ echo $lstrip(path, '/bin/')   # strip a prefix
foo.py

oil$ echo $rstrip(path, '.py')     # strip a suffix
/bin/foo

See Appendix A for more problems with ${}.

(The examples in the earlier sections work with the latest release, but the strip example doesn't. Function calls are parsed and evaluated, but the strip functions don't exist yet.)
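Plain shell can approximate the same call style by wrapping the ${} operators in functions. The sketch below is only an illustration: lstrip and rstrip here are hypothetical shell functions written for this example, not Oil builtins, and they are called with shell's command syntax rather than Oil's $f(x) syntax.

```shell
# Hypothetical helpers: the names mirror the Oil examples above, but these
# are ordinary shell functions, not part of any shell or of Oil.
lstrip() {
  # $1 = string, $2 = prefix to remove (quoted so it is matched literally)
  printf '%s\n' "${1#"$2"}"
}

rstrip() {
  # $1 = string, $2 = suffix to remove (quoted so it is matched literally)
  printf '%s\n' "${1%"$2"}"
}

path=/bin/foo.py
lstrip "$path" /bin/   # prints foo.py
rstrip "$path" .py     # prints /bin/foo
```

The cryptic punctuation is still there, but it is confined to two small function bodies instead of being spread through the script.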

Words aren't split by default, so they don't need quotes

Oil fixes what I call the "!QEFS problem". Shell forces you to quote everything:

sh$ path='my blog post.txt'

sh$ ls $path     # wrong: ls will be passed 3 arguments, not 1
ls: cannot access 'my': No such file or directory
ls: cannot access 'blog': No such file or directory
ls: cannot access 'post.txt': No such file or directory

sh$ ls "$path"   # quoting disables word splitting
-rw-rw-r-- 1 andy andy  2091 Jan 13 01:51 my blog post.txt
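The splitting can be observed without touching the file system by counting arguments. This is a minimal sketch assuming a POSIX shell such as bash or dash with the default IFS; count_args is a hypothetical helper written for this example.

```shell
# Hypothetical helper that reports how many arguments it received.
count_args() {
  echo "$#"
}

path='my blog post.txt'

count_args $path     # unquoted: split on whitespace, prints 3
count_args "$path"   # quoted: a single argument, prints 1
```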

Oil doesn't require quotes:

oil$ var path = 'my blog post.txt'

oil$ ls $path  # no splitting occurs
-rw-rw-r-- 1 andy andy  2091 Jan 13 01:51 my blog post.txt

But you can explicitly call @split() if you want:

oil$ lines @split(path)  # function that prints one arg per line
my
blog
post.txt

The @ sigil means "array". In shell, word splitting is a poor substitute for arrays, but Oil has proper arrays.

The literal syntax is @(...), and you interpolate words into a command with @myarray:

oil$ var posts = @('my blog post.txt' 'other post.txt')

oil$ ls -l @posts
-rw-rw-r-- 1 andy andy  2091 Jan 13 01:51 my blog post.txt
-rw-rw-r-- 1 andy andy    74 Jan 19  2019 other post.txt
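For comparison, the closest thing POSIX shell has to an array is the list of positional parameters, and it only behaves like one when it's written as "$@" with the quotes. A minimal sketch, reusing the file names above; count_args is a hypothetical helper, and the default IFS is assumed.

```shell
count_args() { echo "$#"; }   # hypothetical helper: prints its argument count

# The positional parameters act as the shell's single built-in list.
set -- 'my blog post.txt' 'other post.txt'

count_args "$@"   # quoted: each element stays intact, prints 2
count_args $@     # unquoted: the elements are split again, prints 5
```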

(Related Hacker News thread)

Oil has operators rather than -lt and -gt

In the 2019 FAQ, I showed that Oil uses < and > rather than -lt and -gt.

However, I've reduced the scope of the Oil project in order to get it done, and true integer types might be on the chopping block. But I'm still looking for feedback and contributions to Oil!
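For reference, this is roughly what the comparison looks like in POSIX shell today. It's a small sketch of existing shell behavior, not of Oil's syntax.

```shell
# Inside [ ], integer comparison needs a word-like operator.
if [ 9 -lt 10 ]; then
  echo "9 is less than 10"
fi

# A bare < inside [ ] would be parsed as input redirection from a file
# named 10, so it is not shown here. Arithmetic expansion is the one place
# where POSIX shell accepts the familiar operator; it yields 1 for true
# and 0 for false.
echo $(( 9 < 10 ))    # prints 1
echo $(( 10 < 9 ))    # prints 0
```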

Oil Is For Programmers Who Avoid Shell

This small exchange on lobste.rs also explains the project:

| Oil is also aimed at people who know say Python or JavaScript, but purposely avoid shell

I’m one of those people! I didn’t realize I was in the target group for Oil until now :)

And For Non-Programmers

Many Unix users learn shell before they learn to program. It's a gateway into programming:

  1. The REPL is a proven interface for fast experimentation.
  2. Shell programs are often concrete, making them easy to read and trace. Your first programs don't need any abstraction, and you can print shell's simple data types.

If shell is useful in these ways, then Oil will be even more useful. It gives more errors, which aids learning, and those errors are more precise. I wrote about syntax errors a few years ago and runtime errors over the summer, but there are many more examples.

It's odd that shell is the first language you encounter on a Unix machine, but you're usually encouraged to "skip over" it and launch a different one!

Related: Shouldn't scripts over 100 lines be rewritten in Python or Ruby?

Summary

Vanessa's tweet is only a couple months old, but these problems have been bothering shell users for decades.

If you want to read more about Oil, subscribe to /r/oilshell or follow @oilshellblog. The next posts will review what the Oil project has achieved.


Thanks to Eric Higgins and Aaron Sokoloski for reviewing drafts of this post.

Appendix A: Problems with ${}

I want a smooth transition path from shell to Oil, so I implemented the syntax within ${} to run existing shell scripts. This led to several posts tagged #shell-the-bad-parts.

Appendix B: It's Not Just Syntax

I wrote about other syntactic issues, like static vs. dynamic parsing, but shell has many semantic problems too.

I want to write a post called Shell Has Context-Sensitive Evaluation based on this thread. Summary: it's hard to explain what "$@" and "${myarray[@]}" mean in shell because it depends on where they're used.
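A small sketch of that context sensitivity, as observed in bash and dash (other shells may differ): the same "$@" yields separate words in one place and a single joined string in another.

```shell
set -- 'a b' c     # two positional parameters: 'a b' and 'c'

# As a word list, "$@" expands to one word per parameter.
for arg in "$@"; do
  echo "arg: $arg"
done

# On the right-hand side of an assignment there is no word list,
# so bash and dash join the parameters into a single string.
joined="$@"
echo "joined: $joined"
```

The loop prints two lines, while the assignment produces the single string a b c, which no longer records where one parameter ended and the next began.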

There are many more syntactic and semantic problems on the Shell WTFs wiki page. I'll write about some of them, but it's more important to me to get Oil done.