
More Changes to Oil's Syntax

2020-11-08

The recent 0.8.3 and 0.8.4 releases were so big that it's taking five blog posts to describe them!

  1. Big Changes to the Oil Language: I published this post about the expression language a couple weeks ago.
  2. More Changes to Oil's Syntax. This post describes more syntactic changes, including enhancements to shell builtins.
  3. Proposed Changes to Oil's Syntax. New constructs I left syntactic space for. We cleaned up a lot, but there are still more warts in shell.
  4. Changes to Shell Runtime Semantics. I overhauled shell options, variable scope, and proc.
  5. The Shell Programmer's Guide to errexit. About error handling in shell and Oil.

(And that doesn't count yesterday's metrics post, which isn't essential.)

A quick story that motivates these changes: A few months ago, I wrote the first draft of Oil Language Idioms. That led to more TODOs than expected, and to more docs, like Shell Language Idioms.

Since then, I've been knocking off the TODOs. So we're basically doing documentation-driven development.

The purpose of this post is to get feedback about the Oil language. That said, I also want to spend time on official documentation, so I may breeze through this quickly. Please send feedback in the comments, on Zulip, or on GitHub.

Table of Contents
Special Variables and Functions: _status, _match()
Stricter Syntax
Enforced parse_backslash in Unquoted Words
Syntax Error For @(seq 3)trailing
Keywords and Operators
The = and _ "Pseudo-Assignment" Keywords
Removed pass Based On Your Feedback
The ~== Operator for Approximate Equality
Doc Comments Like ### Are Now Recognized
Many Builtin Commands Enhanced
repr Renamed to pp (pretty print)
Added Long Flags: shopt --set, test --dir
Structured I/O: read --line, write --qsn
Added Block Arguments: shopt, fork, forkwait
Conclusion
Appendix: Issues Closed in 0.8.3

Special Variables and Functions: _status, _match()

Shell has special variables like $? and ${BASH_REMATCH[@]}. They are implicitly mutated by the interpreter.

Oil supports them, but I decided that we need a more consistent style with less punctuation and CAPS, hence _status and _match().

A leading underscore gives these variables their own namespace, and lower case makes them easier to type.
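For reference, here is a minimal bash sketch of the implicit mutation described above. The function name parse_version and the regex are made up for illustration. Only $? and BASH_REMATCH come from the text, and the Oil spellings are not shown.

```shell
# bash-specific: [[ =~ ]] fills in BASH_REMATCH as a side effect.
# parse_version is a hypothetical helper, not part of Oil or bash.
parse_version() {
  local re='^([0-9]+)\.([0-9]+)\.([0-9]+)$'
  if [[ $1 =~ $re ]]; then
    # Index 0 is the whole match; 1..3 are the capture groups.
    echo "major=${BASH_REMATCH[1]} minor=${BASH_REMATCH[2]} patch=${BASH_REMATCH[3]}"
  else
    return 1
  fi
}

parse_version 0.8.4                        # prints: major=0 minor=8 patch=4
parse_version oops || echo "status: $?"    # prints: status: 1
```

Neither variable is ever assigned by the script. The interpreter sets both behind the scenes, which is what makes them "special".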

Stricter Syntax

Enforced parse_backslash in Unquoted Words

This is like the string literal changes mentioned in the last post.

See these nice tables! https://github.com/oilshell/oil/issues/860

Summary:

Syntax Error For @(seq 3)trailing

All constructs beginning with a @ sigil must occupy a whole word. There's no implicit joining as with bash:

$ echo x"$@"y   # this does weird things

The @ constructs are:

$ echo @myarray
$ echo @(split command sub)
$ echo @array_func(x, y) @glob(pat) @split(s)
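To make the "weird things" concrete, here is a small sketch of implicit joining in plain shell. The helper names show_args and join_demo are hypothetical.

```shell
# Print each argument on its own line, wrapped in brackets.
show_args() {
  printf '[%s]\n' "$@"
}

# The prefix x joins onto the first argument and the suffix y
# onto the last one; arguments in the middle are untouched.
join_demo() {
  show_args x"$@"y
}

join_demo a b c
# prints:
# [xa]
# [b]
# [cy]
```

Requiring @ constructs to occupy a whole word rules out this kind of surprise.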

Keywords and Operators

The = and _ "Pseudo-Assignment" Keywords

This was done in a previous release, but deserves mention here. Shell assignments take expressions on the RHS:

var x = 42 + f(x)

You can pretty-print an expression like this, which is useful in the REPL:

= 42 + f(x)    # evaluate and pretty print

You can also ignore the result of an expression:

_ 42 + f(x)    # think of this line as a shortcut
_ = 42 + f(x)  # for this line

This is useful for functions with side effects:

_ mylist.append(x)
_ mylist.extend(['str', var])

However, we also have a shell style:

push :mylist str $var

I don't expect _ to be used that often in real code. Functions usually return values and are used like echo $len(x).

Removed pass Based On Your Feedback

The pass keyword was intended for left-to-right function calls as in dplyr.

However, many people mentioned that it conflicted with existing programs. And we can use the _ keyword instead.

So I removed it. I listened to your feedback!

The ~== Operator for Approximate Equality

Oil has typed data, so this operator will help us be as convenient as Awk, while avoiding the pitfalls of JavaScript's ==:

var mystr = '42'  # string that could be read from a file

if (mystr == 42) {    # Error: Can't compare values of different types
  echo 'yes'
}

if (mystr ~== 42) {   # True: '42' ~== 42
  echo 'yes'
}

We might also use this operator for approximate floating point comparisons. I could use help on this!
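The Awk convenience mentioned above comes from input fields that look numeric being compared as numbers. A quick sketch in shell with standard awk follows. The function names are hypothetical, and this shows Awk's behavior, not how Oil implements ~==.

```shell
# Compare the first field against the number 42.
# A field that looks numeric is compared numerically.
numeric_match() {
  printf '%s\n' "$1" | awk '$1 == 42 { print "yes" }'
}

# Compare the first field against the string "42".
# A string constant forces a string comparison.
string_match() {
  printf '%s\n' "$1" | awk '$1 == "42" { print "yes" }'
}

numeric_match 42.0    # prints: yes
string_match 42.0     # prints nothing
string_match 42       # prints: yes
```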

Doc Comments Like ### Are Now Recognized

The parser now recognizes doc comments and attaches them to the AST. A doc comment is a line starting with ### that is the first line after an opening {.

proc restart(pid) {
   ### Restart web server by sending it a signal

   kill $pid
}

It also works for shell-style functions:

f() {
   ### Restart web server

   kill $1
}

We can use this for autocompletion and more. Feedback is welcome.

Many Builtin Commands Enhanced

repr Renamed to pp (pretty print)

(1) pp cell pretty-prints cells, which are the locations of variables.

Cells have flags like -x (export). This builtin is very useful for debugging shell programs!

osh$ export FOO=bar
osh$ pp cell FOO
FOO = (cell exported:T readonly:F nameref:F val:(value.Str s:bar))

(This format isn't stable yet. See issue 817.)

(2) pp proc shows doc comments. It prints a table, which means it's the first usage of QTSV in Oil!

osh$ pp proc
proc_name       doc_comment
f       'doc \' comment with " quotes'
g       ''

Now we need a QTSV_PAGER, i.e. something like less for tables. I recently learned that we can do a quick and dirty job with column.

Added Long Flags: shopt --set, test --dir

Other enhancements to shopt and test:

The idiom for turning on Oil is now:

shopt --set oil:basic  # options unlikely to break code
shopt --set oil:all    # everything

Or you can use bin/oil to turn on oil:all.

Structured I/O: read --line, write --qsn

These changes were discussed in Four Features That Justify a New Unix Shell. We want to remove the need for ad hoc parsing and splitting.

So we have preliminary QSN support, but we still need more QTSV support.
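As a reminder of the ad hoc style this is meant to replace, here is a sketch of how plain read mangles a line unless you remember the right incantation. The function names are hypothetical, and Oil's read --line is not shown.

```shell
# Plain `read` strips leading/trailing whitespace and
# treats backslash as an escape character.
read_naive() {
  read line
  printf '[%s]\n' "$line"
}

# The usual workaround: empty IFS plus -r keeps the line intact.
read_careful() {
  IFS= read -r line
  printf '[%s]\n' "$line"
}

# The input line is: two spaces, a, backslash, t, b, two spaces.
printf '  a\\tb  \n' | read_naive      # prints: [atb]
printf '  a\\tb  \n' | read_careful    # prints: [  a\tb  ]
```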

Another idea for a primitive:

Added Block Arguments: shopt, fork, forkwait

This came directly out of Oil Language Idioms. Oil has a more consistent syntax:

sleep 2 &             # shell style
fork { sleep 2 }      # Oil style

( sleep 2 )           # shell style
forkwait { sleep 2 }  # Oil style

This allows us to use & for redirects, and () for expressions.
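For readers less familiar with the shell-style forms, this sketch shows what they do in plain shell: ( ) forks a child and waits for it, while & forks without waiting. The function and variable names are hypothetical.

```shell
# ( ... ) runs in a forked child, so its assignments don't leak out.
subshell_demo() {
  demo_var=outer
  ( demo_var=inner; echo "inside: $demo_var" )
  echo "after: $demo_var"
}

# & forks and returns immediately; `wait` collects the child later.
background_demo() {
  ( echo "child done" ) &
  wait "$!"
  echo "parent saw exit status $?"
}

subshell_demo
# prints:
# inside: inner
# after: outer

background_demo
# prints:
# child done
# parent saw exit status 0
```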

We use the same block syntax to save and restore state:

shopt --unset errexit {
  step1
  echo $?

  step2
  echo $?
}

This was enabled by a pleasant refactoring of the "option stack", which was aided by static typing. This stack is now used consistently for many purposes:

  1. shopt blocks
  2. The broken POSIX errexit semantics. Error handling is disabled in the constructs if / while / until / && / || / !. We unfortunately have to implement this.
  3. The run builtin, which undoes this bad behavior by re-enabling errexit.
  4. The strict_errexit option, which detects code that would lose errors.
  5. Disabling dynamic scope in procs, which I describe in an upcoming post. The option stack follows the call stack.
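The POSIX behavior in item 2 is easy to reproduce in any shell. This is a minimal sketch: run_fresh and step are hypothetical names, and each snippet runs in a fresh sh process so that the caller's own errexit state can't interfere.

```shell
# Define `step`, turn on errexit, then run the snippet passed as $1.
run_fresh() {
  sh -c "step() { false; echo 'step continued'; }; set -e; $1"
}

# As a plain command, `false` inside step aborts the shell.
run_fresh 'step; echo "not reached"' || echo "shell exited with status $?"
# prints: shell exited with status 1

# As an `if` condition, errexit is ignored inside step, so the
# failure of `false` is lost and step "succeeds" via its echo.
run_fresh 'if step; then echo "condition was true"; fi'
# prints:
# step continued
# condition was true
```

The second case is the kind of lost error that item 4 refers to.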

Conclusion

This was post 2 of 5 explaining the Oil 0.8.3 and 0.8.4 releases. I wrote down what I think Oil's idioms should be, and then I implemented them!

Please try Oil 0.8.4 and let me know what happens!

Appendix: Issues Closed in 0.8.3

These issues were closed for Oil 0.8.3, and I discuss most of them in this series of posts.

#835 Make expression language compatible with Python
#826 clarify QSN use cases
#775 errexit not disabled where it should be
#735 remove 'pass' builtin
#713 long flags for shopt builtin, e.g. --set and --unset
#711 Oil should have a slurp builtin
#582 QSN serialization format: parser, printer, builtin
#501 shopt should respect set -o options
#476 consider a different definition of strict_errexit