The current theme of this blog is to show how the oil parser works. But let's make sure we understand an important concept first: parse-time errors vs. runtime errors.
The way I write conditions in shell is like this:
if test -d /tmp; then
  echo "/tmp is a dir"
fi
This is the same thing:
if [ -d /tmp ]; then
  echo "/tmp is a dir"
fi
But yet another construct for conditional expressions is [[. The Google Shell Style Guide recommends using it, reasoning that:
[[ ... ]] reduces errors as no pathname expansion or word splitting takes place between [[ and ]].
[[ ... ]] allows for regular expression matching where [ ... ] does not.
That is, consider the following:
x='name with space.sh'
[ $x == *.sh ]
[[ $x == *.sh ]]
On the [ line, $x will be split into 3 arguments. The glob *.sh is also expanded into multiple arguments, depending on what's in the current directory. Both of these things cause the wrong number of arguments to appear on each side of ==. In contrast, the [[ expression will have exactly one argument on the left and right of ==. (It tests if $x matches the pattern *.sh, which is true.)
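To make the splitting and globbing concrete, here is a minimal sketch that counts the arguments a command actually receives. The count_args function and the temporary directory holding two .sh files are hypothetical, introduced only for illustration. It assumes the default IFS.

```shell
# count_args is a hypothetical helper: it prints how many arguments it got.
count_args() {
  echo "$#"
}

x='name with space.sh'

# Unquoted: the shell splits the value of $x on whitespace first.
unquoted=$(count_args $x)

# Quoted: the value stays a single argument.
quoted=$(count_args "$x")

# Unquoted glob: expands to one argument per matching file.
# We use a scratch directory so the result doesn't depend on where you run this.
tmp=$(mktemp -d)
touch "$tmp/a.sh" "$tmp/b.sh"
globbed=$(cd "$tmp" && count_args *.sh)
rm -r "$tmp"

echo "unquoted: $unquoted, quoted: $quoted, globbed: $globbed"
```

With [ you have to quote both sides yourself to avoid this. Inside [[ the shell does no splitting or pathname expansion in the first place.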
What I didn't realize before implementing the oil parser is that this doesn't quite capture the difference between [[ and [. The more important difference is that [[ is part of the shell language, while [ is a builtin.
This means that an expression inside [[ ... ]] is parsed up front, before any code is executed. In contrast, the arguments to [ are parsed by the builtin itself at runtime.
(In terms of parsing arguments, shell builtins behave like external commands. The fact that they happen to live inside the /bin/sh binary doesn't change anything.)
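One quick way to see the builtin vs. language distinction is to ask bash how it classifies each name. This is a small sketch using bash's type -t, which prints a one-word category. The variable names are just for illustration.

```shell
# Ask bash what kind of thing each name is.
# We call bash explicitly because [[ and type -t are not in every shell.
kind_single=$(bash -c 'type -t [')
kind_double=$(bash -c 'type -t [[')

echo "[  is a $kind_single"
echo "[[ is a $kind_double"
```

bash reports [ as a builtin and [[ as a keyword, which is its term for a reserved word that the parser itself recognizes.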
The bash help doesn't capture this difference either: see help [ and help [[. The parse-time vs. runtime distinction isn't mentioned.
Let's write some code to show this difference. First we create syntax errors by leaving off the right-hand side of an equality test:
$ [ a == ]
/bin/bash: line 1: [: a: unary operator expected
$ [[ a == ]]
/bin/bash: line 1: unexpected argument `]]' to conditional binary operator
/bin/bash: line 1: syntax error near `]]'
/bin/bash: line 1: `[[ a == ]]'
On the face of it, these errors look similar. Now let's use the general technique of wrapping them in if false:
$ if false; then [ a == ]; else echo 'NOT PARSED'; fi
NOT PARSED
$ if false; then [[ a == ]]; else echo 'NOT PARSED'; fi
/bin/bash: line 1: unexpected argument `]]' to conditional binary operator
/bin/bash: line 1: syntax error near `;'
/bin/bash: line 1: `if false; then [[ a == ]]; else echo 'NOT PARSED'; fi'
bash parsed the first statement without issue, and executed the else clause. To the shell parser, the stuff inside [ is just an opaque list of strings. The [ builtin never ran, so those strings were never parsed as an expression.
In contrast, it emitted a parse error for the second statement, and didn't execute any code. This is because [[ is actually part of the language.
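The same distinction shows up if you ask bash to only parse a snippet, without running anything. The sketch below uses bash -n, which reads commands but does not execute them. The check_syntax helper is hypothetical, introduced only for this illustration.

```shell
# check_syntax is a hypothetical helper: it parses a snippet with bash -n
# (no execution) and reports whether the parser accepted it.
check_syntax() {
  if bash -n -c "$1" 2>/dev/null; then
    echo "parses"
  else
    echo "syntax error"
  fi
}

single_result=$(check_syntax 'if false; then [ a == ]; fi')
double_result=$(check_syntax 'if false; then [[ a == ]]; fi')

echo "[ a == ]   -> $single_result"
echo "[[ a == ]] -> $double_result"
```

The [ version passes the syntax check because its arguments are only words to the parser. The [[ version is rejected before any code runs.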
Tomorrow we will use the same if false technique to compare the oil parser with popular shell parsers. We will see which errors they can catch at parse time, and which errors have to wait until runtime.
oil has the philosophy that catching errors earlier is better. You don't want to run a 4-hour script and get a syntax error after 3 hours.