I recently implemented the test builtin, also known as [. Since I had
already implemented the [[ variant in OSH, I thought this
would be straightforward.
But as always, shell is full of surprises. In this post, I give examples of
ambiguous test expressions. Then I describe the curious algorithm that
bash/POSIX uses to resolve ambiguities, as well as an example where this
algorithm breaks down.
You can consider this another episode of Shell: The Bad Parts.
Recall the difference between [ and [[ from October:
That post shows the difference in both parsing and execution between these two statements:
$ if false; then [ a == ]; else echo 'NOT PARSED'; fi
NOT PARSED
$ if false; then [[ a == ]]; else echo 'NOT PARSED'; fi
/bin/bash: line 1: unexpected argument `]]' to conditional binary operator
/bin/bash: line 1: syntax error near `;'
/bin/bash: line 1: `if false; then [[ a == ]]; else echo 'NOT PARSED'; fi'
Users reported that
Gentoo and Nix both invoke [ without $PATH set, which means that the
coreutils executables /usr/bin/test and /usr/bin/[ won't be found.
Last October, I described the difference between [ and [[.
I originally thought people could use /usr/bin/[.
But Gentoo and Nix both use [ without $PATH set. So I thought: how hard could it be to implement?
[ is just an expression language with no lexer. Just replace the lexer and that's it. (Other examples: find, expr)
However, I soon found out that there are fundamental problems with the design of the test builtin.
It is an instance of string confusion.
As a reminder, here is how the builtin works:
$ [ -z "" ]; echo $?;  # -z returns 0/true on an empty string
0
$ [ -z foo ]; echo $?;  # 1/false on a non-empty string
1
In bash, -a is an alias for -e:
$ [ -a / ]; echo $?;  # -a returns 0/true if the path exists
0
$ [ -a /oops ]; echo $?;  # 1/false if it doesn't
1
Test body
if (s > t) { # greater than less than. This depends on LOCALE. Maybe change
# it to a function?
}
if (s == t) {
}
Disabled: ! -a -o ( ) < >
a = ' 3 '  # note spaces, we read them from a file
b = ' 5 '
test $a -lt $b
if (a < b)  # BAD: STRINGS THAT LOOK LIKE NUMBERS
# maybe disallow this, too subtle!
# I would have to write a comment
if (sortsBefore(a, b))
if (order(a, b))
if (cmp(a, b))
if (cmpLocale(a, b))
if (Int(a) < Int(b))  # This is OK
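To make the numeric-versus-string distinction concrete in today's shell, here is a minimal sketch using only standard test operators. The values mirror the padded strings in the notes above, and the sketch assumes the default IFS, so that the unquoted expansions lose their surrounding spaces.

```shell
a=' 3 '   # note the spaces, as if read from a file
b=' 5 '

# Unquoted: word splitting strips the padding, so -lt compares 3 and 5.
if test $a -lt $b; then
  echo "numeric: a is less than b"
fi

# Quoted: = compares the strings as they are, so the padding matters.
if test "$a" = 3; then
  echo "string: a equals 3"
else
  echo "string: a does not equal 3"
fi
```

The same variable gives different answers depending on quoting and on which operator is chosen, which is the subtlety the notes above want to make explicit.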
mystr='-a'  # -a can just be a string
myfile='-a'  # -a is a valid filename
[ $mystr ] -- Test if A is empty
[ -a ] -> [ $mystr ] "a"
[ -a -a ] -> [ -a $myfile ]
[ -a -a -a ] -> [ $mystr -a $otherstr ] [ -a -a -a ]
[ -a -a -a -a ] -> syntax error!! But this DOES have a representation.
exists "a" and "a"
The 4 case POSIX thing isn't enough!
[ ( -a -a )
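The argument-count rules can be sketched as a small shell function. This is a simplified reading of the POSIX rules, not bash's implementation: describe_test is a hypothetical helper that only reports which rule would be chosen, and it ignores the ! and ( ) special cases for two and three arguments.

```shell
# Hypothetical helper, not part of any shell. It reports which rule applies,
# based only on the number of arguments (the closing ] is not counted).
describe_test() {
  case $# in
    0) printf '%s\n' "0 args: false" ;;
    1) printf '%s\n' "1 arg: true if the string is non-empty" ;;
    2) printf '%s\n' "2 args: unary operator $1 applied to $2" ;;
    3) printf '%s\n' "3 args: binary operator $2 applied to $1 and $3" ;;
    4) printf '%s\n' "4 args: only defined if the first is ! or the outer pair is ( )" ;;
    *) printf '%s\n' "$# args: unspecified" ;;
  esac
}

mystr='-a'
describe_test "$mystr"               # treated as a plain string
describe_test -a "$mystr"            # treated as a file-existence check
describe_test "$mystr" -a "$mystr"   # the middle -a becomes "and"
describe_test -a -a -a -a            # no rule covers this
```

The point of the sketch is that the meaning of each -a is decided by position and count alone, never by what the author of the script intended.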
Also see "Three Meanings of Slash" and #
[ -z ]
[ -z -a ]
[ -z -a ] ]  # another weird lookahead case
Another way to think of it is if there were no difference in Python between the following:
and and and "and" and "and"
equal equal equal equ
Part two:
Otherwise, use [[. It eliminates whole classes of problems.
If you need those, you can use shell's built-in negation, or you can
I don't think this style guideline is very restrictive. In fact I never used
[[ -- I've been writing shell scripts for 10 years. The 2 and 3 argument
versions of test suffice for almost all purposes.
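As a sketch of this guideline, the snippet below keeps every test call to two or three arguments and leaves negation and conjunction to the shell itself. The helper names is_nonempty_file and both_set are hypothetical, chosen only for illustration, and /nonexistent/path is assumed not to exist.

```shell
# Hypothetical helper: true if $1 is a regular file that is not empty.
# Each test call takes two arguments, so no -a / -o parsing is involved.
is_nonempty_file() {
  test -f "$1" && test -s "$1"
}

# Hypothetical helper: true if $1 and $2 are both non-empty strings.
both_set() {
  test -n "$1" && test -n "$2"
}

# Negation uses the shell's own ! rather than an extra argument to test.
if ! is_nonempty_file /nonexistent/path; then
  echo "missing or empty"
fi

# Still unambiguous when the values look like operators.
if both_set '-a' '-a'; then
  echo "both set"
fi
```

Because the operator is always in a fixed position and the operands are quoted, a value such as -a can only ever be read as data.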
Options for Oil:
Style guideline / Oil:
Just use the two-argument or three-argument forms:
test -f "$path" ->
test -file $path # not quoting, get rid of
test is-file $path
test is-dir $path
test exists $path
test is-pipe $path
alternative if (isFile(path)), if (isDir)
test $path older-than $path && test $path newer-than $path  # with auto-complete
test $path is-hard-link-to $path  # with auto-complete
Because OSH will implement essentially all shell builtins, it is trivial to make Oil compatible. But we don't want the Oil language to be burdened by compatibility -- that's how we ended up with [ -a -a -a ] in the first place!
So if you follow our (loose) style guidelines, you'll get the nice translation.
If you don't, you'll get the "compatible translation", with __. The __ is
a visual cue that you could manually rewrite some code to be nicer in Oil.
These are just what I'm thinking; they haven't been implemented yet.
You must follow the style guidelines above. However, I still want to retain the property of automatic conversion. So I'm thinking of having a namespace for shell builtins in Oil.
Refer to Translating Shell to Oil.
if _ test -a -a -a {
hello
}
_ could mark old builtins. It's a subtle sign that something could be "modernized".
Another option would be if eval-sh "test -a -a -a" {} , but this seems too ugly.
Other options:
if $ test -a -a -a -{
}
if $$ test -a -a -a -{
}
if __ test -a -a -a -{
}
while __ read -r foo {
}
Or you could also do: