Contrived Example Found in the Wild

2016-11-07

A few days ago, in the post Four Slashes and Three Meanings, I took a confusing shell snippet from Aboriginal Linux and added a third meaning of /.

It was contrived to create an even more confusing example, but my point was that parsing shell can be hard for humans, not just computers.

Yesterday, in testing my parser on real code, I found that I didn't need to make this up. The install_host.sh script in Chromium has this line:

ESCAPED_HOST_PATH=${HOST_PATH////\\/}

Even after writing that post, I have a hard time parsing this! It has five slashes and three meanings.

The best way to see this is to run the code rather than try to read it:

$ HOST_PATH=one/two/three
$ echo ${HOST_PATH////\\/}
one\/two\/three

OK, so what it does is replace every / with \/ for JSON escaping. (I never understood the purpose of escaping / in JSON, but let's ignore that for now.)

A slightly clearer way to write it is with quotes around the pattern and replacement:

$ HOST_PATH=one/two/three
$ echo ${HOST_PATH//'/'/'\/'}
one\/two\/three

Now we have three unquoted slashes. The first and third are the pattern replacement operators: the first introduces the substitution, and the third separates the pattern from the replacement. The second is the option to replace all occurrences rather than just the first. Compare with:

$ HOST_PATH=one/two/three
$ echo ${HOST_PATH/'/'/'\/'}
one\/two/three

Only the first / is replaced.
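
If the quoting still feels dense, another option is to give the pieces names. This is only a minimal sketch, assuming bash; the variable names pattern, replacement, escaped_all, and escaped_first are made up for illustration and do not come from the Chromium script.

```shell
#!/usr/bin/env bash
# Sketch: name the pattern and the replacement so that the only slashes
# left inside the braces are the operators themselves.
HOST_PATH=one/two/three

pattern='/'        # what to search for: a literal slash
replacement='\/'   # what to substitute: a backslash followed by a slash

# ${var//pattern/replacement}: the doubled slash means "replace all matches".
escaped_all=${HOST_PATH//"$pattern"/$replacement}

# ${var/pattern/replacement}: a single slash replaces only the first match.
escaped_first=${HOST_PATH/"$pattern"/$replacement}

printf '%s\n' "$escaped_all"     # one\/two\/three
printf '%s\n' "$escaped_first"   # one\/two/three
```

Quoting "$pattern" makes bash treat its value as a literal string rather than a glob, which does not matter for a lone slash but seems like a safer habit.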

Oil Language

I believe I have a clean way of reconciling shell syntax with the syntax of a "real" programming language. It relies on the technique of lexical states (although having fewer lexical states than bash is an important goal).

I expect it to look like this:

ESCAPED_HOST_PATH = HOST_PATH.replace('/', '\\/')
sed -i -e "s/HOST_PATH/$ESCAPED_HOST_PATH/" "$TARGET_DIR/$HOST_NAME.json"

The first line happens to be like Python or JavaScript, and the second line happens to be valid POSIX shell. I will expand more on this in future posts.
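
For reference, here is a rough sketch in today's bash of how the escaped value feeds into a sed command of that shape. The path and the template string are hypothetical values invented for this example, not taken from the Chromium script, and the sketch pipes a string through sed instead of editing a file in place. It suggests that the escaping may matter to sed itself, since / is also the delimiter of the s command.

```shell
#!/usr/bin/env bash
# Hypothetical inputs, made up for illustration only.
HOST_PATH=/opt/example/host
template='{"path": "HOST_PATH"}'

# Same expansion as in the post: replace every / with \/.
ESCAPED_HOST_PATH=${HOST_PATH////\\/}

# Without the escaping, the slashes in the path would be read by sed as
# delimiters of the s command and the expression would be malformed.
result=$(printf '%s\n' "$template" | sed -e "s/HOST_PATH/$ESCAPED_HOST_PATH/")

printf '%s\n' "$result"   # {"path": "/opt/example/host"}
```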