Roadmap #2


I finished a couple things on the last roadmap -- I got debootstrap to parse, and I explained why parsing bash is undecidable. Then I rephrased it in a more useful way: bash can't be statically parsed.

I'm now working on fixing the rest of the errors I hit while parsing git's bash code. There are only a few real errors left in 130K lines. I just checked, though, and the repo is from September 2012 (!), so after I fix these errors, I will update to the latest version of git.

It's exposing some good errors. In particular, I already know I have to change the arithmetic expression parser from the precedence climbing algorithm, which handles binary operators, to the more general top-down operator precedence parsing, which will handle all the operators that bash has borrowed from C: unary operators, including prefix and postfix ++; the ternary ? : operator; and array indexing like f[1] or f[x+1].
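To make the difference concrete, here is a minimal sketch of top-down operator precedence parsing in Python. It is an illustration, not the actual parser: the token set, the binding power values, the tuple-based tree, and names like `Parser` and `parse_expr` are all assumptions made up for this example, and it covers only a handful of operators.

```python
import re

# Hypothetical token set: integers, names, and a few C-style operators.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(\+\+|--|[-+*/?:\[\]()]))")

def tokenize(src):
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if not m:
            raise SyntaxError("bad input at position %d" % pos)
        num, name, op = m.groups()
        if num is not None:
            tokens.append(("num", int(num)))
        elif name is not None:
            tokens.append(("name", name))
        else:
            tokens.append(("op", op))
        pos = m.end()
    tokens.append(("eof", None))
    return tokens

# Left binding powers for infix and postfix tokens (illustrative values).
LBP = {"?": 10, "+": 20, "-": 20, "*": 30, "/": 30, "++": 50, "--": 50, "[": 60}
PREFIX_BP = 40  # binding power of the operand of a prefix operator

class Parser:
    def __init__(self, src):
        self.tokens = tokenize(src)
        self.i = 0

    def peek(self):
        return self.tokens[self.i]

    def next(self):
        tok = self.tokens[self.i]
        self.i += 1
        return tok

    def expect(self, op):
        tok = self.next()
        if tok != ("op", op):
            raise SyntaxError("expected %r, got %r" % (op, tok))

    def parse(self, rbp=0):
        kind, val = self.next()
        # Step 1: a token at the start of an expression (atom or prefix op).
        if kind == "num":
            left = val
        elif kind == "name":
            left = ("var", val)
        elif kind == "op" and val in ("+", "-", "++", "--"):
            left = ("pre" + val, self.parse(PREFIX_BP))
        elif kind == "op" and val == "(":
            left = self.parse(0)
            self.expect(")")
        else:
            raise SyntaxError("unexpected token %r" % (val,))
        # Step 2: keep extending `left` while the next operator binds tighter.
        while True:
            kind, op = self.peek()
            if kind != "op" or LBP.get(op, 0) <= rbp:
                break
            self.next()
            if op in ("++", "--"):        # postfix: no right operand
                left = ("post" + op, left)
            elif op == "[":               # indexing: f[expr]
                index = self.parse(0)
                self.expect("]")
                left = ("index", left, index)
            elif op == "?":               # ternary, right-associative
                then = self.parse(0)
                self.expect(":")
                left = ("?:", left, then, self.parse(LBP["?"] - 1))
            else:                         # left-associative binary operator
                left = (op, left, self.parse(LBP[op]))
        return left

def parse_expr(src):
    p = Parser(src)
    tree = p.parse()
    if p.peek()[0] != "eof":
        raise SyntaxError("trailing input: %r" % (p.peek(),))
    return tree

print(parse_expr("f[x+1]"))
print(parse_expr("-x++ * 2"))
print(parse_expr("a ? b : c ? d : e"))
```

The point of the sketch is that prefix, postfix, ternary, and indexing operators each get their own branch, while a loop that only knows about binary operators has no natural place to put them.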

This expression parser is used in six distinct places in bash.

And yesterday I also revised the grammar for the ${} parser based on the fact that real scripts use named references like ${!foo} and the completely unrelated array keys operator ${!foo[@]}. Translating this revision into recursive descent will fix more of the remaining errors.
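As a rough illustration of why the grammar has to tell these two forms apart, here is a small hand-written sketch in Python. It handles only the two `${!...}` forms mentioned above and nothing else in the `${}` language; the function name `parse_bang_sub` and the result labels are hypothetical, not taken from the real parser.

```python
import re

NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def parse_bang_sub(s):
    """Classify '${!name}' versus '${!name[@]}' (illustrative only)."""
    if not s.startswith("${!"):
        raise ValueError("expected '${!'")
    pos = 3
    m = NAME_RE.match(s, pos)
    if not m:
        raise ValueError("expected a variable name")
    name = m.group(0)
    pos = m.end()
    # After the name, the same '!' prefix means two unrelated things,
    # depending on whether '[@]' follows.
    if s.startswith("[@]", pos):
        kind = "array_keys"
        pos += 3
    else:
        kind = "named_ref"
    if s[pos:] != "}":
        raise ValueError("expected '}' at position %d" % pos)
    return (kind, name)

print(parse_bang_sub("${!foo}"))
print(parse_bang_sub("${!foo[@]}"))
```

The parser can't decide what `!` means when it first sees it; it has to read the name and then look at what comes next.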

While I'm working on this, my blog posts will fall in these categories:

I think the strategy of writing a little each day is working. There's still a large amount of unpublished documentation, and the blog is unclogging the pipes a bit.