Status Update and Blog Backlog


Two days before I went on vacation, I described how I transformed the OSH AST into what I call the Lossless Syntax Tree. This was motivated by the requirement to translate shell to Oil (part one, part two).

The post generated quality discussion on Hacker News, Lobsters, and Reddit, which is what I was hoping for. I wanted to "crowdsource" my research into how different language platforms represent code losslessly.

I made a wiki page called Lossless Syntax Tree Pattern to distill the responses, planning to turn it into a blog post. I also drafted a post that showed more examples of the AST versus the LST.

Then I went on vacation. When I got back on Wednesday, full of renewed energy for the project, I directed it at coding instead of blog posts.

That was the right thing to do, but unfortunately it means that the blog is backlogged. Drafts are being neglected and TODOs are piling up.

In this post I'll summarize what I had planned to write about, without making a promise to do so any time soon. Tomorrow I'll talk about the coding tasks that have higher priority.

Leave a comment if you want to see more on any of these topics.

Blog Backlog

In the Blog TODO Stack, I grouped future blog posts into four themes:

  1. Shell: The Good Parts. Features that a modern shell should preserve and extend.
  2. ASDL. A schema language based on algebraic data types, which forms the backbone of the interpreter architecture.
  3. The Difficulty of Parsing. General-purpose parsing tools are not suitable for production-quality interpreters and compilers.
  4. Metaprogramming. It's important and widespread.

I managed to knock off two posts: Pretty Printing ASTs with ASDL and The Thinner Waist of the Interpreter, but there are still many loose ends.

It should take three or four posts to wrap up the first two themes. I don't feel as much urgency with the third and fourth themes, since they'll benefit from future experience in implementing Oil.

There are at least three more themes in play. Here's a list of possible posts:

(5) The Lossless Syntax Tree Pattern

(a) Lossless Syntax Tree, Part Two. As mentioned, this draft goes into more detail on the AST vs. Lossless Syntax Tree for OSH.

(b) An Algorithm for Style-Preserving Source Code Translation. The algorithm I used in translating shell to Oil is worth describing.

(c) Lossless Syntax Tree Survey. The documents collected on the wiki page make a number of important points worth calling out.

One of the best documents is the design doc for Microsoft's Roslyn platform for C# and Visual Basic. Clang is also powerful and mature, but its documentation isn't as good.

(d) Lossless Syntax Tree Conclusions. Make the following arguments:

(6) Shell: The Bad Parts

There have been several posts about parsing problems in shell:

There are an equal number of problems related to execution. A few that come to mind:

Shell is so confusing that experts are wrong about it:

(7) The Oil Language Design

This is the most important theme. I'm writing about the good and bad parts of shell to motivate the design of a new shell language.

It deserves a separate roadmap, but here's what I'm thinking right now:


I've written short blurbs for more than a dozen possible blog posts in three themes. The most important theme is #7: the Oil language design.

If you're interested in anything in particular, leave a comment.

In the next post, I'll describe what coding tasks I'm prioritizing over blog posts. The main goal is to attract contributors. If that works, I may have more time for blogging!