Project Roadmap #4

2017-02-28

Yesterday, I reviewed November's Roadmap #3. I'm happy with the progress so far, and I think that this is the point in the project where contributions will accelerate it.

To orient contributors, this post describes what I want to work on in the next month or two.

Testing Enhancements

I believe software projects move faster with automated tests. They should help contributors write code for Oil, and they will help me review code.

As mentioned on the GitHub page, there are three kinds of tests:

  1. Unit tests written in Python.
  2. Spec tests which run against some subset of OSH, bash, dash, mksh, and zsh. The idea is to figure out the specification for OSH by testing what happens in practice. (I've also read the POSIX spec, and in practice all shells are highly POSIX-compliant.)
  3. "Wild" tests which test the parser against source code found in the wild. These tests are more of a guideline, because we only test that there are no parse errors, as opposed to making assertions on the LST (Lossless Syntax Tree).
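
To make the third kind concrete, here is a minimal sketch of a "wild" test runner. It is not the project's actual test harness: `run_wild_tests` and `toy_parse` are hypothetical names, and `toy_parse` is a deliberately trivial stand-in for the real parser that only rejects unbalanced single quotes. The point is the shape of the check: each file either parses or it doesn't, and nothing is asserted about the resulting tree.

```python
import os
import tempfile


def toy_parse(source):
    """Hypothetical stand-in for the real parser.

    It only rejects an odd number of single quotes; a real shell parser
    does far more.
    """
    if source.count("'") % 2 != 0:
        raise ValueError("unterminated single-quoted string")


def run_wild_tests(root, parse_fn):
    """Parse every *.sh file under root; return a list of (path, error)."""
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".sh"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                source = f.read()
            try:
                parse_fn(source)  # only success or failure matters, not the tree
            except ValueError as e:
                failures.append((path, str(e)))
    return failures


if __name__ == "__main__":
    # Build a tiny corpus with one good file and one bad file.
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "good.sh"), "w") as f:
            f.write("echo 'hi'\n")
        with open(os.path.join(root, "bad.sh"), "w") as f:
            f.write("echo 'oops\n")
        for path, error in run_wild_tests(root, toy_parse):
            print("FAIL", os.path.basename(path), error)
```

Passing the parser in as a function keeps the runner independent of any particular parser, so the same loop could be pointed at a large corpus of real scripts.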

I've published the spec test results as HTML. The very next task will be to publish the unit tests and wild tests the same way.

The test coverage is fairly high, and I want to keep it high. Contributors should be able to make pretty aggressive changes to the source code and rely on the tests to catch breakages.

And you should be able to make enhancements without understanding the whole program, even though it has a simple and comprehensible architecture.

Vertical Slice of the Shell Runtime in C++

Surprisingly, nobody has complained that almost all the code is in Python. Even though Python made Google successful, I'm used to hearing C++ or Java programmers question its usage.

In fact the opposite is true: I've been asked why it should be ported to C or C++ at all.

A few reasons:

  1. Most embedded Unix systems don't have Python, and I want Oil to be usable there. For example, Android uses mksh and doesn't have Python.
  2. Python does some non-trivial things with signals, like turning some of them into exceptions, which may cause bugs. It also might do some non-trivial stuff in os.fork().
  3. The Python interpreter is slow to start. This is apparent when you run the spec tests, which start OSH many times in a serial fashion.
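
The startup cost in the third point is easy to measure. The following is a minimal sketch, not part of the Oil test suite; `average_startup_seconds` is a hypothetical helper name. It starts the current Python interpreter with a no-op program several times and averages the wall-clock time, which also includes the cost of creating the process.

```python
import subprocess
import sys
import time


def average_startup_seconds(runs=5):
    """Average wall-clock time to start this Python interpreter and exit."""
    start = time.perf_counter()
    for _ in range(runs):
        # "-c pass" does no work, so the time is dominated by process
        # creation plus interpreter startup and shutdown.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    print("average startup: %.1f ms" % (average_startup_seconds() * 1000))
```

The number varies by machine, but a test suite that starts the interpreter serially for every case pays this cost once per case.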

So I want to explore the C++ software architecture by threading through a command like this:

$ test -d / && echo 'hello world'
hello world

The chain will look roughly like this:

  1. The Lossless Syntax Tree for OSH is defined in a file called osh.asdl.
  2. Both osh.asdl and oil.asdl will compile to a simpler language specified in ovm.asdl. I need to write a slice of the OSH-to-OVM compiler, probably in Python.
  3. The ovm.asdl trees will be serialized in OHeap format.
  4. C++ code will be generated from ovm.asdl.
  5. The OVM nodes will use C++ enums generated from core/id_kind.py (The Backbone of the Interpreter).
  6. To fill out the tree-walking interpreter, I'll adapt the old C++ runtime code I mentioned in the very first blog post. I had a shell that would start basic processes, but abandoned it because iterating in C++ takes forever.

Three pieces of relevant C++ code already exist, so hopefully this work will mostly be gluing them together and polishing.
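
For readers unfamiliar with tree-walking interpreters, here is a minimal sketch of what executing `test -d / && echo 'hello world'` over a tree could look like. It is written in Python for brevity, although the plan above is to do this in C++. Every name in it (`SimpleCommand`, `AndChain`, `execute`, `BUILTINS`) is an assumption made for illustration; none of them are taken from osh.asdl or ovm.asdl. Both commands are implemented as in-process builtins, so no processes are started.

```python
import os
from dataclasses import dataclass
from typing import List, Union


@dataclass
class SimpleCommand:
    """A command with its arguments, e.g. ['echo', 'hello world']."""
    argv: List[str]


@dataclass
class AndChain:
    """left && right"""
    left: "Node"
    right: "Node"


Node = Union[SimpleCommand, AndChain]


def builtin_test(argv, out):
    # Only 'test -d PATH' is supported in this slice.
    if len(argv) == 3 and argv[1] == "-d":
        return 0 if os.path.isdir(argv[2]) else 1
    return 2  # usage error


def builtin_echo(argv, out):
    out.append(" ".join(argv[1:]))
    return 0


BUILTINS = {"test": builtin_test, "echo": builtin_echo}


def execute(node, out):
    """Walk the tree and return an exit status; output lines go into out."""
    if isinstance(node, SimpleCommand):
        handler = BUILTINS.get(node.argv[0])
        if handler is None:
            return 127  # conventional "command not found" status
        return handler(node.argv, out)
    if isinstance(node, AndChain):
        status = execute(node.left, out)
        if status != 0:
            return status  # short-circuit: the right side does not run
        return execute(node.right, out)
    raise TypeError("unknown node: %r" % (node,))


if __name__ == "__main__":
    tree = AndChain(
        SimpleCommand(["test", "-d", "/"]),
        SimpleCommand(["echo", "hello world"]),
    )
    lines = []
    status = execute(tree, lines)
    print("\n".join(lines))
```

The sketch skips parsing, compilation, and serialization entirely; it only shows the last step, where a small tree is evaluated node by node.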

Further Work

Once that is done, there are two more tasks:

Oil Parser

I wrote a parser for OSH, and a translator from OSH to Oil.

But I still don't have a parser for Oil!

Make a big claim: No Parser Generator or Meta-Language Can Handle the Shell

TODO: Copy points from Hacker News post. lexer modes, lexer hints, pratt parsing, etc.

Same strategy: top-down, lexer modes, etc. two interleaved sublanguages

Bootstrapping

This is sort of an unknown.

TODO: Possibly include bootstrap.md -- notes about bootstrapping.

Use Python AST module. BUT: I want to preserve my comments. Lossless Syntax Tree for Python? RedBaron?
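
The comment problem is easy to demonstrate. This is a small sketch using only the standard library; `roundtrip` is a hypothetical helper name, and `ast.unparse` requires Python 3.9 or later. Parsing a line to an AST and turning it back into source drops the comment, because the `ast` module does not represent comments at all.

```python
import ast


def roundtrip(source):
    """Parse Python source to an AST, then regenerate source from the AST."""
    tree = ast.parse(source)
    return ast.unparse(tree)  # requires Python 3.9+


if __name__ == "__main__":
    original = "x = 1  # this comment will not survive\n"
    print(roundtrip(original))
```

Any conversion tool built directly on `ast` would therefore need a separate way to carry comments across, which is the motivation for a lossless tree.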

It might suffice to rewrite all the comments, because they're pretty out of date.

Problems with conversion: I didn't use exceptions?

OSH vs. Oil. Project Risk and a Fallback Plan.

USEFUL INTERMEDIATE GOAL:

Maybe its own blog post. What should I do with OSH? What are the carrots for adoption?

OSH is more like a language platform, with a lossless syntax tree:

Most programming languages fail.

If Oil takes longer than expected, or I don't have a lot of contributors, I can polish OSH more.

If people are using OSH, then they will probably contribute.

Also: Oil can serve as an extension language for OSH.

COMBINE THE LANGUAGES.

"source mylib.sh"

"oil-source mylib.oil"

Also function calls:

$[foo(bar,baz)]

$[ escapeHtml(foo, bar) ]

$(( foo(bar,baz) ))

$[ glob('*.py') ]

I don't know how things will play out. We'll see.

It's generally hard to get people to adopt new languages, but converting from OSH to Oil is a good start.

If you have feedback about this plan, leave a comment.

Conclusion

I hope that the automated test enhancements I've described enable pleasant and friction-free contributions.

In theory, I should be able to work on the architecture of the C++ code while others work on the Python code.

The next blog posts will probably point to CONTRIBUTING and TESTING docs that I'll publish in the Git repository.


Discuss this post on Reddit.
Get notified about new posts via @oilshellblog on Twitter.