Backlog: Rough Progress Assessments

2021-12-07

The last post was Backlog: Explaining the Project, and it gave a few answers to Why Isn't Oil Done?

This post continues where we left off, summarizing my #comments giving rough assessments of the project's progress. I want to give readers a sense of what's done and what's planned.

Table of Contents

Context: The Scope of the Project is Changing

Five Parts of the Project (or Seven)

Punting the Interactive Shell

Five Features of the Oil Language (or Seven)

More Ways to Cut Scope

Summary

Appendix: Shell Is Fast

What's the Fastest Way To Get a Page on the Web?

PHP-Like Productivity: Iterating 50 Times a Minute

Context: The Scope of the Project is Changing

The post Four Features That Justify a New Unix Shell in October 2020 described a big milestone.

Toward the end, it said I would cut the scope of the project to get those concrete features delivered quickly in C++. In particular, I would abandon the Oil language for the foreseeable future!

Well, I worked until March on the garbage collector, and then I needed a break. So I started working on Oil again, e.g. with Recent Progress on the Oil Language in June.

So the C++ translation has naturally fallen behind (though the number of tests passing is at an all-time high).

But readers often tell me they want to use the Oil language rather than a cleaned-up, compatible language. I'm also excited about these new features. But there's more to do, as you'll see below.

Five Parts of the Project (or Seven)

I wrote on Hacker News that if you only read what shows up there, you may think Oil is more done than it actually is.

To clarify, this response from me breaks down the project into five parts.

There are roughly 5 equal-sized parts of the project, all large:

The compatible OSH language (mature, NOT buggy, but not fast yet)
The Oil language (this is 2 years old; there are bugs due to the way we reuse CPython)
The interactive shell (punting this to "headless shell")
Semi-automatic translation of the "executable spec" from Python to C++
Documentation

But I realized that there are two more parts to the project:

This blog. It's separate from the documentation, and is essential to the project.
Our own dev tools "lifted" into applications. We use many shell scripts in our own codebase, and the gaps felt there will motivate more tools.

For example, the appendix of the first backlog post mentioned that I want to unify the continuous build and developer build, and make them more incremental, parallel, and reproducible. I made some progress on this over the weekend, playing with alternatives to Docker like podman and bubblewrap. This is pointing in the direction of what I sketched in April.

Punting the Interactive Shell

Let me emphasize that I now think of the interactive shell as outside the project's core. By default, you'll get a limited bash-like experience, and maybe not even that after the C++ translation.

Oil has hooks to make something fish-like, but other people have to make an effort for it to happen.

Here's another comment that clarifies this:

For the foreseeable future, Oil won't overlap with fish very much.

...

Oil has the architecture and possibility to be something like fish (a great interactive shell). It has a nascent headless shell mode, e.g. for being driven by GUIs, which no other shell I know of has.

...

And Oil uses a principled and accurate parser for autocompletion. No other POSIX shell does this.

But personally I'm putting that as a low priority in favor of the "batch" use case. I think the "killer use case" will probably be to have a better language for continuous builds and "cloud automation", which on current platforms like GitHub, GitLab, and sourcehut use a lot of YAML and shell.

Five Features of the Oil Language (or Seven)

So the Oil language is just one part of the project, but it can also be broken down into many features! I drafted a blog post on Zulip to explain this, and I mentioned it in the long Nix RFC thread to give Nix users a sense of the project.

In the draft, I broke down the language into 5 major features, and gave an estimate of how done each one is:

Python-like expressions on typed data: 9/10. This could be 7/10 or 8/10 because we still need to divorce the expression evaluator from Python.
Eggex (regular expression syntax): 9/10
Procs: 6/10 (could be 7/10 after recent progress; we need to finish typed arguments)
Ruby-like blocks for DSLs: 3/10. Need eval (myblock), etc.
QTT (TSV-like tables): 3/10. We emit QTT in one place: pp proc, but we don't have a parser. The building block QSN is done!

This is still correct, although we should add more space for:

Pure functions, and a unique form of coprocess: 2/10
Shell builtins, which are like a "standard library": 4/10
- argparse for flag parsing
- describe for a test framework
- I also mentioned the Awk- and dplyr-inspired builtins in the last post.

So if you say I overestimated how done it was, I wouldn't argue with you!

That said, Oil should be useful long before all these things are done. We can frame it in a positive way and say that the language is designed for many more years of evolution :-)

More Ways to Cut Scope

In the first Winter Backlog post, I mentioned that I'm brainstorming ways to expand the project to more contributors.

During a conversation with Raphael Megzari, I repeated that I think hiring a full-time, experienced compiler engineer is the most realistic way to move the project forward. I want somebody who can look at the C++ translation problem and say "I can do this whole thing", and they should have a big block of time to do it.

I think this is realistic because I've already made oil-native pass more than half the OSH spec tests. I think finding the right person will be harder than funding this position. More on this later.

Summary

I'm trying to cut scope, but the project is still big. These 5 parts are still "on my plate":

OSH Language (nearly done)
Oil Language. This has up to 7 parts, but delivering the first 2 or 3 will make it useful.
Documentation. We use documentation-driven development to make the language better. We have to be able to explain it with a "straight face".
This blog.
Applications like distros and build scripts.

That leaves out the #interactive-shell and oil-native. The former is "punted" from the project, and the latter should be "outsourced".

Again, let me know if you know an interested and experienced compiler engineer, or are yourself that person! I will post concrete details early next year, but regular readers should already have a sense of the task.

In conclusion, I framed the Oil project as having 5-7 parts, including the Oil language, which itself has 5-7 parts. There's no doubt this will change in the future, but this is how I think of things now.

Appendix: Shell Is Fast

Here's some more motivation for the project, i.e. #shell-the-good-parts.

What's the Fastest Way To Get a Page on the Web?

Excerpt:

This is why I use shell, because it's faster, easier (once you learn it), and you're not locked in.
$ echo '<a href="https://news.ycombinator.com/item?id=29253277">question</a>' > question.html
$ scp question.html oilshell.org:oilshell.org/share
Result: https://www.oilshell.org/share/question.html

More comments:

This assumes you know how to use shell, and can set up stuff like ssh-agent, which took me a long time to learn.
- Trivia: ssh-agent configures your shell by literally printing shell code for you to eval! (This is not very safe.)
Commodity shared hosting is underrated. Here's a Zulip thread on #blog-ideas about that.
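
To make the ssh-agent trivia above concrete, here is a minimal sketch of the "print shell code, then eval it" pattern. The function demo_agent and the DEMO_* variable names are hypothetical stand-ins, not part of ssh-agent. They only mimic the shape of what ssh-agent prints, so the sketch runs without starting a real agent or touching your real SSH settings.

```shell
#!/bin/sh
# Hypothetical stand-in for ssh-agent (an assumption for illustration).
# A child process can't modify its parent's environment, so instead it
# prints shell code on stdout for the caller to evaluate.
demo_agent() {
  printf 'DEMO_AUTH_SOCK=%s; export DEMO_AUTH_SOCK;\n' '/tmp/demo-agent.sock'
  printf 'DEMO_AGENT_PID=%s; export DEMO_AGENT_PID;\n' '12345'
}

# The caller runs the printed text as code. This has the same shape as
# the common idiom: eval "$(ssh-agent -s)"
eval "$(demo_agent)"

echo "socket: $DEMO_AUTH_SOCK"
echo "pid: $DEMO_AGENT_PID"
```

The safety concern is visible here: eval executes whatever the program prints, so the pattern is only as trustworthy as that output.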

PHP-Like Productivity: Iterating 50 Times a Minute

Excerpt:

It's nice that they are adding all these new features, but they are hardly what makes it or breaks it for me.

Here is the one feature that is [lacking in] "modern stacks":

Edit file -> Alt+tab ctrl+R.

Oops, Alt+tab fix

Alt+tab ctrl+R.

When debugging I can do that 50 times a minute. With my react app, I can do it maybe 5 times a minute. With my golang app I'm lucky if I can do it twice a minute.

I agree, and this is why I use shell and Python! My comment further down in the thread:

User interfaces, data science, and security/reverse engineering work are three domains that really require tight feedback loops. For example, the work I did on #parsing-shell is basically a kind of blackbox reverse engineering, and was done with < 100 ms feedback loops.

Related: Web Sites Are Naturally Made With Shell Scripts (February 2020). Another post in #shell-the-good-parts.